[ceph-users] how to upgrade CEPH journal?

Richard Hesketh richard.hesketh at rd.bbc.co.uk
Thu Nov 9 08:38:47 PST 2017


Please bear in mind that unless you have a very good reason for separating the WAL and DB into two partitions (e.g. you are testing/debugging and want to observe their behaviour separately, or they are actually going on different devices with different speeds), you should probably stick to one large partition and specify block.db only; the WAL will then automatically be placed with the DB.

Personally, I found specifying these options in the config overly fiddly; if you manually partition your DB device with gdisk or whatever, and then specify the partitions as arguments, ceph-disk will just use that entire partition regardless of what other size settings are configured.

ceph-disk prepare --bluestore /dev/sda --block.db /dev/disk/by-partuuid/[UUID STRING]

(you should refer to the partition by UUID, rather than device letter, because ceph-disk will just symlink to the provided argument, and you cannot guarantee that device letters will be consistent between reboots, but you can be pretty dang sure the UUID will not change or collide)
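
As a rough sketch of that workflow (not a tested procedure): the device names, the 30G size and the UUID below are purely illustrative assumptions, and the helper function names are made up for this example. The script only prints the ceph-disk command so you can check it before running anything.

```shell
#!/bin/sh
# Sketch only: /dev/sde (DB device), /dev/sda (data disk) and the 30G size
# are illustrative assumptions, not recommendations.

# Print the stable by-partuuid path for a given partition UUID.
partuuid_path() {
    printf '/dev/disk/by-partuuid/%s\n' "$1"
}

# Print (do not run) the prepare command for a data disk and a DB partition UUID.
prepare_cmd() {
    data_dev=$1
    uuid=$2
    printf 'ceph-disk prepare --bluestore %s --block.db %s\n' \
        "$data_dev" "$(partuuid_path "$uuid")"
}

# On a real host you would first create the partition and read its UUID, e.g.:
#   sgdisk --new=0:0:+30G /dev/sde          # next free partition number, 30 GiB
#   partprobe /dev/sde
#   blkid -s PARTUUID -o value /dev/sde1    # prints the partition's UUID
# The UUID below is a made-up placeholder.
prepare_cmd /dev/sda 01234567-89ab-cdef-0123-456789abcdef
```

Swap in the PARTUUID that blkid actually reports before running the printed command.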

Rich

On 09/11/17 16:26, Caspar Smit wrote:
> Rudi,
> 
> You can set the size of block.db and block.wal partitions in the ceph.conf configuration file using:
> 
> bluestore_block_db_size = 16106127360 (which is 15 GiB, i.e. 15 * 1024^3 bytes; calculate the correct number for your needs)
> bluestore_block_wal_size = 16106127360
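
If you do go the config route, the byte values can be worked out with shell arithmetic. A minimal sketch, assuming sizes are meant in GiB; the function name gib_to_bytes and the 15 and 2 GiB values are just examples, not recommendations:

```shell
#!/bin/sh
# Convert a size in GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print ceph.conf lines for the chosen sizes; 15 and 2 are example values.
echo "bluestore_block_db_size = $(gib_to_bytes 15)"
echo "bluestore_block_wal_size = $(gib_to_bytes 2)"
```

The first line comes out as 16106127360, matching the value quoted above.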
> 
> Kind regards,
> Caspar
> 
> 
> 2017-11-09 17:19 GMT+01:00 Rudi Ahlers <rudiahlers at gmail.com>:
> 
>     Hi Alwin, 
> 
>     Thanx for the help. 
> 
>     I see now that I used the wrong wording in my email. I want to resize the journal, not upgrade.
> 
>     So, following your commands, I still sit with a 1GB journal:
> 
> 
> 
>     root at virt1:~# ceph-disk prepare --bluestore \
>     > --block.db /dev/sde --block.wal /dev/sde1 /dev/sda
>     Setting name!
>     partNum is 0
>     REALLY setting name!
>     The operation has completed successfully.
>     prepare_device: OSD will not be hot-swappable if block.db is not the same device as the osd data
>     Setting name!
>     partNum is 1
>     REALLY setting name!
>     The operation has completed successfully.
>     The operation has completed successfully.
>     prepare_device: OSD will not be hot-swappable if block.wal is not the same device as the osd data
>     prepare_device: Block.wal /dev/sde1 was not prepared with ceph-disk. Symlinking directly.
>     Setting name!
>     partNum is 1
>     REALLY setting name!
>     The operation has completed successfully.
>     The operation has completed successfully.
>     meta-data=/dev/sda1              isize=2048   agcount=4, agsize=6400 blks
>              =                       sectsz=4096  attr=2, projid32bit=1
>              =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
>     data     =                       bsize=4096   blocks=25600, imaxpct=25
>              =                       sunit=0      swidth=0 blks
>     naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>     log      =internal log           bsize=4096   blocks=1608, version=2
>              =                       sectsz=4096  sunit=1 blks, lazy-count=1
>     realtime =none                   extsz=4096   blocks=0, rtextents=0
>     Warning: The kernel is still using the old partition table.
>     The new table will be used at the next reboot or after you
>     run partprobe(8) or kpartx(8)
>     The operation has completed successfully.
> 
>     root at virt1:~# partprobe
> 
> 
>     root at virt1:~# fdisk -l | grep sde
>     Disk /dev/sde: 372.6 GiB, 400088457216 bytes, 781422768 sectors
>     /dev/sde1       2048 195311615 195309568 93.1G Linux filesystem
>     /dev/sde2  195311616 197408767   2097152    1G unknown
> 
> 
> 
>     On Thu, Nov 9, 2017 at 6:02 PM, Alwin Antreich <a.antreich at proxmox.com> wrote:
> 
>         Hi Rudi,
>         On Thu, Nov 09, 2017 at 04:09:04PM +0200, Rudi Ahlers wrote:
>         > Hi,
>         >
>         > Can someone please tell me what the correct procedure is to upgrade a CEPH
>         > journal?
>         >
>         > I'm running ceph: 12.2.1 on Proxmox 5.1, which runs on Debian 9.1
>         >
>         > For a journal I have a 400GB Intel SSD drive and it seems CEPH created a
>         > 1GB journal:
>         >
>         > Disk /dev/sdf: 372.6 GiB, 400088457216 bytes, 781422768 sectors
>         > /dev/sdf1     2048 2099199 2097152   1G unknown
>         > /dev/sdf2  2099200 4196351 2097152   1G unknown
>         >
>         > root at virt2:~# fdisk -l | grep sde
>         > Disk /dev/sde: 372.6 GiB, 400088457216 bytes, 781422768 sectors
>         > /dev/sde1   2048 2099199 2097152   1G unknown
>         >
>         >
>         > /dev/sda :
>         >  /dev/sda1 ceph data, active, cluster ceph, osd.3, block /dev/sda2,
>         > block.db /dev/sde1
>         >  /dev/sda2 ceph block, for /dev/sda1
>         > /dev/sdb :
>         >  /dev/sdb1 ceph data, active, cluster ceph, osd.4, block /dev/sdb2,
>         > block.db /dev/sdf1
>         >  /dev/sdb2 ceph block, for /dev/sdb1
>         > /dev/sdc :
>         >  /dev/sdc1 ceph data, active, cluster ceph, osd.5, block /dev/sdc2,
>         > block.db /dev/sdf2
>         >  /dev/sdc2 ceph block, for /dev/sdc1
>         > /dev/sdd :
>         >  /dev/sdd1 other, xfs, mounted on /data/brick1
>         >  /dev/sdd2 other, xfs, mounted on /data/brick2
>         > /dev/sde :
>         >  /dev/sde1 ceph block.db, for /dev/sda1
>         > /dev/sdf :
>         >  /dev/sdf1 ceph block.db, for /dev/sdb1
>         >  /dev/sdf2 ceph block.db, for /dev/sdc1
>         > /dev/sdg :
>         >
>         >
>         > Resizing the partition through fdisk didn't work. What is the correct
>         > procedure, please?
>         >
>         > Kind Regards
>         > Rudi Ahlers
>         > Website: http://www.rudiahlers.co.za
> 
>         > _______________________________________________
>         > ceph-users mailing list
>         > ceph-users at lists.ceph.com
>         > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>         For Bluestore OSDs you need to set bluestore_block_db_size to get a bigger
>         partition for the DB and bluestore_block_wal_size for the WAL.
> 
>         ceph-disk prepare --bluestore \
>         --block.db /dev/sde --block.wal /dev/sde /dev/sdX
> 
>         This gives you in total four partitions on two different disks.
> 
>         I think it will be less hassle to remove the OSD and prepare it again.
> 
>         --
>         Cheers,
>         Alwin
> 
> 
> 
> 
> 
>     -- 
>     Kind Regards
>     Rudi Ahlers
>     Website: http://www.rudiahlers.co.za
> 
> 
> 
> 
> 
> 


-- 
Richard Hesketh
Systems Engineer, Research Platforms
BBC Research & Development
