[ceph-users] OSD is near full and slow in accessing storage from client

David Turner drakonstein at gmail.com
Sun Nov 12 10:57:47 PST 2017


You cannot reduce the PG count for a pool.  So there isn't anything you can
really do for this unless you create a new FS with better PG counts and
migrate your data into it.

The problem with having more PGs than you need is the memory footprint
of the OSD daemon. There are warning thresholds for having too many PGs
per OSD.  Also, in future expansions, if you need to add pools, you might
not be able to create them with the proper number of PGs because older
pools already have far too many.
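
To put rough numbers on the per-OSD load, here is a small Python sketch. The pool figures are taken from the `ceph osd pool ls detail` output quoted further down; the function name is made up for illustration, and no warning threshold is hard-coded because that value depends on your release and configuration.

```python
def pg_replicas_per_osd(pools, num_osds):
    """Average number of PG replicas each OSD hosts.

    pools: iterable of (pg_num, replica_size) tuples.
    """
    total_replicas = sum(pg_num * size for pg_num, size in pools)
    return total_replicas / num_osds


# (pg_num, size) for rbd, downloads_data, downloads_metadata as quoted below
pools = [(64, 2), (250, 2), (250, 2)]

current = pg_replicas_per_osd(pools, num_osds=7)
print(f"now: {current:.0f} PG replicas per OSD")

# Hypothetical: same pg_num everywhere, but every pool at size 3
at_size_3 = pg_replicas_per_osd([(n, 3) for n, _ in pools], num_osds=7)
print(f"at size 3: {at_size_3:.0f} PG replicas per OSD")
```

The first figure comes out at about 161, which is in line with the PGS column of the quoted `ceph osd df` output. Every extra pool or replica adds to that per-OSD count.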

It would still be nice to see the output from those commands I asked about.

The built-in reweighting commands might help your data distribution, e.g.
`ceph osd reweight-by-utilization`.
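
As a rough illustration of which OSDs such a reweight would target, the Python sketch below flags OSDs whose utilization exceeds a percentage of the mean. It assumes a threshold of 120 (percent of mean utilization), which I believe is the command's default; treat that as an assumption and pass your own value. The function name is hypothetical, and the sketch only mimics the selection idea, not the new weights Ceph would compute.

```python
def overloaded_osds(utilization, threshold_pct=120):
    """Return OSD ids whose %USE exceeds threshold_pct of the mean %USE.

    This only mimics the selection idea; it does not compute new weights.
    """
    mean = sum(utilization.values()) / len(utilization)
    cutoff = mean * threshold_pct / 100
    return sorted(osd for osd, use in utilization.items() if use > cutoff)


# %USE per OSD id, copied from the `ceph osd df` output quoted below
use_pct = {0: 83.35, 1: 57.48, 2: 59.10, 3: 64.23, 4: 90.36, 5: 72.68, 6: 45.24}

print(overloaded_osds(use_pct))
```

With the quoted figures this picks osd.0 and osd.4. On a real cluster, `ceph osd test-reweight-by-utilization` reports what would change without applying it, which is a safer first step while backfill is still running.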

On Sun, Nov 12, 2017, 11:41 AM gjprabu <gjprabu at zohocorp.com> wrote:

> Hi David,
>
> Thanks for your valuable reply. Once the backfilling for the new OSD
> completes, we will consider increasing the replica value as soon as
> possible. Is it possible to decrease the metadata PG count? If the
> metadata pool has the same PG count as the data pool, what kind of
> issues may occur?
>
> Regards
> PrabuGJ
>
>
>
> ---- On Sun, 12 Nov 2017 21:25:05 +0530 David Turner<drakonstein at gmail.com>
> wrote ----
>
> What's the output of `ceph df` to see if your PG counts are good or not?
> Like everyone else has said, the space on the original osds can't be
> expected to free up until the backfill from adding the new osd has finished.
>
> You don't have anything in your cluster health to indicate that your
> cluster will not be able to finish this backfilling operation on its own.
>
> You might find this URL helpful in calculating your PG counts:
> http://ceph.com/pgcalc/  As a side note, it is generally better to keep
> your PG counts at powers of 2 (16, 64, 256, etc). When the count is not a
> power of 2, some of your PGs will take up twice as much space as
> others. In your case with 250, you have 244 PGs that are the same size and
> 6 PGs that are twice the size of those 244 PGs.  Bumping that up to 256
> will even things out.
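
The 244/6 split can be reproduced with a few lines of Python. This is a sketch modeled on my understanding of how Ceph folds object hashes onto a pg_num that is not a power of 2 (the `ceph_stable_mod` helper in the Ceph source). `stable_mod` and `pg_share_counts` are illustrative names, and the sketch counts equal-sized hash slots rather than real objects.

```python
from collections import Counter


def stable_mod(x, b, bmask):
    """Map hash x onto b PGs; bmask is (next power of two >= b) - 1."""
    return x & bmask if (x & bmask) < b else x & (bmask >> 1)


def pg_share_counts(pg_num):
    """Count how many of the 2^k hash slots land in each PG."""
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    return Counter(stable_mod(x, pg_num, bmask) for x in range(bmask + 1))


shares = pg_share_counts(250)
double = [pg for pg, n in shares.items() if n == 2]
single = [pg for pg, n in shares.items() if n == 1]
print(len(single), "PGs with one share,", len(double), "PGs with two shares")
```

With pg_num 250 the hash space is cut into 256 slots, and the 6 slots that have no PG of their own fold back onto existing PGs, which is why those PGs hold roughly twice the data. At pg_num 256 every PG gets exactly one slot.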
>
> Assuming that the metadata pool is for a CephFS volume, you do not need
> nearly so many PGs for that pool. Also, I would recommend changing at least
> the metadata pool to 3 replica_size. If we can talk you into 3 replica for
> everything else, great! But if not, at least do the metadata pool. If you
> lose an object in the data pool, you just lose that file. If you lose an
> object in the metadata pool, you might lose access to the entire CephFS
> volume.
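
Before moving every pool to 3 replicas it is worth checking the raw capacity. The Python sketch below is back-of-the-envelope arithmetic only: it assumes raw usage scales linearly with the replica count and that all of the used space belongs to size-2 pools, neither of which is exact while backfill is in flight. The totals are copied from the quoted `ceph osd df` output; the function name is hypothetical.

```python
def raw_needed_gb(raw_used_gb, current_size, new_size):
    """Scale raw usage linearly with the replica count (a simplification)."""
    return raw_used_gb * new_size / current_size


raw_used = 15843   # TOTAL USE in G from the `ceph osd df` output below
raw_total = 23476  # TOTAL SIZE in G from the same output

needed = raw_needed_gb(raw_used, current_size=2, new_size=3)
print(f"raw needed at size 3: {needed:.0f}G of {raw_total}G")
print("fits" if needed < raw_total else "does not fit without more capacity")
```

Under those assumptions, size 3 across the board would need slightly more raw space than the cluster currently has, so the data pool would probably need extra OSDs first. The metadata pool is typically small, so raising only that pool to size 3 should cost far less.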
>
> On Sun, Nov 12, 2017, 9:39 AM gjprabu <gjprabu at zohocorp.com> wrote:
>
> Hi Cassiano,
>
>        Thanks for your valuable feedback; we will wait until the new
> OSD finishes syncing. Will increasing the PG count solve the issue? In
> our setup the PG number for both the data and metadata pools is 250. Is
> this correct for 7 OSDs with 2 replicas? The data currently stored is
> 17TB.
>
> ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
> 0 3.29749  1.00000  3376G  2814G  562G 83.35 1.23 165
> 1 3.26869  1.00000  3347G  1923G 1423G 57.48 0.85 152
> 2 3.27339  1.00000  3351G  1980G 1371G 59.10 0.88 161
> 3 3.24089  1.00000  3318G  2131G 1187G 64.23 0.95 168
> 4 3.24089  1.00000  3318G  2998G  319G 90.36 1.34 176
> 5 3.32669  1.00000  3406G  2476G  930G 72.68 1.08 165
> 6 3.27800  1.00000  3356G  1518G 1838G 45.24 0.67 166
>               TOTAL 23476G 15843G 7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ceph osd tree
> ID WEIGHT   TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 22.92604 root default
> -2  3.29749     host intcfs-osd1
> 0  3.29749         osd.0             up  1.00000          1.00000
> -3  3.26869     host intcfs-osd2
> 1  3.26869         osd.1             up  1.00000          1.00000
> -4  3.27339     host intcfs-osd3
> 2  3.27339         osd.2             up  1.00000          1.00000
> -5  3.24089     host intcfs-osd4
> 3  3.24089         osd.3             up  1.00000          1.00000
> -6  3.24089     host intcfs-osd5
> 4  3.24089         osd.4             up  1.00000          1.00000
> -7  3.32669     host intcfs-osd6
> 5  3.32669         osd.5             up  1.00000          1.00000
> -8  3.27800     host intcfs-osd7
> 6  3.27800         osd.6             up  1.00000          1.00000
>
> ceph osd pool ls detail
>
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags
> hashpspool crash_replay_interval 45 stripe_width 0
> pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset
> 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags
> hashpspool stripe_width 0
>
> Regards
> Prabu GJ
>
> ---- On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius
> <cassiano at tips.com.br> wrote ----
>
> I am also not an expert, but it looks like you have large volumes of data
> in few PGs. From what I've seen, a PG's data is only deleted from the old
> OSD once it has been completely copied to the new OSD.
>
> So if one PG holds 100G, for example, the space on the old OSD is released
> only when the PG has been fully copied to the new OSD.
>
> If you have a busy cluster/network, it may take a good while. Maybe just
> wait a little and check from time to time; the space will eventually be
> released.
>
> On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> I’m not an expert either, so if someone on the list has some ideas on this
> problem, don’t be shy, share them with us.
>
> For now, my only hypothesis is that the OSD space will be recovered as
> soon as the recovery process is complete.
> I hope everything gets back in order soon (before reaching 95% or above).
>
> I saw some messages on the list about the fstrim tool, which can help
> reclaim unused free space, but I don’t know if it applies to your case.
>
> Cordialement / Best regards,
>
> Sébastien VIGNERON
> CRIANN,
> Ingénieur / Engineer
> Technopôle du Madrillet
> 745, avenue de l'Université
> 76800 Saint-Etienne du Rouvray - France
>
> tél. +33 2 32 91 42 91
> fax. +33 2 32 91 42 92
> http://www.criann.fr
> mailto:sebastien.vigneron at criann.fr
> support: support at criann.fr
>
> On 12 Nov 2017, at 13:29, gjprabu <gjprabu at zohocorp.com> wrote:
>
> Hi Sebastien,
>
>     The details you asked for are below. I am not much of an expert and
> am still learning. The PGs were not in a stuck state before adding the
> OSD, and they are slowly clearing to active+clean. This morning there
> were around 53 PGs in active+undersized+degraded+remapped+wait_backfill
> and now there are only 21, so I hope it is progressing; I can see the
> used space increasing on the newly added OSD (osd.6).
>
>
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
> 0 3.29749  1.00000  3376G  2814G  562G 83.35 1.23 165  (available space not reduced after adding new OSD)
> 1 3.26869  1.00000  3347G  1923G 1423G 57.48 0.85 152
> 2 3.27339  1.00000  3351G  1980G 1371G 59.10 0.88 161
> 3 3.24089  1.00000  3318G  2131G 1187G 64.23 0.95 168
> 4 3.24089  1.00000  3318G  2998G  319G 90.36 1.34 176  (available space not reduced after adding new OSD)
> 5 3.32669  1.00000  3406G  2476G  930G 72.68 1.08 165  (available space not reduced after adding new OSD)
> 6 3.27800  1.00000  3356G  1518G 1838G 45.24 0.67 166
>               TOTAL 23476G 15843G 7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ...
>