[ceph-users] cephfs-journal-tool event recover_dentries summary killed due to memory usage

Rhian Resnick xantho at sepiidae.com
Sat Nov 3 06:45:25 PDT 2018


Having attempted a recovery with the journal tool and had it fail, we are
going to rebuild our metadata using a separate metadata pool.

We have the following procedure we are going to use. The step I haven't
figured out yet (likely lack of sleep) is how to replace the original
metadata pool in the cephfs so we can continue to use the default name,
and then how to remove the secondary file system.

# create the recovery filesystem

ceph fs flag set enable_multiple true --yes-i-really-mean-it
ceph osd pool create recovery 512 replicated replicated_ruleset
ceph fs new recovery-fs recovery cephfs-cold --allow-dangerous-metadata-overlay
cephfs-data-scan init --force-init --filesystem recovery-fs --alternate-pool recovery
ceph fs reset recovery-fs --yes-i-really-mean-it


# create structure
cephfs-table-tool recovery-fs:all reset session
cephfs-table-tool recovery-fs:all reset snap
cephfs-table-tool recovery-fs:all reset inode

# build new metadata

# scan_extents

cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 0 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 1 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 2 --worker_m 4 --filesystem cephfs cephfs-cold
cephfs-data-scan scan_extents --alternate-pool recovery --worker_n 3 --worker_m 4 --filesystem cephfs cephfs-cold

# scan inodes
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 0 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 1 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 2 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold
cephfs-data-scan scan_inodes --alternate-pool recovery --worker_n 3 --worker_m 4 --filesystem cephfs --force-corrupt --force-init cephfs-cold

cephfs-data-scan scan_links --filesystem recovery-fs

# need help: swap the recovered metadata pool in under the default name, then remove recovery-fs
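For that missing step, the sequence we are currently considering is below. This is an untested sketch only: the fail/cluster_down handling differs between releases (our cluster is Luminous-era), and we would appreciate corrections before we run any of it. The pool and filesystem names match the commands above.

```shell
# UNTESTED sketch - please sanity-check before use.
# 1. Stop and remove the damaged default filesystem.
#    ("ceph fs rm" removes only the filesystem entry; the pools survive.)
ceph fs set cephfs cluster_down true      # newer releases: ceph fs fail cephfs
ceph mds fail 0                           # repeat for each remaining active rank
ceph fs rm cephfs --yes-i-really-mean-it

# 2. Remove the temporary filesystem entry so its pools can be reused:
ceph fs rm recovery-fs --yes-i-really-mean-it

# 3. Recreate the default name on the recovered metadata pool plus the
#    existing data pool; both already hold data, hence the overlay flag:
ceph fs new cephfs recovery cephfs-cold --allow-dangerous-metadata-overlay
ceph fs reset cephfs --yes-i-really-mean-it
```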

Thanks

Rhian

On Fri, Nov 2, 2018 at 9:47 PM Rhian Resnick <xantho at sepiidae.com> wrote:

> I was posting with my office account but I think it is being blocked.
>
> Our cephfs's metadata pool grew from 1GB to 1TB in a matter of hours, and
> after exhausting all storage on the OSDs the filesystem now reports two damaged ranks.
>
> The cephfs-journal-tool crashes when performing any operations due to
> memory utilization.
>
> We tried a backup, which crashed (we then did a rados cppool to back up our
> metadata).
> I then tried to run a dentry recovery which failed due to memory usage.
>
> Any recommendations for the next step?
>
> Data from our config and status
>
>
>
>
> Combined logs (after marking things as repaired to see if that would rescue us):
>
>
> Nov  1 10:07:02 ceph-p-mds2 ceph-mds: 2018-11-01 10:07:02.045499 7f68db7a3700 -1 mds.4.purge_queue operator(): Error -108 loading Journaler
> Nov  1 10:26:40 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:40.968143 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)
> Nov  1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.914934 7f6dacd69700 -1 mds.1.journaler.mdlog(ro) try_read_entry: decode error from _is_readable
> Nov  1 10:26:47 ceph-storage2 ceph-mds: mds.1 10.141.255.202:6898/1492854021 1 : Error loading MDS rank 1: (22) Invalid argument
> Nov  1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.914949 7f6dacd69700  0 mds.1.log _replay journaler got error -22, aborting
> Nov  1 10:26:47 ceph-storage2 ceph-mds: 2018-11-01 10:26:47.915745 7f6dacd69700 -1 log_channel(cluster) log [ERR] : Error loading MDS rank 1: (22) Invalid argument
> Nov  1 10:26:47 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:47.999432 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 2 mds daemons damaged (MDS_DAMAGE)
> Nov  1 10:26:55 ceph-p-mon2 ceph-mon: 2018-11-01 10:26:55.026231 7fa3b57ce700 -1 log_channel(cluster) log [ERR] : Health check update: 1 mds daemon damaged (MDS_DAMAGE)
>
> Ceph status: (The missing and out OSDs are in a different pool from all data; these were the bad SSDs that caused the issue)
>
>
>   cluster:
>     id:     6a2e8f21-bca2-492b-8869-eecc995216cc
>     health: HEALTH_ERR
>             1 filesystem is degraded
>             2 mds daemons damaged
>
>   services:
>     mon: 3 daemons, quorum ceph-p-mon2,ceph-p-mon1,ceph-p-mon3
>     mgr: ceph-p-mon1(active), standbys: ceph-p-mon2
>     mds: cephfs-3/5/5 up  {0=ceph-storage3=up:resolve,2=ceph-p-mon3=up:resolve,4=ceph-p-mds1=up:resolve}, 3 up:standby, 2 damaged
>     osd: 170 osds: 167 up, 158 in
>
>   data:
>     pools:   7 pools, 7520 pgs
>     objects: 188.46M objects, 161TiB
>     usage:   275TiB used, 283TiB / 558TiB avail
>     pgs:     7511 active+clean
>              9    active+clean+scrubbing+deep
>
>   io:
>     client:   0B/s rd, 17.2KiB/s wr, 0op/s rd, 1op/s wr
>
>
>
> Ceph OSD Tree:
>
> ID  CLASS WEIGHT    TYPE NAME                  STATUS REWEIGHT PRI-AFF
> -10               0 root deefault
>  -9         5.53958 root ssds
> -11         1.89296     host ceph-cache1
>  35   hdd   1.09109         osd.35                 up        0 1.00000
> 181   hdd   0.26729         osd.181                up        0 1.00000
> 182   hdd   0.26729         osd.182              down        0 1.00000
> 183   hdd   0.26729         osd.183              down        0 1.00000
> -12         1.75366     host ceph-cache2
>  46   hdd   1.09109         osd.46                 up        0 1.00000
> 185   hdd   0.26729         osd.185              down        0 1.00000
> 186   hdd   0.12799         osd.186                up        0 1.00000
> 187   hdd   0.26729         osd.187                up        0 1.00000
> -13         1.89296     host ceph-cache3
>  60   hdd   1.09109         osd.60                 up        0 1.00000
> 189   hdd   0.26729         osd.189                up        0 1.00000
> 190   hdd   0.26729         osd.190                up        0 1.00000
> 191   hdd   0.26729         osd.191                up        0 1.00000
>  -5         4.33493 root ssds-ro
>  -6         1.44498     host ceph-storage1-ssd
>  85   ssd   0.72249         osd.85                 up  1.00000 1.00000
>  89   ssd   0.72249         osd.89                 up  1.00000 1.00000
>  -7         1.44498     host ceph-storage2-ssd
>   5   ssd   0.72249         osd.5                  up  1.00000 1.00000
>  68   ssd   0.72249         osd.68                 up  1.00000 1.00000
>  -8         1.44498     host ceph-storage3-ssd
> 160   ssd   0.72249         osd.160                up  1.00000 1.00000
> 163   ssd   0.72249         osd.163                up  1.00000 1.00000
>  -1       552.07568 root default
>  -2       177.96744     host ceph-storage1
>   0   hdd   3.63199         osd.0                  up  1.00000 1.00000
>   1   hdd   3.63199         osd.1                  up  1.00000 1.00000
>   3   hdd   3.63199         osd.3                  up  1.00000 1.00000
>   4   hdd   3.63199         osd.4                  up  1.00000 1.00000
>   6   hdd   3.63199         osd.6                  up  1.00000 1.00000
>   8   hdd   3.63199         osd.8                  up  1.00000 1.00000
>  11   hdd   3.63199         osd.11                 up  1.00000 1.00000
>  13   hdd   3.63199         osd.13                 up  1.00000 1.00000
>  15   hdd   3.63199         osd.15                 up  1.00000 1.00000
>  18   hdd   3.63199         osd.18                 up  1.00000 1.00000
>  20   hdd   3.63199         osd.20                 up  1.00000 1.00000
>  22   hdd   3.63199         osd.22                 up  1.00000 1.00000
>  25   hdd   3.63199         osd.25                 up  1.00000 1.00000
>  27   hdd   3.63199         osd.27                 up  1.00000 1.00000
>  29   hdd   3.63199         osd.29                 up  1.00000 1.00000
>  32   hdd   3.63199         osd.32                 up  1.00000 1.00000
>  34   hdd   3.63199         osd.34                 up  1.00000 1.00000
>  36   hdd   3.63199         osd.36                 up  1.00000 1.00000
>  39   hdd   3.63199         osd.39                 up  1.00000 1.00000
>  41   hdd   3.63199         osd.41                 up  1.00000 1.00000
>  43   hdd   3.63199         osd.43                 up  1.00000 1.00000
>  48   hdd   3.63199         osd.48                 up  1.00000 1.00000
>  50   hdd   3.63199         osd.50                 up  1.00000 1.00000
>  52   hdd   3.63199         osd.52                 up  1.00000 1.00000
>  55   hdd   3.63199         osd.55                 up  1.00000 1.00000
>  62   hdd   3.63199         osd.62                 up  1.00000 1.00000
>  65   hdd   3.63199         osd.65                 up  1.00000 1.00000
>  66   hdd   3.63199         osd.66                 up  1.00000 1.00000
>  67   hdd   3.63199         osd.67                 up  1.00000 1.00000
>  70   hdd   3.63199         osd.70                 up  1.00000 1.00000
>  72   hdd   3.63199         osd.72                 up  1.00000 1.00000
>  74   hdd   3.63199         osd.74                 up  1.00000 1.00000
>  76   hdd   3.63199         osd.76                 up  1.00000 1.00000
>  79   hdd   3.63199         osd.79                 up  1.00000 1.00000
>  92   hdd   3.63199         osd.92                 up  1.00000 1.00000
>  94   hdd   3.63199         osd.94                 up  1.00000 1.00000
>  97   hdd   3.63199         osd.97                 up  1.00000 1.00000
>  99   hdd   3.63199         osd.99                 up  1.00000 1.00000
> 101   hdd   3.63199         osd.101                up  1.00000 1.00000
> 104   hdd   3.63199         osd.104                up  1.00000 1.00000
> 107   hdd   3.63199         osd.107                up  1.00000 1.00000
> 111   hdd   3.63199         osd.111                up  1.00000 1.00000
> 112   hdd   3.63199         osd.112                up  1.00000 1.00000
> 114   hdd   3.63199         osd.114                up  1.00000 1.00000
> 117   hdd   3.63199         osd.117                up  1.00000 1.00000
> 119   hdd   3.63199         osd.119                up  1.00000 1.00000
> 131   hdd   3.63199         osd.131                up  1.00000 1.00000
> 137   hdd   3.63199         osd.137                up  1.00000 1.00000
> 139   hdd   3.63199         osd.139                up  1.00000 1.00000
>  -4       177.96744     host ceph-storage2
>   7   hdd   3.63199         osd.7                  up  1.00000 1.00000
>  10   hdd   3.63199         osd.10                 up  1.00000 1.00000
>  12   hdd   3.63199         osd.12                 up  1.00000 1.00000
>  14   hdd   3.63199         osd.14                 up  1.00000 1.00000
>  16   hdd   3.63199         osd.16                 up  1.00000 1.00000
>  19   hdd   3.63199         osd.19                 up  1.00000 1.00000
>  21   hdd   3.63199         osd.21                 up  1.00000 1.00000
>  23   hdd   3.63199         osd.23                 up  1.00000 1.00000
>  26   hdd   3.63199         osd.26                 up  1.00000 1.00000
>  28   hdd   3.63199         osd.28                 up  1.00000 1.00000
>  30   hdd   3.63199         osd.30                 up  1.00000 1.00000
>  33   hdd   3.63199         osd.33                 up  1.00000 1.00000
>  37   hdd   3.63199         osd.37                 up  1.00000 1.00000
>  40   hdd   3.63199         osd.40                 up  1.00000 1.00000
>  42   hdd   3.63199         osd.42                 up  1.00000 1.00000
>  44   hdd   3.63199         osd.44                 up  1.00000 1.00000
>  47   hdd   3.63199         osd.47                 up  1.00000 1.00000
>  49   hdd   3.63199         osd.49                 up  1.00000 1.00000
>  51   hdd   3.63199         osd.51                 up  1.00000 1.00000
>  54   hdd   3.63199         osd.54                 up  1.00000 1.00000
>  56   hdd   3.63199         osd.56                 up  1.00000 1.00000
>  57   hdd   3.63199         osd.57                 up  1.00000 1.00000
>  59   hdd   3.63199         osd.59                 up  1.00000 1.00000
>  61   hdd   3.63199         osd.61                 up  1.00000 1.00000
>  63   hdd   3.63199         osd.63                 up  1.00000 1.00000
>  71   hdd   3.63199         osd.71                 up  1.00000 1.00000
>  73   hdd   3.63199         osd.73                 up  1.00000 1.00000
>  75   hdd   3.63199         osd.75                 up  1.00000 1.00000
>  78   hdd   3.63199         osd.78                 up  1.00000 1.00000
>  80   hdd   3.63199         osd.80                 up  1.00000 1.00000
>  81   hdd   3.63199         osd.81                 up  1.00000 1.00000
>  83   hdd   3.63199         osd.83                 up  1.00000 1.00000
>  84   hdd   3.63199         osd.84                 up  1.00000 1.00000
>  90   hdd   3.63199         osd.90                 up  1.00000 1.00000
>  91   hdd   3.63199         osd.91                 up  1.00000 1.00000
>  93   hdd   3.63199         osd.93                 up  1.00000 1.00000
>  96   hdd   3.63199         osd.96                 up  1.00000 1.00000
>  98   hdd   3.63199         osd.98                 up  1.00000 1.00000
> 100   hdd   3.63199         osd.100                up  1.00000 1.00000
> 102   hdd   3.63199         osd.102                up  1.00000 1.00000
> 105   hdd   3.63199         osd.105                up  1.00000 1.00000
> 106   hdd   3.63199         osd.106                up  1.00000 1.00000
> 108   hdd   3.63199         osd.108                up  1.00000 1.00000
> 110   hdd   3.63199         osd.110                up  1.00000 1.00000
> 115   hdd   3.63199         osd.115                up  1.00000 1.00000
> 116   hdd   3.63199         osd.116                up  1.00000 1.00000
> 121   hdd   3.63199         osd.121                up  1.00000 1.00000
> 123   hdd   3.63199         osd.123                up  1.00000 1.00000
> 132   hdd   3.63199         osd.132                up  1.00000 1.00000
>  -3       196.14078     host ceph-storage3
>   2   hdd   3.63199         osd.2                  up  1.00000 1.00000
>   9   hdd   3.63199         osd.9                  up  1.00000 1.00000
>  17   hdd   3.63199         osd.17                 up  1.00000 1.00000
>  24   hdd   3.63199         osd.24                 up  1.00000 1.00000
>  31   hdd   3.63199         osd.31                 up  1.00000 1.00000
>  38   hdd   3.63199         osd.38                 up  1.00000 1.00000
>  45   hdd   3.63199         osd.45                 up  1.00000 1.00000
>  53   hdd   3.63199         osd.53                 up  1.00000 1.00000
>  58   hdd   3.63199         osd.58                 up  1.00000 1.00000
>  64   hdd   3.63199         osd.64                 up  1.00000 1.00000
>  69   hdd   3.63199         osd.69                 up  1.00000 1.00000
>  77   hdd   3.63199         osd.77                 up  1.00000 1.00000
>  82   hdd   3.63199         osd.82                 up  1.00000 1.00000
>  86   hdd   3.63199         osd.86                 up  1.00000 1.00000
>  88   hdd   3.63199         osd.88                 up  1.00000 1.00000
>  95   hdd   3.63199         osd.95                 up  1.00000 1.00000
> 103   hdd   3.63199         osd.103                up  1.00000 1.00000
> 109   hdd   3.63199         osd.109                up  1.00000 1.00000
> 113   hdd   3.63199         osd.113                up  1.00000 1.00000
> 120   hdd   3.63199         osd.120                up  1.00000 1.00000
> 127   hdd   3.63199         osd.127                up  1.00000 1.00000
> 134   hdd   3.63199         osd.134                up  1.00000 1.00000
> 140   hdd   3.63869         osd.140                up  1.00000 1.00000
> 141   hdd   3.63199         osd.141                up  1.00000 1.00000
> 143   hdd   3.63199         osd.143                up  1.00000 1.00000
> 144   hdd   3.63199         osd.144                up  1.00000 1.00000
> 145   hdd   3.63199         osd.145                up  1.00000 1.00000
> 146   hdd   3.63199         osd.146                up  1.00000 1.00000
> 147   hdd   3.63199         osd.147                up  1.00000 1.00000
> 148   hdd   3.63199         osd.148                up  1.00000 1.00000
> 149   hdd   3.63199         osd.149                up  1.00000 1.00000
> 150   hdd   3.63199         osd.150                up  1.00000 1.00000
> 151   hdd   3.63199         osd.151                up  1.00000 1.00000
> 152   hdd   3.63199         osd.152                up  1.00000 1.00000
> 153   hdd   3.63199         osd.153                up  1.00000 1.00000
> 154   hdd   3.63199         osd.154                up  1.00000 1.00000
> 155   hdd   3.63199         osd.155                up  1.00000 1.00000
> 156   hdd   3.63199         osd.156                up  1.00000 1.00000
> 157   hdd   3.63199         osd.157                up  1.00000 1.00000
> 158   hdd   3.63199         osd.158                up  1.00000 1.00000
> 159   hdd   3.63199         osd.159                up  1.00000 1.00000
> 161   hdd   3.63199         osd.161                up  1.00000 1.00000
> 162   hdd   3.63199         osd.162                up  1.00000 1.00000
> 164   hdd   3.63199         osd.164                up  1.00000 1.00000
> 165   hdd   3.63199         osd.165                up  1.00000 1.00000
> 167   hdd   3.63199         osd.167                up  1.00000 1.00000
> 168   hdd   3.63199         osd.168                up  1.00000 1.00000
> 169   hdd   3.63199         osd.169                up  1.00000 1.00000
> 170   hdd   3.63199         osd.170                up  1.00000 1.00000
> 171   hdd   3.63199         osd.171                up  1.00000 1.00000
> 172   hdd   3.63199         osd.172                up  1.00000 1.00000
> 173   hdd   3.63199         osd.173                up  1.00000 1.00000
> 174   hdd   3.63869         osd.174                up  1.00000 1.00000
> 177   hdd   3.63199         osd.177                up  1.00000 1.00000
>
>
>
> # Ceph configuration shared by all nodes
>
>
> [global]
> fsid = 6a2e8f21-bca2-492b-8869-eecc995216cc
> public_network = 10.141.0.0/16
> cluster_network = 10.85.8.0/22
> mon_initial_members = ceph-p-mon1, ceph-p-mon2, ceph-p-mon3
> mon_host = 10.141.161.248,10.141.160.250,10.141.167.237
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
>
>
> # Cephfs needs these to be set to support larger directories
> mds_bal_frag = true
> allow_dirfrags = true
>
> rbd_default_format = 2
> mds_beacon_grace = 60
> mds session timeout = 120
>
> log to syslog = true
> err to syslog = true
> clog to syslog = true
>
>
> [mds]
>
> [osd]
> osd op threads = 32
> osd max backfills = 32
>
>
>
>
>
> # Old method of moving ssds to a pool
>
> [osd.85]
> host = ceph-storage1
> crush_location =  root=ssds host=ceph-storage1-ssd
>
> [osd.89]
> host = ceph-storage1
> crush_location =  root=ssds host=ceph-storage1-ssd
>
> [osd.160]
> host = ceph-storage3
> crush_location =  root=ssds host=ceph-storage3-ssd
>
> [osd.163]
> host = ceph-storage3
> crush_location =  root=ssds host=ceph-storage3-ssd
>
> [osd.166]
> host = ceph-storage3
> crush_location =  root=ssds host=ceph-storage3-ssd
>
> [osd.5]
> host = ceph-storage2
> crush_location =  root=ssds host=ceph-storage2-ssd
>
> [osd.68]
> host = ceph-storage2
> crush_location =  root=ssds host=ceph-storage2-ssd
>
> [osd.87]
> host = ceph-storage2
> crush_location =  root=ssds host=ceph-storage2-ssd
>
>
>
>
>
>