[ceph-users] Huge latency spikes

Alex Litvak alexander.v.litvak at gmail.com
Sat Nov 17 14:52:02 PST 2018


The plot thickens:

I checked the C-states, and apparently all CPUs are online and sitting in C1. It also turns out the servers were tuned to use the latency-performance profile:

  tuned-adm active
Current active profile: latency-performance

turbostat shows:
  Package    Core     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz     SMI  CPU%c1  CPU%c3  CPU%c6  CPU%c7 CoreTmp  PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt RAMWatt   PKG_%   RAM_%
        -       -       -      22    0.84    2600    2400       0   99.16    0.00    0.00    0.00      49      58    0.00    0.00    0.00    0.00   69.51   17.29    0.00    0.00
        0       0       0      39    1.52    2600    2400       0   98.48    0.00    0.00    0.00      48      58    0.00    0.00    0.00    0.00   36.30    8.73    0.00    0.00
        0       0      12      15    0.56    2600    2400       0   99.44
        0       1       2      47    1.81    2600    2400       0   98.19    0.00    0.00    0.00      49
        0       1      14      17    0.66    2600    2400       0   99.34
        0       2       4      31    1.20    2600    2400       0   98.80    0.00    0.00    0.00      47
        0       2      16      18    0.71    2600    2400       0   99.29
        0       3       6      31    1.21    2600    2400       0   98.79    0.00    0.00    0.00      49
        0       3      18      39    1.50    2600    2400       0   98.50
        0       4       8      33    1.27    2600    2400       0   98.73    0.00    0.00    0.00      46
        0       4      20      17    0.64    2600    2400       0   99.36
        0       5      10      32    1.23    2600    2400       0   98.77    0.00    0.00    0.00      48
        0       5      22      20    0.76    2600    2400       0   99.24
        1       0       1      25    0.95    2600    2400       0   99.05    0.00    0.00    0.00      44      52    0.00    0.00    0.00    0.00   33.21    8.56    0.00    0.00
        1       0      13       9    0.34    2600    2400       0   99.66
        1       1       3       9    0.35    2600    2400       0   99.65    0.00    0.00    0.00      42
        1       1      15      11    0.42    2600    2400       0   99.58
        1       2       5      30    1.17    2600    2400       0   98.83    0.00    0.00    0.00      46
        1       2      17       7    0.28    2600    2400       0   99.72
        1       3       7      10    0.40    2600    2400       0   99.60    0.00    0.00    0.00      44
        1       3      19      10    0.37    2600    2400       0   99.63
        1       4       9       9    0.36    2600    2400       0   99.64    0.00    0.00    0.00      45
        1       4      21       7    0.27    2600    2400       0   99.73
        1       5      11      12    0.45    2600    2400       0   99.55    0.00    0.00    0.00      45
        1       5      23      46    1.76    2600    2400       0   98.24
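
To cross-check the turbostat residency figures without turbostat, the kernel's cpuidle counters can be read directly from sysfs. The following is only a minimal sketch, assuming a Linux host that exposes /sys/devices/system/cpu/cpuN/cpuidle/stateM/{name,time}; the function name read_cstate_residency is made up for illustration and is not part of any tool mentioned in this thread.

```python
import glob
import os


def read_cstate_residency(base="/sys/devices/system/cpu"):
    """Return {cpu: {state_name: microseconds_spent}} from cpuidle sysfs.

    Assumption: each cpuN/cpuidle/stateM directory holds a 'name' file
    (e.g. POLL, C1) and a 'time' file with cumulative residency in
    microseconds. Hosts without cpuidle yield an empty dict.
    """
    result = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpuidle", "state[0-9]*")
    for state_dir in glob.glob(pattern):
        # .../cpu12/cpuidle/state1 -> "cpu12"
        cpu = os.path.basename(os.path.dirname(os.path.dirname(state_dir)))
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        with open(os.path.join(state_dir, "time")) as f:
            usec = int(f.read().strip())
        result.setdefault(cpu, {})[name] = usec
    return result
```

Calling read_cstate_residency() twice a few seconds apart and comparing the counters shows which states the cores actually entered in that interval; if only the shallow states move, that would agree with the CPU%c1 column above.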

iostat for the SSD shows:

# iostat -xd -p sdb 1 1000

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.05   26.78     0.20  2299.53   171.42     0.02    0.64    0.11    0.64   0.08   0.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   16.00     0.00   392.00    49.00     0.00    0.06    0.00    0.06   0.06   0.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   74.00     0.00   880.00    23.78     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   56.00     0.00   240.00     8.57     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   44.00     0.00   676.00    30.73     0.00    0.07    0.00    0.07   0.05   0.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   10.00     0.00    92.00    18.40     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    6.00     0.00    84.00    28.00     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    1.00     0.00    20.00    40.00     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   25.00     0.00   212.00    16.96     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   14.00     0.00   100.00    14.29     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    5.00     0.00   112.00    44.80     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   13.00     0.00   508.00    78.15     0.00    0.15    0.00    0.15   0.15   0.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   49.00     0.00   820.00    33.47     0.01    0.10    0.00    0.10   0.08   0.40

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    7.00     0.00    52.00    14.86     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   18.00     0.00   180.00    20.00     0.00    0.06    0.00    0.06   0.06   0.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   34.00     0.00   476.00    28.00     0.00    0.06    0.00    0.06   0.06   0.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    1.00   12.00     4.00   156.00    24.62     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   32.00     0.00   940.00    58.75     0.00    0.03    0.00    0.03   0.03   0.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   13.00     0.00   456.00    70.15     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   37.00     0.00   536.00    28.97     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    6.00     0.00    60.00    20.00     0.00    0.17    0.00    0.17   0.17   0.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    3.00     0.00    48.00    32.00     0.00    0.00    0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00   10.00     0.00  1452.00   290.40     0.00    0.30    0.00    0.30   0.20   0.20
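
Scanning a thousand one-second samples by eye is error-prone, so the same output can be filtered for slow writes. This is a rough sketch, assuming the iostat -x column layout shown above (a "Device:" header line followed by one row per device); the helper names parse_iostat and slow_writes and the 5 ms default threshold are illustrative assumptions, not values recommended anywhere in this thread.

```python
def parse_iostat(text):
    """Parse `iostat -x` text into a list of dicts, one per device row.

    Column names come from the most recent 'Device' header line, so the
    parser does not hard-code the column order.
    """
    rows = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("Device"):
            columns = fields[1:]
            continue
        # Skip anything that is not a data row matching the header width.
        if columns is None or len(fields) != len(columns) + 1:
            continue
        try:
            values = [float(v) for v in fields[1:]]
        except ValueError:
            continue
        row = dict(zip(columns, values))
        row["device"] = fields[0]
        rows.append(row)
    return rows


def slow_writes(rows, threshold_ms=5.0):
    """Return rows whose w_await exceeds threshold_ms (assumed cutoff)."""
    return [r for r in rows if r.get("w_await", 0.0) > threshold_ms]
```

Feeding the captured output to slow_writes(parse_iostat(text)) leaves only the samples worth looking at. In the samples pasted above, w_await never goes over 0.64 ms, so nothing would be flagged at the assumed 5 ms cutoff.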


On 11/17/2018 3:42 PM, John Petrini wrote:
> You can check if C-states are enabled with cat /proc/acpi/processor/info. Look for power management: yes/no.
> 
> If they are enabled, you can check the current C-state of each core with cat /proc/acpi/processor/CPU?/power. State 0 is the CPU's normal operating state; any other state means the
> processor is in a power-saving mode.
> 
> C-states are configured in the BIOS, so a reboot is required to change them. I know with Dell servers you can trigger the change with omconfig and then issue a reboot for it to take effect. Otherwise
> you'll need to disable them directly in the BIOS.
> 
> As for the SSDs, I would just run iostat and check the iowait. If you see small disk writes causing high iowait, then your SSDs are probably at the end of their life. Ceph journaling is good at
> destroying SSDs.
> 
> 



