[ceph-users] ceph-iscsi upgrade issue

Steven Vacaroaia stef97 at gmail.com
Wed Oct 10 06:21:26 PDT 2018


Hi Jason,
Thanks for your prompt responses.

I have used the same iscsi-gateway.cfg file, with no security changes; I
only added the prometheus entry.
There is no iscsi-gateway.conf, but the gateway.conf object has been created
and has the correct entries.
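
For context, the prometheus entry relates to the "[Errno 97] Address family
not supported by protocol" traceback quoted further down, where the exporter
tried to listen on [::] and the socket could not be created. A minimal sketch
of how one might check this on a gateway before picking a bind address
follows. The helper names are hypothetical and are not part of ceph-iscsi; it
only tests whether an IPv6 socket can be created, which is the step that
failed in the traceback.

```python
import socket


def ipv6_usable():
    """Return True if this host can create an IPv6 TCP socket."""
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # e.g. errno 97 (EAFNOSUPPORT) when the IPv6 stack is disabled
        return False
    sock.close()
    return True


def default_bind_host():
    """Pick a wildcard address that the local network stack supports."""
    return "::" if ipv6_usable() else "0.0.0.0"


if __name__ == "__main__":
    print("suggested prometheus_host:", default_bind_host())
```

If this prints 0.0.0.0, the IPv4 wildcard is the value to try for
prometheus_host.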

iscsi-gateway.cfg is identical and contains the following:

[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
api_secure = false
trusted_ip_list = 10.10.30.181,10.10.30.182,10.10.30.183,10.10.30.184,10.10.30.185,10.10.30.186
prometheus_host = 0.0.0.0
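
To double-check that the settings really match between gateways, something
like the sketch below could diff the [config] sections. It is only an
illustration: it assumes each option sits on a single line (configparser
rejects unindented continuation lines), the IP list is shortened, and the
second sample is an invented mismatch used to show what a difference would
look like.

```python
import configparser


def load_gateway_config(text):
    """Parse iscsi-gateway.cfg text and return its [config] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["config"])


def diff_configs(a, b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Sample based on the file above (IP list shortened for the example).
OSD01_CFG = """
[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
api_secure = false
trusted_ip_list = 10.10.30.181,10.10.30.182
prometheus_host = 0.0.0.0
"""

# Hypothetical drift on a second gateway, invented for illustration only.
OSD03_CFG = OSD01_CFG.replace("api_secure = false", "api_secure = true")

if __name__ == "__main__":
    diff = diff_configs(load_gateway_config(OSD01_CFG),
                        load_gateway_config(OSD03_CFG))
    for option, (left, right) in diff.items():
        print(option, "differs:", left, "vs", right)
```

An empty result means the two [config] sections agree option by option.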



I am running the disk commands from OSD01, and they fail with the following:

INFO [gateway.py:344:load_config()] - (Gateway.load_config) successfully
loaded existing target definition
2018-10-10 09:04:48,956    DEBUG [gateway.py:423:map_luns()] - processing
tpg2
2018-10-10 09:04:48,956    DEBUG [gateway.py:428:map_luns()] - rbd.dstest
needed mapping to tpg2
2018-10-10 09:04:48,958     INFO [gateway.py:403:bind_alua_group_to_lun()]
- Setup group ao for rbd.dstest on tpg 2 (state 0, owner True, failover
type 1)
2018-10-10 09:04:48,958    DEBUG [gateway.py:405:bind_alua_group_to_lun()]
- Setting Luns tg_pt_gp to ao
2018-10-10 09:04:48,959    DEBUG [gateway.py:409:bind_alua_group_to_lun()]
- Bound rbd.dstest on tpg2 to ao
2018-10-10 09:04:48,959    DEBUG [gateway.py:423:map_luns()] - processing
tpg1
2018-10-10 09:04:48,959    DEBUG [gateway.py:428:map_luns()] - rbd.dstest
needed mapping to tpg1
2018-10-10 09:04:48,960     INFO [gateway.py:403:bind_alua_group_to_lun()]
- Setup group ano1 for rbd.dstest on tpg 1 (state 1, owner False, failover
type 1)
2018-10-10 09:04:48,960    DEBUG [gateway.py:405:bind_alua_group_to_lun()]
- Setting Luns tg_pt_gp to ano1
2018-10-10 09:04:48,961    DEBUG [gateway.py:409:bind_alua_group_to_lun()]
- Bound rbd.dstest on tpg1 to ano1
2018-10-10 09:04:48,963     INFO [_internal.py:87:_log()] - 127.0.0.1 - -
[10/Oct/2018 09:04:48] "PUT /api/_disk/rbd.dstest HTTP/1.1" 200 -
2018-10-10 09:04:48,965     INFO [rbd-target-api:1804:call_api()] - _disk
update on 127.0.0.1, successful
2018-10-10 09:04:48,965    DEBUG [rbd-target-api:1789:call_api()] -
processing GW 'osd03'
2018-10-10 09:04:49,039    ERROR [rbd-target-api:1810:call_api()] - _disk
change on osd03 failed with 500
2018-10-10 09:04:49,041     INFO [_internal.py:87:_log()] - 127.0.0.1 - -
[10/Oct/2018 09:04:49] "PUT /api/disk/rbd.dstest HTTP/1.1" 500 -
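
When the log is long, a small filter can help find which gateway rejected
which call. The sketch below is a rough example: the pattern is inferred
only from the ERROR line above, and it assumes that each entry sits on one
line in the actual log file (the lines above were wrapped by the mail
client).

```python
import re

# Pattern inferred from the "... change on <gateway> failed with <status>" line.
FAILED_CALL = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+ERROR\s+"
    r"\[(?P<src>[^\]]+)\]\s+-\s+(?P<op>\S+) change on (?P<gw>\S+) "
    r"failed with (?P<status>\d+)"
)


def failed_gateway_calls(lines):
    """Yield (timestamp, operation, gateway, http_status) for failed calls."""
    for line in lines:
        match = FAILED_CALL.match(line)
        if match:
            yield (match.group("ts"), match.group("op"),
                   match.group("gw"), int(match.group("status")))


# Two entries from the log above, joined back onto single lines.
SAMPLE = [
    "2018-10-10 09:04:48,965     INFO [rbd-target-api:1804:call_api()] - "
    "_disk update on 127.0.0.1, successful",
    "2018-10-10 09:04:49,039    ERROR [rbd-target-api:1810:call_api()] - "
    "_disk change on osd03 failed with 500",
]

if __name__ == "__main__":
    for ts, op, gw, status in failed_gateway_calls(SAMPLE):
        print(ts, op, "failed on", gw, "with HTTP", status)
```

The timestamp it reports is the one to look up in the log on the failing
gateway.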


On OSD03 there is the following "error":

 INFO [lun.py:656:add_dev_to_lio()] - (LUN.add_dev_to_lio) Adding image
'rbd.dstest' to LIO
2018-10-10 09:04:49,037    DEBUG [lun.py:666:add_dev_to_lio()] -
control="max_data_area_mb=8"

Amazingly enough, gwcli on OSD03 shows the disk as created, but gwcli on
OSD01 does not.
If I restart gwcli on OSD01, the disk is there, but it cannot be added to the
host because the image supposedly does not exist.


Adding the disk to the host failed with a "client masking update" error:

disk add rbd.dstest
CMD: ../hosts/<client_iqn> disk action=add disk=rbd.dstest
Client 'iqn.1998-01.com.vmware:test-2d06960a' update - add disk rbd.dstest
disk add for 'rbd.dstest' against iqn.1998-01.com.vmware:test-2d06960a
failed
client masking update failed on osd03. Client update failed

rbd-target-api:1216:_update_client()] - client update failed on
iqn.1998-01.com.vmware:test-2d06960a : Non-existent images ['rbd.dstest']
requested for iqn.1998-01.com.vmware:test-2d06960a

However, the image is listed in gwcli and by rados ls:

/disks> ls
o- disks ........................................ [150G, Disks: 1]
  o- rbd.dstest ................................. [dstest (150G)]

rados -p rbd ls | grep dstest
rbd_id.dstest
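
The same check can be scripted. The sketch below makes some assumptions:
that the gateway disk id has the form <pool>.<image> with no dot in the pool
name, and that the rados CLI and a usable keyring are available where it
runs. All function names are hypothetical. It only confirms what the cluster
holds and says nothing about what each gateway has cached.

```python
import subprocess


def split_disk_id(disk_id):
    """Split a gateway disk id such as 'rbd.dstest' into (pool, image).

    Assumption: the id is '<pool>.<image>' and the pool name has no dot.
    """
    pool, _, image = disk_id.partition(".")
    if not pool or not image:
        raise ValueError("expected '<pool>.<image>', got %r" % disk_id)
    return pool, image


def id_object_name(image):
    """Name of the rbd_id object that `rados ls` prints for an image."""
    return "rbd_id." + image


def image_in_listing(disk_id, listing):
    """True if the image's rbd_id object appears in `rados ls` output lines."""
    _, image = split_disk_id(disk_id)
    return id_object_name(image) in {line.strip() for line in listing}


def rados_ls(pool):
    """Run `rados -p <pool> ls` and return its output lines."""
    result = subprocess.run(["rados", "-p", pool, "ls"],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    pool, _ = split_disk_id("rbd.dstest")
    print(image_in_listing("rbd.dstest", rados_ls(pool)))
```

Running it on each gateway in turn would show whether both can see the same
object in the pool.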



I would really appreciate any help or suggestions.

Thanks
Steven

On Tue, 9 Oct 2018 at 16:35, Jason Dillaman <jdillama at redhat.com> wrote:

> Anything in the rbd-target-api.log on osd03 to indicate why it failed?
>
> Since you replaced your existing "iscsi-gateway.conf", do your
> security settings still match between the two hosts (i.e. on the
> trusted_ip_list, same api_XYZ options)?
> On Tue, Oct 9, 2018 at 4:25 PM Steven Vacaroaia <stef97 at gmail.com> wrote:
> >
> > So the gateways are up, but I have issues adding disks (i.e. if I add a
> > disk on one gateway it does not show on the other; however, after I
> > restart the rbd-target services I do see the disks).
> > Thanks in advance for taking the trouble to provide advice / guidance
> >
> > 2018-10-09 16:16:08,968     INFO [rbd-target-api:1804:call_api()] -
> _clientlun update on 127.0.0.1, successful
> > 2018-10-09 16:16:08,968    DEBUG [rbd-target-api:1789:call_api()] -
> processing GW 'osd03'
> > 2018-10-09 16:16:08,987    ERROR [rbd-target-api:1810:call_api()] -
> _clientlun change on osd03 failed with 500
> > 2018-10-09 16:16:08,987    DEBUG [rbd-target-api:1827:call_api()] -
> failed on osd03, applied to 127.0.0.1, aborted osd03. Client update failed
> > 2018-10-09 16:16:08,987     INFO [_internal.py:87:_log()] - 127.0.0.1 -
> - [09/Oct/2018 16:16:08] "PUT
> /api/clientlun/iqn.1998-01.com.vmware:test-2d06960a HTTP/1.1" 500 -
> >
> > On Tue, 9 Oct 2018 at 15:42, Steven Vacaroaia <stef97 at gmail.com> wrote:
> >>
> >> It worked.
> >>
> >> many thanks
> >> Steven
> >>
> >> On Tue, 9 Oct 2018 at 15:36, Jason Dillaman <jdillama at redhat.com>
> wrote:
> >>>
> >>> Can you try applying [1] and see if that resolves your issue?
> >>>
> >>> [1] https://github.com/ceph/ceph-iscsi-config/pull/78
> >>> On Tue, Oct 9, 2018 at 3:06 PM Steven Vacaroaia <stef97 at gmail.com>
> wrote:
> >>> >
> >>> > Thanks Jason
> >>> >
> >>> > adding prometheus_host = 0.0.0.0 to iscsi-gateway.cfg does not work
> - the error message is
> >>> >
> >>> > "..rbd-target-gw: ValueError: invalid literal for int() with base
> 10: '0.0.0.0' "
> >>> >
> >>> > adding prometheus_exporter = false works
> >>> >
> >>> > However I'd like to use prometheus_exporter if possible
> >>> > Any suggestions will be appreciated
> >>> >
> >>> > Steven
> >>> >
> >>> >
> >>> >
> >>> > On Tue, 9 Oct 2018 at 14:25, Jason Dillaman <jdillama at redhat.com>
> wrote:
> >>> >>
> >>> >> You can try adding "prometheus_exporter = false" in your
> >>> >> "/etc/ceph/iscsi-gateway.cfg"'s "config" section if you aren't using
> >>> >> "cephmetrics", or try setting "prometheus_host = 0.0.0.0" since it
> >>> >> sounds like you have the IPv6 stack disabled.
> >>> >>
> >>> >> [1]
> https://github.com/ceph/ceph-iscsi-config/blob/master/ceph_iscsi_config/settings.py#L90
> >>> >> On Tue, Oct 9, 2018 at 2:09 PM Steven Vacaroaia <stef97 at gmail.com>
> wrote:
> >>> >> >
> >>> >> > here is some info from /var/log/messages ..in case someone has
> the time to take a look
> >>> >> >
> >>> >> > Oct  9 13:58:35 osd03 systemd: Started Setup system to export rbd
> images through LIO.
> >>> >> > Oct  9 13:58:35 osd03 systemd: Starting Setup system to export
> rbd images through LIO...
> >>> >> > Oct  9 13:58:35 osd03 journal: Processing osd blacklist entries
> for this node
> >>> >> > Oct  9 13:58:35 osd03 journal: No OSD blacklist entries found
> >>> >> > Oct  9 13:58:35 osd03 journal: Reading the configuration object
> to update local LIO configuration
> >>> >> > Oct  9 13:58:35 osd03 journal: Configuration does not have an
> entry for this host(osd03) - nothing to define to LIO
> >>> >> > Oct  9 13:58:35 osd03 journal: Integrated Prometheus exporter is
> enabled
> >>> >> > Oct  9 13:58:35 osd03 journal: * Running on http://[::]:9287/
> >>> >> > Oct  9 13:58:35 osd03 journal: Removing iSCSI target from LIO
> >>> >> > Oct  9 13:58:35 osd03 journal: Removing LUNs from LIO
> >>> >> > Oct  9 13:58:35 osd03 journal: Active Ceph iSCSI gateway
> configuration removed
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: Traceback (most recent call
> last):
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/bin/rbd-target-gw", line 5, in <module>
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw:
> pkg_resources.run_script('ceph-iscsi-config==2.6', 'rbd-target-gw')
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw:
> self.require(requires)[0].run_script(script_name, ns)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1462, in
> run_script
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: exec_(script_code,
> namespace, namespace)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/pkg_resources.py", line 41, in exec_
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: exec("""exec code in globs,
> locs""")
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "<string>", line 1, in
> <module>
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw",
> line 432, in <module>
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw",
> line 379, in main
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/flask/app.py", line 772, in run
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: run_simple(host, port, self,
> **options)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 710, in
> run_simple
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: inner()
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 692, in inner
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: passthrough_errors,
> ssl_context).serve_forever()
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 480, in
> make_server
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: passthrough_errors,
> ssl_context)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 410, in
> __init__
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: HTTPServer.__init__(self,
> (host, int(port)), handler)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib64/python2.7/SocketServer.py", line 417, in __init__
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: self.socket_type)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File
> "/usr/lib64/python2.7/socket.py", line 187, in __init__
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: _sock = _realsocket(family,
> type, proto)
> >>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: socket.error: [Errno 97]
> Address family not supported by protocol
> >>> >> > Oct  9 13:58:35 osd03 systemd: rbd-target-gw.service: main
> process exited, code=exited, status=1/FAILURE
> >>> >> >
> >>> >> >
> >>> >> > On Tue, 9 Oct 2018 at 13:16, Steven Vacaroaia <stef97 at gmail.com>
> wrote:
> >>> >> >>
> >>> >> >> Hi ,
> >>> >> >> I am using Mimic 13.2 and kernel 4.18
> >>> >> >> Was using gwcli 2.5 and decided to upgrade to latest (2.7) as
> people reported improved performance
> >>> >> >>
> >>> >> >> What is the proper methodology ?
> >>> >> >> How should I troubleshoot this?
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >> What I did ( and it broke it) was
> >>> >> >>
> >>> >> >> cd tcmu-runner; git pull ; make && make install
> >>> >> >> cd ceph-iscsi-cli; git pull;python setup.py install
> >>> >> >> cd ceph-iscsi-config;git pull; python setup.py install
> >>> >> >> cd rtslib-fb;git pull;  python setup.py install
> >>> >> >>
> >>> >> >> After a reboot, I cannot start rbd-target-gw and the logs are
> not very helpful
> >>> >> >>  ( Note:
> >>> >> >>     I removed /etc/ceph/iscsi-gateway.cfg and gateway.conf
> object as I wanted to start fresh
> >>> >> >>      /etc/ceph/iscsi-gatway.conf was left unchanged )
> >>> >> >>
> >>> >> >>
> >>> >> >> 2018-10-09 12:47:50,593 [    INFO] - Processing osd blacklist
> entries for this node
> >>> >> >> 2018-10-09 12:47:50,893 [    INFO] - No OSD blacklist entries
> found
> >>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Reading the configuration
> object to update local LIO configuration
> >>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Configuration does not have
> an entry for this host(osd03) - nothing to define to LIO
> >>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Integrated Prometheus
> exporter is enabled
> >>> >> >> 2018-10-09 12:47:50,895 [    INFO] -  * Running on http://
> [::]:9287/
> >>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Removing iSCSI target from
> LIO
> >>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Removing LUNs from LIO
> >>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Active Ceph iSCSI gateway
> configuration removed
> >>> >> >>
> >>> >> >> Many thanks
> >>> >> >> Steven
> >>> >> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Jason
> >>>
> >>>
> >>>
> >>> --
> >>> Jason
>
>
>
> --
> Jason
>