[ceph-users] Cephfs Hadoop Plugin and CEPH integration

Aristeu Gil Alves Jr aristeu.jr at gmail.com
Mon Nov 27 09:55:39 PST 2017


Hi.

This is my first post to the list. First of all, I should say that I'm new
to Hadoop.

We are a small lab and have been running CephFS for almost two years,
loading it with large files (4GB to 4TB in size). Our cluster holds
approximately 400TB at ~75% usage, and we are planning to grow a lot.

Until now, we have processed most of the files the "serial reading" way.
Now we want to process these files in parallel, and we are looking at the
Hadoop plugin as a way to use MapReduce, or something similar.

Does the Hadoop plugin access CephFS over the network, as in a normal
cluster, or can I install Hadoop's processors on every Ceph node and
process the data locally?


Thanks and regards,

--
Aristeu

