Thursday, May 14, 2009

MogileFS is our open source distributed filesystem

http://www.danga.com/mogilefs/

About MogileFS

MogileFS is our open source distributed filesystem. Its properties and features include:

  • Application level -- no special kernel modules required.
  • No single point of failure -- all three components of a MogileFS setup (storage nodes, trackers, and the tracker's database(s)) can be run on multiple machines, so there's no single point of failure. (you can run trackers on the same machines as storage nodes, too, so you don't need 4 machines...) A minimum of 2 machines is recommended.
  • Automatic file replication -- files, based on their "class", are automatically replicated between enough different storage nodes as to satisfy the minimum replica count as requested by their class. For instance, for a photo hosting site you can make original JPEGs have a minimum replica count of 3, but thumbnails and scaled versions only have a replica count of 1 or 2. If you lose the only copy of a thumbnail, the application can just rebuild it. In this way, MogileFS (without RAID) can save money on disks that would otherwise be storing multiple copies of data unnecessarily.
  • "Better than RAID" -- in a non-SAN RAID setup, the disks are redundant, but the host isn't. If you lose the entire machine, the files are inaccessible. MogileFS replicates the files between devices which are on different hosts, so files are always available.
  • Flat Namespace -- Files are identified by named keys in a flat, global namespace. You can create as many namespaces as you'd like, so multiple applications with potentially conflicting keys can run on the same MogileFS installation.
  • Shared-Nothing -- MogileFS doesn't depend on a pricey SAN with shared disks. Every machine maintains its own local disks.
  • No RAID required -- Local disks on MogileFS storage nodes can be in a RAID, or not. It's cheaper not to, as RAID doesn't buy you any safety that MogileFS doesn't already provide.
  • Local filesystem agnostic -- Local disks on MogileFS storage nodes can be formatted with your filesystem of choice (ext3, XFS, etc..). MogileFS does its own internal directory hashing so it doesn't hit filesystem limits such as "max files per directory" or "max directories per directory". Use what you're comfortable with.

MogileFS is not:

  • POSIX Compliant -- you don't run regular Unix applications or databases against MogileFS. It's meant for archiving write-once files and doing only sequential reads. (though you can modify a file by way of overwriting it with a new version) Notes:
    • Yes, this means your application has to specifically use a MogileFS client library to store and retrieve files. The steps in general are 1) talk to a tracker about what you want to put or get, 2) read/write to one of the places it told you you could (it'll pick storage node(s) for you as part of its load balancing), using HTTP GET/PUT
    • We've prototyped a FUSE binding, so you could use MogileFS without application support, but it's not production-ready.
  • Completely portable ... yet -- we have some Linux-isms in our code, at least in the HTTP transport code. Our plan is to scrap that and make it portable, though.

Please see one of these Wiki pages for more information:

Need help? Have Questions?

No comments: