digg comments on btrfs
Friday, July 20, 2007
A friend pointed out that a reference to btrfs appeared on digg. I wasn’t sure that it merited much attention but a colleague expressed interest in learning more about btrfs.
I should first set the stage by explaining my relation to btrfs. Chris Mason, its primary developer, is my manager at Oracle. He and I started working on btrfs quite a few months ago. I fell back into a more advisory role after I moved on to work on a related project while Chris continued working diligently on the initial btrfs implementation. While I’m not intimately familiar with the code, I’m pretty familiar with the design trade-offs that it currently makes.
I’ll address some of the honest confusion expressed in the comments to that digg post by translating them into questions that one might ask while not suffering from the effects of John Gabriel’s GIF Theory.
btrfs isn’t considered stable and isn’t supported. That scares me. Why is btrfs available before it is feature-complete and stable?
Once a file system is complete and supported it becomes very hard to work in features that weren’t originally available. Adding new features that require changes to the format of persistent data on disk becomes much, much harder. By making it available at this stage we give people the opportunity to request features that might not have occurred to us. All file systems go through this stage; we’re just exposing it to a wider group of people. One is always welcome to simply ignore btrfs until it’s supported if that’s what one desires.
I live a very busy life and couldn’t be bothered to look at the license that btrfs is released under and instead chose to imply that it wasn’t free and open. Was this not the most clever thing I’ve done recently?
Probably. btrfs is released under the GPLv2, the same license as the Linux kernel.
For whatever reason, I have a negative impression of software that is related to the word Oracle. Should I transfer that negativity to btrfs because it is also associated with the word Oracle?
Probably not. The kernel development team at Oracle that produces btrfs is made up of people who worked on the Linux kernel long before they agreed to come work on the kernel for Oracle. Never fear, we tend to work from home in distant states, countries, and continents — far from the influence of whatever magical anti-awesome sauce it is that you think Oracle puts in its developers’ food.
Oracle also developed OCFS2. Are the two projects related?
Not really, although I worked on OCFS2 for a time. The two file systems solve different problems and their development efforts have different resources at their disposal. OCFS2 is about helping multiple machines work on a shared file system without corrupting each other’s efforts. That’s incredibly difficult. btrfs is about making the best of modern file system features available to the majority of Linux installations for the simple case where there’s only one computer using it. That’s relatively less difficult.
btrfs is a new file system. I also know of another new file system, ZFS. Does btrfs make ZFS unnecessary?
I can think of no way in which a current ZFS user would be satisfied by btrfs, if for no other reason than the simple fact that btrfs is not supported anywhere and ZFS is not seriously available to Linux users. Maybe one could entertain having this conversation once btrfs is supported on Linux and Solaris and ZFS is supported on Linux.
All this talk of ZFS and btrfs reminds me that I once heard that ZFS can be slow, or something. Might that also be said of btrfs?
Yes, inasmuch as that can be said of each and every file system in existence. File system engineering is, at its core, a game of having to choose amongst conflicting desires. It’s often the case that implementing a feature in a particular way will benefit one usage pattern while harming some other usage pattern. btrfs and ZFS, both incorporating design elements more modern than the Reagan administration, will tend to skew the trade-offs in similar directions, most of the time.
There are already lots (and lots) of file systems available for Linux. What does btrfs do that those file systems don’t?
Sometimes it can be hard for those of us who work on file systems to clearly communicate why it is that we dislike existing designs. It’s complicated stuff. There’s one property of current Linux file systems, though, that seems like it should be universally ill-received.
Almost all Linux file systems provide almost no protection against data corruption. The only protection they offer is to propagate errors from the storage system up to the application. If the storage system doesn’t realize that the data has been corrupted, perhaps because the corruption happened after the data left the drive, these file systems can get very confused: returning bad data to applications, overwriting the wrong data on disk, crashing machines, and so on.
Now, storage systems have been surprisingly reliable, it turns out. But Linux thrives on cheap commodity hardware, which is not exactly famous for being rock solid. The persistent march of hardware towards commoditization and cheaper manufacturing does not bode well for the future.
That btrfs takes strong measures to address the risk of corruption is the most exciting run-time feature for me. I want flaky hardware to result in a console message indicating data corruption, not mysterious behaviour or kernel panics that some incredibly expensive human has to diagnose.
I mean, no one would ever consider disabling checksumming in TCP. Why on earth do we allow our file systems to operate without similar protection?
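To make the idea concrete, here is a minimal sketch of checksum-on-write, verify-on-read. It is only an illustration of the general technique, not btrfs code: the in-memory dictionary standing in for a disk, the names write_block, read_block, and ChecksumError, and the use of zlib's CRC32 are all assumptions made for the example and say nothing about btrfs's actual on-disk format or checksum algorithm.

```python
import zlib


class ChecksumError(IOError):
    """Raised when a block's stored checksum does not match its contents."""


def write_block(store, block_no, data):
    # Keep a checksum next to the data (hypothetical layout for illustration).
    store[block_no] = (zlib.crc32(data), bytes(data))


def read_block(store, block_no):
    stored_sum, data = store[block_no]
    # Recompute the checksum on every read; refuse to hand back bad data.
    if zlib.crc32(data) != stored_sum:
        raise ChecksumError(f"checksum mismatch on block {block_no}")
    return data


store = {}
write_block(store, 7, b"important data")

# Simulate silent corruption: one bit flips and the "storage" reports no error.
stored_sum, data = store[7]
store[7] = (stored_sum, bytes([data[0] ^ 0x01]) + data[1:])

try:
    read_block(store, 7)
except ChecksumError as err:
    # The reader gets a clear error instead of quietly corrupted data.
    print(err)
```

Without the check in read_block, the flipped bit would be returned to the application as if nothing had happened, which is exactly the situation described above.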