Working with Drupal and Web File Systems in the Real World

Mike Booth, Senior Cloud Software Engineer at Acquia, on tackling concrete problems, file systems, real world Drupal -- and the value of incremental improvement. Part 1 in a series.

Mike Booth started out in electrical engineering, earning a PhD at Cornell, where he trained in laser physics and semiconductor manufacturing. But for most of his career he’s worked in Web programming.

Booth says he’s tinkered with “every layer of the LAMP stack: AWS provisioning and configuration, Ubuntu package management, Apache, Nginx, Varnish, PHP, Ruby/Passenger, Git and SVN servers, MySQL replication, GlusterFS-based distributed file systems.” He was also a senior member of the team that designed, developed, and launched Acquia Cloud, the AWS-based platform-as-a-service for hosting Drupal-based websites.

Today, Booth’s background in experimental physics continues to influence his approach.

“Experimental physics is a wonderful subject,” he wrote. “You learn to approach theorists with the proper balance of respect and suspicion. You learn about double-stick tape, and when to apply it to a fifty-thousand-dollar laser system. You learn that no procedure or apparatus is too simple to fail. You learn that to make working things, you must practice the art of repairing broken things.”

Below, check out Booth’s thoughts on file systems and other topics. The interview was conducted and edited by DC Denison, who interjects the occasional question or comment under the code name “Q.”

File Systems and Real World Drupal

Drupal is a very pragmatic platform, traditionally. It’s not stuck on academic considerations. Everything that’s there is there because someone put it there, because they needed it in production, or they needed it to do their job. There’s an active dialogue about publishing websites, as opposed to esoteric aspects of software design.

As a file system guy, I’m a consumer of the product. I’m not re-engineering file systems at a deep level, but I have to try and look at them from our viewpoint and our customers’ viewpoint and find solutions to the real problems.

It’s like, “Well, in an imaginary world, I could snap my fingers and everybody could switch to Amazon S3 (Simple Storage Service).” But no. The project has momentum and the site is live now. We have earlier versions of the thing and they’ve all been designed around earlier assumptions. You have to resist that certain philosophical perk that you get way too much of in engineering: the start-up thing that says, “I want to revolutionize file systems.”

The Challenge of Working at Acquia

The most exciting thing about Acquia to me is that we operate at such scale with such important customers. We deal with real problems that are right in front of us. They are not theoretical. Important things are happening. We are working with real live stuff in production.

Engineers often say, “I’m fond of greenfield things.” The tendency is to say, “I want to work on something that has no customers, where there’s a completely green field and I can do whatever I want.”

Q. It’s like wanting to build a new house rather than renovating an existing one.

Yes, but there’s value in working with people where they are. It means that you end up in a world of compromises and you have to negotiate your way through complicated design problems. They’re less comfortable in the world than they are in your head. At the same time, I really value being able to engage real problems, to touch real things.

I can imagine a lot of elegant ways to design a file system. It’s an academic problem for computer scientists, and it can be super fun.

But in the end, there’s also something beautiful about, “Well, I have this concrete problem and I have to solve the concrete problem for concrete customers who are right in front of me.” You have to use your design tools to target real world websites that are happening right now.

Q. Can you give me an example?

Well, the problem of abstracting a file system. I want to be able to plug one or another of various different file system candidates into our Web hosting platform and be able to use them interchangeably. How well can I hide the background details from the higher levels of the architecture? It’s not an abstract problem. It’s not a problem that I can address in an ivory tower where no one is using it.
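To make the idea of a pluggable file system concrete, here is a minimal Python sketch. It is an illustration only, not Acquia's design: the names FileStore, LocalDiskStore, InMemoryStore, and save_upload are all hypothetical, and the in-memory class merely stands in for an object store such as Amazon S3. The point is that the higher-level code (save_upload) talks only to the interface, so backends can be swapped without changing it.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class FileStore(ABC):
    """Hypothetical interface that the higher layers code against."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...


class LocalDiskStore(FileStore):
    """Backend that keeps files under a root directory on local disk."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _path(self, key: str) -> Path:
        return self.root / key

    def write(self, key: str, data: bytes) -> None:
        path = self._path(key)
        # Create intermediate directories so nested keys work.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return self._path(key).read_bytes()

    def exists(self, key: str) -> bool:
        return self._path(key).is_file()


class InMemoryStore(FileStore):
    """Stand-in for an object store: flat keys mapped to byte blobs."""

    def __init__(self) -> None:
        self._objects = {}

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def exists(self, key: str) -> bool:
        return key in self._objects


def save_upload(store: FileStore, name: str, data: bytes) -> str:
    """Higher-level code: it never learns which backend it is using."""
    key = f"uploads/{name}"
    store.write(key, data)
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The same call works against either backend.
        for store in (LocalDiskStore(tmp), InMemoryStore()):
            key = save_upload(store, "logo.png", b"fake image bytes")
            print(type(store).__name__, key, store.exists(key))
```

A sketch like this hides the easy details. The hard part Booth describes is everything the interface leaves out, such as latency, consistency, and permissions, which differ between backends and tend to leak through to the layers above.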

The Value of Incremental Improvement

One of the things that’s nice about software is that if you’re not trying to dominate the headlines, but actually work on it, there’s just a ton of stuff to do. You can really help people out. If we get some new feature working, it will really help developers out with their workflows. If we can improve the stability of something by one percent, that doesn’t sound like an exciting thing, but it’s real engineering.

I studied actual engineering in school, the kind where you build semiconductors. I built semiconductor lasers. And if the laser doesn’t work, you don’t get to graduate.

You end up obsessing about things like cleaning. It’s like, “How do I wash this?” and “Did I touch it with the wrong chemical at the wrong time?” It’s all very controlled, and it’s a part of a huge system.

Somewhere, billions of people are carrying phone parts that benefitted from my earlier work. The same sort of thing applies with the work I do at Acquia. It’s like, “Oh, I’ll make this change in the file system and thousands of websites run by companies that employ millions of people are going to enjoy better efficiency. They are going to load better.” The statistics on how Web performance can improve your business are very compelling.

Most of the world works that way. Real improvements happen in tiny pieces, but the pieces pile up. Anything you can improve, even if it seems like a little thing in the corner, has a beautiful downstream effect that is sort of inspiring.

Coming next, in Part 2: the backstory on Drupal's file system, the advantages of Gluster, object storage systems, and "the big gorilla in the room" (Amazon S3).