December 15, 2014

Dec 15: Putting the fast in FastMail: Loading your mailbox quickly

Historical

Rob Mueller

Founder & CTO

This blog post is part of the FastMail 2014 Advent Calendar.

The previous post, on 14 December, was about our 24/7 monitoring and paging system. The following post, on 16 December, addresses confidentiality and where our servers are hosted.

Technical level: medium

From the moment you log in to your account (or go to the FastMail website with an already logged-in session), we want to load your mailbox as fast as possible. To make that possible, we’ve implemented many optimisations and features, both small and large.

Use fast, reliable and consistent hardware

As we talked about in our email servers post, part of achieving fast performance is the hardware you choose.

In the case of email, the most important thing is a fast disk system that can sustain lots of input/output (read/write) operations per second (IOPS). It’s worth noting that there is a massive difference between consumer SSDs and enterprise SSDs. Intel rate their enterprise 400 GB DC S3700 as being able to handle 7.30 PB (petabytes) of data written. For comparison, a fairly good quality consumer drive like the Samsung 850 EVO is rated at 150 TB (terabytes), which is about 1/50th of the write capacity rating. When you’re running a server 24/7 with thousands of reads and writes per second, every second, every day, every month, for years, you need very high reliability.
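
As a quick check of the comparison above, the two endurance ratings can be put in the same unit and divided. The daily write volume below is a purely hypothetical figure chosen for illustration, not a measurement from our servers.

```python
# Endurance ratings quoted in the text, in terabytes written (decimal units).
PB_IN_TB = 1000
enterprise_tb = 7.30 * PB_IN_TB   # 400 GB enterprise drive: 7.30 PB
consumer_tb = 150                 # consumer drive: 150 TB

ratio = enterprise_tb / consumer_tb

# ASSUMPTION: an illustrative sustained write load, not a real figure.
HYPOTHETICAL_TB_PER_DAY = 0.5


def years_until_rated_limit(rated_tb, tb_per_day=HYPOTHETICAL_TB_PER_DAY):
    """Years of continuous writing before the rated endurance is reached."""
    return rated_tb / tb_per_day / 365


print(f"enterprise/consumer endurance ratio: {ratio:.1f}")
print(f"consumer drive:   {years_until_rated_limit(consumer_tb):.1f} years")
print(f"enterprise drive: {years_until_rated_limit(enterprise_tb):.1f} years")
```

The ratio works out to a little under 49, which is where the "about 1/50th" figure comes from.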

As well as high data reliability, enterprise drives also retain their IOPS rating for much longer. Many consumer drives advertise a very high IOPS rate. However, that rate is only achieved while the drive is new (or has been hard erased). After you start writing data to the drive, the IOPS rate often drops off significantly. An enterprise drive like the DC S3700 might not have as high an IOPS rating on paper, but it is much more consistent throughout its lifetime.

To make the most of this hardware, it’s important to split your data correctly. The most commonly accessed data needs to be on the fast SSDs, while the large archive of emails from last week, month or year, which most people rarely access, needs to be stored in a way that keeps the cost down.

Start pre-loading data as soon as possible

When you submit the login page and we authenticate you, we immediately send a special command down to the IMAP server to start pre-loading your email data (under the hood, this uses the posix_fadvise system call to start loading message indexes and other databases from disk). We then return a redirect response to your web browser, which directs your browser to start loading the web application.

This means that during the time it takes to return the redirect response to your browser, and while the browser starts loading the web application, in the background we’re already starting to load your email data from disk into memory.
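
The pre-loading hint described above can be sketched roughly as follows. This is a minimal illustration, not our server code: the function names and the idea of passing in a list of file paths are assumptions, and only `os.posix_fadvise` with `POSIX_FADV_WILLNEED` corresponds to the system call mentioned in the text.

```python
import os


def prefetch(path: str) -> bool:
    """Ask the kernel to start reading `path` into the page cache.

    Returns True if the hint was issued, False if it could not be.
    """
    # os.posix_fadvise only exists on some Unix platforms (e.g. Linux).
    if not hasattr(os, "posix_fadvise"):
        return False
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False
    try:
        # offset=0 and length=0 cover the whole file. WILLNEED asks for
        # readahead to begin and returns without waiting for it to finish.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def preload_mailbox(paths):
    """Hypothetical helper: hint every index/database file for one user."""
    return [p for p in paths if prefetch(p)]
```

Because the call is only a hint, it costs almost nothing at login time; the actual disk reads happen in the background while the browser is still following the redirect.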

Minimise, compress and cache the web application

To make loading the web application as quick as possible, we employ a number of techniques.

Minimisation: this is the automated rewriting of CSS and JavaScript to use shorter variable names and strip comments and whitespace, producing code that is smaller (and so quicker to download) but works exactly the same. This is standard practice, and we use uglifyjs on all our code.

Compression: this is again standard practice. We use gzip to pre-compress all files, and the nginx gzip_static module lets us store the compressed files on disk and serve them directly when a browser supports gzip (almost all of them do). This means we can immediately serve the file rather than having to use CPU time to compress it for each download. It also means we can use the maximum gzip compression level to squeeze out every extra byte, as we only need to do the compression once rather than many times.

Concatenation: we join together all code and data files for a module into a single download. This includes not just JavaScript code, but also CSS styles, icons, fonts and other information encoded into the JavaScript itself. This significantly reduces the number of separate connections (each of which requires an SSL/TLS negotiation, TCP window scale-up to reach maximum speed, etc.) and downloads, and allows as much data as possible to be streamed in one go.

Modularisation: we build our code into several core code blocks. There is one main block for the application framework (something to talk about in another post), and one block for each part of the application (e.g. mailbox, contacts, calendar, settings, etc.). So loading the mailbox application requires downloading only a few files: the bootstrap loader, the application framework, the mailbox application and some localisation data. No separate downloads of styles, images, fonts or other content are needed.

Caching: the name of each downloaded file includes a hash of its content, and we store the content of each file in the browser’s local storage area. This makes it easy to see whether we already have a particular application file downloaded. If we do, we don’t even have to request it from the server; we can load it directly from the browser’s local storage.
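
The compression and caching steps above can be sketched together as a small build-time helper. This is an illustration under stated assumptions rather than our build system: the choice of SHA-256, the 12-character digest, the `name-hash.ext` filename layout and the dictionary standing in for the browser’s local storage are all hypothetical.

```python
import gzip
import hashlib


def hashed_name(filename: str, content: bytes) -> str:
    """Embed a hash of the content in the filename (assumed naming scheme)."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = filename.rpartition(".")
    return f"{stem}-{digest}.{ext}" if dot else f"{filename}-{digest}"


def precompress(content: bytes) -> bytes:
    """Compress once at build time, so the maximum gzip level is affordable."""
    return gzip.compress(content, compresslevel=9)


def needs_download(name: str, local_cache: dict) -> bool:
    """A changed file gets a new name, so a name lookup is a freshness check."""
    return name not in local_cache


if __name__ == "__main__":
    source = b"function hello() { return 'hi'; }\n" * 50
    name = hashed_name("mailbox.js", source)
    compressed = precompress(source)
    print(name, len(source), "->", len(compressed), "bytes")
```

Writing the compressed bytes next to the original as a `.gz` file is what allows nginx’s gzip_static module to serve them without compressing on each request.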

Multiple API calls in each request

The API the client uses to talk to the server is based on a JSON protocol, rather than being an HTTP-based, RESTful-style API.

The advantage of this is that we can send multiple commands in a single call to the server. This avoids multiple round trips back and forth to request data, and round trips slow things down significantly on high-latency connections (e.g. mobile devices, people a long way from the server, etc.).

For instance, our web client requests the user’s preferences, a list of user personalities, a list of user mailbox folders, all mailbox folder message/conversation counts, a list of user saved searches, the initial Inbox mailbox message/conversation list, and the user’s overall mailbox quota state, all in a single call to the server. The server can gather all this data together in one go, and return it to the client in a single result.
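
To make the batching idea concrete, here is a minimal sketch of a server-side dispatcher that accepts several commands in one JSON request and returns all the results in one JSON response. The method names, the request shape and the canned return values are invented for illustration and do not describe our actual wire format.

```python
import json

# ASSUMPTION: hypothetical method names with canned results.
HANDLERS = {
    "getPreferences": lambda args: {"theme": "default"},
    "getMailboxes": lambda args: [{"name": "Inbox", "unread": 3}],
    "getQuota": lambda args: {"used": 120, "limit": 1000},
}


def handle_batch(request_body: str) -> str:
    """Run every call in the request and return the results in order."""
    calls = json.loads(request_body)
    results = []
    for call in calls:
        method = call["method"]
        args = call.get("args", {})
        handler = HANDLERS.get(method)
        if handler is None:
            # One bad call does not prevent the others from being answered.
            results.append({"method": method, "error": "unknownMethod"})
        else:
            results.append({"method": method, "result": handler(args)})
    return json.dumps(results)


if __name__ == "__main__":
    request = json.dumps([
        {"method": "getPreferences"},
        {"method": "getMailboxes"},
        {"method": "getQuota"},
    ])
    print(handle_batch(request))
```

With this shape, the client pays the network latency once for the whole batch instead of once per command.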

Pre-calculate as much as possible

Certain pieces of data can be expensive to calculate. One significant example of this is message previews. Message metadata and headers, such as the From address and Subject, are stored in a separate “cache” file in Cyrus (our IMAP server) and are thus quicker to access, but each message body is stored in its own separate file. This means that generating the previews for 20 messages would normally involve opening 20 separate files, then reading and parsing out the right parts of each.

To avoid this, we calculate a preview of each message when it’s delivered to the mailbox using an annotator process, and store that in a separate annotation database.

So fetching the preview for 20 messages now only requires reading from a single database file, and we can start pre-loading that file immediately after login as well.
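
The trade-off can be sketched as follows: compute the preview once when a message is delivered, then serve a whole page of previews from a single store. This is only an illustration. SQLite stands in for the Cyrus annotation database, and the class name, table layout and 100-character preview length are assumptions.

```python
import sqlite3

PREVIEW_LENGTH = 100  # ASSUMPTION: illustrative preview size


def make_preview(body: str, length: int = PREVIEW_LENGTH) -> str:
    """Collapse whitespace and keep the first `length` characters."""
    return " ".join(body.split())[:length]


class PreviewStore:
    """Stand-in for a single annotation database holding all previews."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS previews "
            "(uid INTEGER PRIMARY KEY, preview TEXT)"
        )

    def on_delivery(self, uid: int, body: str) -> None:
        """Do the expensive parsing once, at delivery time."""
        self.db.execute(
            "INSERT OR REPLACE INTO previews VALUES (?, ?)",
            (uid, make_preview(body)),
        )
        self.db.commit()

    def fetch(self, uids):
        """Read many previews from one database instead of many files."""
        uids = list(uids)
        if not uids:
            return {}
        marks = ",".join("?" * len(uids))
        rows = self.db.execute(
            f"SELECT uid, preview FROM previews WHERE uid IN ({marks})", uids
        )
        return dict(rows)
```

Fetching previews for a 20-message list is then one call, `store.fetch(range(1, 21))`, against one file that can itself be pre-loaded at login.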

There’s no single thing that makes FastMail fast. It’s a combination of features, and we’re always working to tune those features further to make the FastMail experience as fast as possible. When you call yourself FastMail, you’re signalling one of your core features in the name, and that means we have to think about speed in everything we do.