Dec 15: Platform Team working agreement

Profile picture for Bron Gondwana

CEO

This is the fifteenth post in the Fastmail Advent 2024 series. The previous post was Dec 14: On-call systems. The next post is Dec 16: Offline support now in public beta.

Another Sunday, another document from our internal collection!

Over the past couple of years, we’ve been taking the bits that we like from scrum, agile, and all the other buzzword methodologies that are out there. One of the best things that all the good frameworks do is give a space for setting behaviours for ourselves and each other within a team, so you know what to expect from your colleagues, and also what they expect of you!

I posted about our Mission Statement and Guiding Principles already; now we narrow down to look at a single team. Our platform team maintains the infrastructure that the Fastmail product runs on top of.

We started from here:

Principles and duties

We are lifeguards:

  • We watch out for upcoming risks and for systems in distress.
  • Part of our day is keeping a watchful eye and scanning for strangeness and hints of something going down.

We are first-aiders:

  • When something goes wrong, our first principle is to keep the patient alive.
  • For Fastmail, the patient is the data stored on our systems: the data we store on behalf of others. This is a higher priority than uptime.

We are paramedics:

  • We do more than just keep the patient alive! We perform surgery and repairs to the level of our ability.
  • For more complex things, we stabilise and bring things to the specialists in that system to make more permanent repairs.
  • We get the service back up for as many as possible, as quickly as possible, without compromising data.

We are specialists:

  • For the basics of first aid and early response, we all need to become experts - and we all need to be vigilant.
  • But we all have our areas of specialty, and it’s OK to lean into that!

The duties of a platformer:

  • Monitoring - looking at the alert emails, the metrics, etc., and making sure things are nominal.
  • First-aid first - if the platform is in crisis, drop everything and fix it. This is the on-call duty; we all do this.
  • Janitorial - mop the floors, oil the joints, replace worn parts. We all do this, on rotation and as we see things. We keep the ship ship-shape.
  • Initiatives - look into possible improvements, experiment with possibilities, and build the future. Everyone should have an initiative they are leading or collaborating on. These make progress in the time we’re not doing first-aid or janitorial work.

A platform team shouldn’t be particularly busy. The first-aid and janitorial work should not be a large chunk of your day most of the time.

Everyone on the platform team is an experienced and senior Platform Engineer, though most of the current team are quite new to Fastmail, which is why I’m also embedded with them as the crusty old expert who knows where many of the bodies are buried!

Since the team is split between the USA and Australia, we only spend an hour of dedicated synchronous time per week. So we brought everyone together in Melbourne in November, and came up with the following schedule and working agreement:

Over each 4 week period, this includes two sets of task list gardening, two technical deep-dive sessions, and a retrospective where we can change how we operate as a team! One member of the team runs all the ceremonies for a 4 week period, and hands off after the retrospective.

Working Agreement

We value action and forward progress, and take ownership of tasks.

We learn from each other’s successes and mistakes, and use good processes to protect ourselves from human error, always asking how we can be better.

We give and seek feedback, and encourage asking questions (no dumb questions).

We generate discussion and directional buy-in, and make time per site for mobbing/pairing and getting clarity on solutions.

We celebrate our successes and show them off to the rest of the company.

To resolve conflict we:

  • Check with the customer (the Fastmail product team) - what do they need?
  • Check with an area expert
  • Use experiments

To agree things we:

  • Make solo calls if comfortable; or
  • Bring it ‘To Discuss’ - the fortnightly deep-dive meeting