Let's build! A distributed, concurrent editor: Part 1 - Introduction


The sun is bright and the sky’s blue when I get to the office in London. As the lift doors ping open and I step out, a pair of heads near my desk swivel and look towards me. A sense of shame and embarrassment immediately forms in the pit of my stomach as I spot Alice and Bob sat waiting for me. I’d forgotten the meeting with them this morning that I’m late for. Alice shoots a death-stare my way, and Bob looks pointedly at his watch. I hurry over and mutter profuse apologies: I didn’t sleep well due to worry over a board meeting later today. Quickly getting down to business, I ask: “How can I help?”

Alice and Bob want to write a document together. But they don’t want to be sat next to each other to do this - they want to be able to work apart, but on the same document. They’ve tried emailing around documents but that always gets into a mess - they both now have several files called the_document_version_3 and they really don’t know which is the actual version 3. So they want something where they can see instantly the changes that each other is making and both make changes at the same time.

I briefly interrupt to query this use of the word “instantly”:

“If you’re involving computers and networks, then it’s very likely it’s going to take tens of milliseconds to relay changes from one to the other, and probably hundreds of milliseconds if you’re geographically far apart.”

They think this is probably OK: if this is as fast as it can be then it’ll have to do, and those numbers sound small enough that it’s unlikely to be too detrimental to the interactive editing experience.

Digging into this a bit further, I present them with a scenario:

“Let’s say”, says I, “that you’re both looking at the document on your own computers, and you each have your cursor at the same place in the document. At the exact same moment, one of you types x and the other types y. What should happen?”

The discussion bounces around for a while - it’s clear there is no one correct answer. In the end we agree it most likely doesn’t matter too much, provided the following is true:

  • Either the x or the y or both should be added to the document at the cursor position (and nothing else!), and if both, any order is acceptable.
  • Once Alice’s computer and Bob’s computer have received updates from each other, they should both see the exact same document.
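To make those two requirements concrete, here is a minimal Python sketch of one way they could be met. It is an illustration only, not necessarily the approach this series ends up taking: the `Insert` record, the `apply_insert` and `transform` helpers, and the use of a per-author `site` name to break ties are all assumptions made up for this example. Each computer applies its own keystroke first, then shifts the other person's concurrent insert if needed before applying it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document at which the character is inserted
    char: str   # the single character that was typed
    site: str   # hypothetical author name, used only to break ties


def apply_insert(doc: str, op: Insert) -> str:
    """Return the document with op's character inserted at op.pos."""
    return doc[:op.pos] + op.char + doc[op.pos:]


def transform(op: Insert, applied: Insert) -> Insert:
    """Adjust op so it can be applied after the concurrent insert `applied`.

    If `applied` landed before op's position, or at the same position with
    a smaller site name, op is shifted one place to the right.
    """
    if applied.pos < op.pos or (applied.pos == op.pos and applied.site < op.site):
        return Insert(op.pos + 1, op.char, op.site)
    return op


base = "ab"
alice_op = Insert(1, "x", "alice")  # Alice types x at the shared cursor
bob_op = Insert(1, "y", "bob")      # Bob types y at the same cursor, same moment

# Each side applies its own edit immediately, then the other's once it arrives.
alice_doc = apply_insert(apply_insert(base, alice_op), transform(bob_op, alice_op))
bob_doc = apply_insert(apply_insert(base, bob_op), transform(alice_op, bob_op))

print(alice_doc, bob_doc)
```

Both the x and the y end up at the cursor position with nothing else added, and both computers agree on the order even though they applied the edits in opposite sequences. The tie-break rule is arbitrary: any rule works as long as both sides use the same one.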

They point out to me that they’re colleagues, and want to collaborate. Their nature is that if they see one of them is editing a particular word or sentence, the other is likely to wait to see what changes are made, rather than wade in with fists full of fury and hammer on the keyboard without thought to what the other person is doing. So to them, the scenario I posed isn’t likely to be that common. I smile internally: I rather suspect they’ve never tried to work with Eve, down in security.

By now, my computer has booted up and I do some quick searching online. It turns out that Google Docs has now been sunsetted, and Dropbox Paper was never invented, nor several other tools you might be thinking of. I realise I’m going to have to build this myself.

I query them on the notion of saving the document - earlier in the discussion they’d mentioned emailing documents around so I want to know how and where their document should be saved.

“Obviously we want it to be saved in the cloud”, Alice asserts. They look at me like I’m from the past. Right. So the cloud exists, but Dropbox doesn’t. Strange world, but fine, whatever! I half nod, half shrug, and suggest:

“This means there will need to be some sort of server which will provide persistent storage for the document.”

“That sounds fine”, says Bob, not caring one iota. “Also, it should save instantly. Like, every key-press. We should never have to think about saving.”

I wonder what else they might want: the 2nd moon on a stick perhaps? Thankfully the words never leave my mouth. My thoughts turn to disconnected operation and the impact of network disconnects.

“Erm, are you both planning on working on this document when you’re traveling? On trains, aeroplanes - stuff like that?”

“Absolutely”. They both nod vigorously.

“OK, so if you’re on a plane then you can’t be connected to a network, and if you’re on a train and you go through the countryside or a tunnel or something, you may lose connectivity.”

Blank stares.

“So”, I continue, “that means if you’re both typing and you don’t have connectivity, there is no way I can relay changes between you both, nor store your changes on the server.”

This foxes them a little and a fair amount of discussion and debate follows once again. I want to push them towards an “if you’re not connected then you can’t edit the document at all” world, but they don’t like that - they definitely want to be able to edit the document when on long flights. Looking at my watch I realise we’re fast running out of time for this meeting, but I was late so let’s overrun for a bit if necessary.

“How about you can edit it, and the edits show up on your computer immediately, and once you reconnect to the network it will try to sync with the server?” They’re both looking hopefully at me; so far so good. “But, there’s a chance that if the other one of you has edited the same part of the document in the meantime then you might have a conflict and maybe some changes get lost?” The hopeful expressions seem diminished.

“I think we can work with that”, Alice suggests cautiously. “If I’m going to do a large amount of work on a section of the document on a flight, I’ll just have to let Bob know my plans and he’ll know not to touch that area until I say I’m done.” Bob looks less convinced but I guess my face is putting out several expressions too, so he takes the diplomatic route: “How about we start with this, see how it goes, and if we don’t like it then we revisit this issue later?” We all nod in agreement, though as is often the way, with different agendas internally.

“How do you want to edit this document?” I ask. “Is a browser-based UI OK?” I regret the words before they’ve even formed in my mouth. Why did I just say that? There is nothing that drives me up the wall more than trying to program browsers. Horrible misdesigned things.

“Yeah sure, that’d be great” says Bob. He doesn’t care. Nor does Alice. No one actually wants to use a browser. They just want to write their document. But I’ve gone and said it now, like the idiot I am.

They begin to get up to leave. “Oh yes, one other thing,” Alice starts. Somehow I already know I’m being ambushed. Alice looks at Bob: “Oh right! Yes! OK, so you know how we said earlier that we’d been emailing around Word documents?”

“Well I don’t think you mentioned they were Word documents, but sure, go on.”

“Well, it’s really useful that the undo (and redo) history is saved inside the document.”

“Exactly!” Alice joins back in. “We really want the undo/redo history to be part of the document and saved too. Thanks! Great meeting!”

“Try not to be late next time. Bye!” Nice rejoinder there from Bob. Not going to let my tardiness go then is he? And like that, they were gone.

Undo and redo. In a distributed editor. Have they even thought about how they want that to work? Oh well, no time to dwell on that. Quick bathroom break and then this board meeting. At least I’ve got something to tell the board now.

I give it my best shot. I know the details are a bit rough, and the slides were hurried. Also more than that, there’s just something missing from the plan. But still, I try and deliver the notion of this editor for Alice and Bob and their document to the board.

I finish my presentation. Silence descends for 30 seconds. And then almost in perfect synchronicity they all get out their sharpened pencils and pens (presumably sharp, but not sharpened) and start writing out their reasons for firing me. I panic. I make mistakes no one’s ever made before. I rapidly embellish my plans, promising technology that could never be delivered in the time frame available, all just to placate the board and save my skin.

“Wait! I forgot to mention! It’s not just one document, it’s any number of documents!”

Most pens are still moving, but a pencil or two is stationery.

“Oh! And, erm, erm, it’s not just Alice and Bob, oh no! It’s many many users. Like however many you like. All editing their own documents, in their own groups and teams and stuff!”

A stay of execution may be in reach.

“So like, we could have millions of users and billions of documents, and once we’ve been that disruptive, and have that much data and that many users, we’ll definitely be able to make money from them all and stuff!”

Utter silence. You could hear a mouse fart.

Suddenly, jubilation! Back-slapping, high-fiving, a case of champagne is opened, wild cheering breaks out. I blush, I thank them for their input and for really driving me the extra mile. They assure me they’ve all got detailed notes of everything I just rashly promised and will be cutting my head off if I fail to deliver. “Well that’s only fair!” I joke with unwise joviality. Several of them give me a look that I’ll remember to my dying day. I bid them farewell, until next time, and so forth, and retreat from the room. I spend 20 minutes vomiting in the toilets. What have I just done?

Let’s build!

So, with the scene set, let’s build this thing! I enjoy building distributed systems: I like the challenge. It’s also something that I think needs practice. It’s an area where I think it really helps to understand a bit of theory, and to practise thinking about all the different orders in which events can occur.

I’m planning to write several articles on this project over the next few weeks and months. Initially, I’ll be talking a bit about theory and how that informs our protocol design between the browser and the server (and in fact how the undo/redo feature ends up meaning simple HTTP REST calls are not the most appropriate route to take). I’ll cover how we store this document, including its undos and redos (i.e. its editing history). There are trade-offs everywhere: so many different ways you could make this work, none of them perfect, all a little ugly in places, but several of them more than good enough. I’ll show bits of the code (which will be open source - at the time of writing this, the code is complete and working, but I’m yet to write tests). And finally, I’ll cover testing this. I will write end-to-end soak tests, and hopefully fuzz tests too. Yup, I want to fuzz test a distributed concurrent system.

Oh yes, and hopefully, what I build will work for any data structure. Not just a document. Should be fun!

Part 2 - Protocols