Programming with async / await

After 15 years of building back-end and middleware server-side software, I currently find myself writing a phone app using Dart and Flutter. I’m a little surprised, but there’s plenty to learn, and some of it is quite interesting. I’ve previously built some fairly complex websites using React with TypeScript, in the single-page-app style. That ecosystem is let down by both the terrible foundation that is JavaScript, and the unnecessarily dreadful and complex tool chains. So although I’m sure there are very important differences between React and React Native which I don’t know about, I nevertheless decided to try out the Dart/Flutter combo this time. Even so, Dart and JavaScript share a very similar concurrency model (async/await), which has also recently been added to Rust.

After spending nearly 8 years working mainly in Go, learning Dart has been interesting. Syntactically it’s much noisier. Lots of extra complexity, keywords, and slightly questionable features. More parentheses, brackets and semicolons than in Go. More operators to learn. It’s made me really appreciate how sparse and simple Go is (absolutely, there are warts in Go: the nil/non-nil interface issue, and some subtle semantics around slices (amongst others) can certainly be foot-guns and detract from the simplicity of the language). The Go code formatter is much better than Dart’s; for one thing, it allows much more flexibility over blank lines than Dart, which I find essential for writing comprehensible code. That said, I’m very well aware this is just me having to get used to different ideas, and different design decisions. Without doubt, Dart is a much better language than JavaScript, and the tooling works without having to sacrifice a whole farmyard of goats first.

There are some absolutely magical things. I can put my phone in developer mode, connect it to my computer, and type flutter run. The app gets compiled on my computer, installed on my phone and started, and then I can use my browser on my computer to debug the app running on my phone, setting breakpoints and all. I’m sure to people who do this every day this is nothing exciting, and I’ve no doubt that React Native can do all this same stuff (and other phone-app-ecosystems too). But to me, it does seem utterly magical; my mind boggles at the millions of hours of effort that must have gone into making this work. Hot reloads/restarts all work too. So the developer experience is all pretty good: the tooling seems to be reliable and extensive.

Entering the rabbit hole

Because I didn’t start by reading the language spec, this section is going to feature me making a fair few assumptions, and being wrong. I will point them out as we go!

As an aside, although I do enjoy beautifully presented documents typeset with LaTeX, I feel that a language spec should not be a PDF generated from LaTeX. Especially when the language tour is rather poor. For example, as I’ll soon cover, I wanted to check to see if logical operators are short-circuiting. In the relevant section of the HTML Go language spec, it states:

The right operand is evaluated conditionally.

These six words are enough to make the semantics clear. It’s fast to find the text and fairly simple to understand: yes, perhaps it’s a little terse, but it’s pretty good. The relevant section of the Dart language tour says nothing useful on the subject. In the relevant section (which I can’t link to because it’s a sodding PDF document) of the Dart language spec, it says:

Evaluation of a logical boolean expression b of the form e1||e2 causes the evaluation of e1 to an object o1. It is a dynamic error if the run-time type of o1 is not bool. If o1 is true, the result of evaluating b is true, otherwise e2 is evaluated to an object o2. It is a dynamic error if the run-time type of o2 is not bool. Otherwise the result of evaluating b is o2.

Evaluation of a logical boolean expression b of the form e1&&e2 causes the evaluation of e1 producing an object o1. It is a dynamic error if the run-time type of o1 is not bool. If o1 is false, the result of evaluating b is false, otherwise e2 is evaluated to an object o2. It is a dynamic error if the run-time type of o2 is not bool. Otherwise the result of evaluating b is o2.

Yes, it covers more details about the requirements on evaluating the arguments. And it implies that the operators are short-circuiting (although in both cases it does not state the conditions under which the right-hand-side is not evaluated; it could be that the right-hand-side is always evaluated and the result is thrown away when not needed). But I find it much harder to comprehend.
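When the prose of a spec leaves room for doubt like this, a quick experiment can settle it. The following is a minimal sketch, not anything taken from the Dart docs: the helper operand and the list evaluated are names I have made up purely to record which operands actually get evaluated.

```dart
// Records the label of every operand that actually gets evaluated.
// (Both `evaluated` and `operand` are hypothetical helpers for this sketch.)
final evaluated = <String>[];

bool operand(String label, bool value) {
  evaluated.add(label);
  return value;
}

void main() {
  // If && and || short-circuit, neither right-hand operand should run here.
  final andResult = operand('and-left', false) && operand('and-right', true);
  final orResult = operand('or-left', true) || operand('or-right', false);

  print(andResult); // false
  print(orResult); // true
  print(evaluated); // [and-left, or-left]
}
```

Neither right-hand label shows up in the list, which is what short-circuiting predicts.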

English is a dreadful language to specify the semantics of a language (for example, in the above quoted text, I do not feel that the word “otherwise” is an exclusive-or / if-and-only-if). But it’s incredibly important that your users understand the semantics of your language if you want them to trust the behaviour of the compiler/interpreter and be able to build programs. So putting huge effort into your use of English to describe semantics is really very important, and pays off.

Whilst I’m having a minor rant, the fact that the Dart code labs (e.g. async-await) only work if you allow 3rd-party cookies is pretty offensive. It’s 2022. People block 3rd-party cookies. Sort yourselves out!

Anyway, on to some code.

class MyClass {
  MyClass(this.name); // This is a constructor, and it sets the field name.
  final String? name; // The field. The ? means it can be null.

  void myMethod() {
    if (name != null && name.isNotEmpty) {
      // some code
    }
  }
}

Running dart analyze on this code gives an error:

The property 'name' can't be unconditionally accessed because the receiver can be 'null'.

It’s complaining about the name.isNotEmpty code on the right-hand-side of the &&. Because I’m often an idiot, I didn’t follow the link that is printed alongside the error message. Having now done so, it’s not particularly helpful: it suggests how to fix it, but it doesn’t explain why this is an error. To find that out, you have to follow another link to this page.

So the first thought that went through my head is: “huh, is && not short-circuiting?”. I.e. I’m wondering if it’s evaluating the right-hand-side even if the left-hand-side has given false. Hence the journey of trying to find out if Dart’s logical operators are short-circuiting or not. Eventually, I find that they are, as normal, short-circuiting. I also spot that if I change the code as follows then the error goes away:

class MyClass {
  MyClass(this.name);
  final String? name;

  void myMethod() {
    final nameCopy = name;
    if (nameCopy != null && nameCopy.isNotEmpty) {
      // some code
    }
  }
}

The fact that this is now OK suggests that && is definitely short-circuiting: if it wasn’t, then this code should still generate the same error. So next my head turns to concurrency: in the original code, there are (potentially) two reads of the field name. Could the field’s value change between these reads? What exactly are the semantics of this async/await concurrency model?

As far as I can tell, in this particular case, no, concurrency cannot be responsible for the field name changing its value between the two reads. Slightly different code can, as I shall show in a moment. In this particular case though, the real danger is this:

class MyClass {
  MyClass(this.name);
  final String? name;

  void myMethod() {
    if (name != null && name.isNotEmpty) {
      // some code
    }
  }
}

class MySubClass extends MyClass {
  MySubClass() : super('');
  bool _got = false;

  @override
  String? get name {
    if (_got) {
      return null;
    } else {
      _got = true;
      return 'a real string';
    }
  }
}

void main() {
  MySubClass().myMethod();
}

That subclass still has the method myMethod, and when it reads name, it’s now calling the programmatic getter name in the subclass, which has pathological semantics: the second read of name, on the right-hand-side of the && now really will get a different value back. I’m pretty sure at this point that programmatic getters and setters are a misfeature that add only confusion. Cleverness never leads to readable code (he says confidently, having lost count of the number of times he’s written clever code).
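To see both halves of this at run time, here is a small sketch under some assumptions of my own: the class names Person and FlakyPerson and the helper describe are hypothetical, and I have used the ! operator to force the two-read version past the analyzer, which is exactly the kind of promise the analyzer was refusing to make on my behalf.

```dart
class Person {
  Person(this.name);
  final String? name;

  // Two separate reads of `name`. The `!` silences the analyzer, but it is
  // only a promise that the second read will not return null.
  bool hasNameUnsafe() => name != null && name!.isNotEmpty;

  // One read, copied into a local variable which the analyzer can promote
  // from String? to String.
  bool hasNameSafe() {
    final n = name;
    return n != null && n.isNotEmpty;
  }
}

// Hypothetical subclass with the same pathological getter as above.
class FlakyPerson extends Person {
  FlakyPerson() : super('');
  bool _got = false;

  @override
  String? get name {
    if (_got) return null;
    _got = true;
    return 'a real string';
  }
}

// Runs a check and reports whether it returned or threw. Catching everything
// like this is for demonstration only.
String describe(bool Function() check) {
  try {
    return 'returned ${check()}';
  } catch (_) {
    return 'threw';
  }
}

void main() {
  print(describe(FlakyPerson().hasNameUnsafe)); // threw
  print(describe(FlakyPerson().hasNameSafe)); // returned true
  print(describe(Person('x').hasNameUnsafe)); // returned true
}
```

The local copy is what makes the difference: it is read exactly once, so no getter can change the answer between the null check and the use.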

Going deeper underground

A while ago, I read the red-blue blog post on callback-hell. Despite generally not liking the futures/callbacks model, the author does accept that async/await makes the situation better than bare futures or callbacks. I personally don’t find the syntax terrible, and I’ve not yet suffered from an “oh dear, this library method I’m relying on has changed from non-async to async and I now need to adjust this huge call chain” situation. What I’ve not read a lot about before though, is the semantics of this concurrency model: how do you use it safely?

For Dart, there is this code lab and this section of the language guide. But I do wish certain details were given a lot more prominence. For example, these two sentences seem to me like something that’s kinda important to know about:

An async function runs synchronously until the first await keyword. This means that within an async function body, all synchronous code before the first await keyword executes immediately.

It would be perfectly understandable (to me!) to believe that as soon as you call an async method, you immediately get a future returned, and all of the code of the async method gets put in a new task on the event queue.
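The behaviour those two quoted sentences describe is easy to check. This is a minimal sketch with names of my own choosing (work, runDemo): it logs on either side of an await and lets the caller log in between.

```dart
// An async function that logs before and after its first await.
Future<void> work(List<String> log) async {
  log.add('work: before await'); // runs synchronously, inside the call
  await Future<void>.value(); // first await: the rest is deferred
  log.add('work: after await');
}

Future<List<String>> runDemo() async {
  final log = <String>[];
  final pending = work(log); // returns once work() hits its first await
  log.add('caller: work() has returned');
  await pending;
  return log;
}

Future<void> main() async {
  print(await runDemo());
  // [work: before await, caller: work() has returned, work: after await]
}
```

So the call does not simply hand back a future and queue everything: the part before the first await has already happened by the time the caller gets control back.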

For some reason, in my head, I had the impression that in single-threaded async/await languages like Dart or JavaScript (yes, ignoring Isolates and Web-workers because of the limitations on what you can share between them), you don’t need to use locks or worry about mutual exclusion. It kinda makes sense right? (I refer back to my earlier admission of being an idiot from time to time). Whether I’ve misunderstood, or made my own assumptions, or been misled, I just don’t know. It’s wrong: you absolutely need to worry about mutual exclusion.

If you’re old enough to remember CPUs before the age of SMT, or Hyper-Threading, or multi-core, well it was a very similar situation back then: sure, the CPU could only do one thing at a time (ostensibly), but you could have multiple threads in your program, and because of preemption, you absolutely needed to worry about mutual exclusion: at any point, the OS could decide to stop running one of your threads and start running a different one. A particularly good moment for an OS to do that would be when your running thread starts blocking on I/O, waiting for data to be sent or received. Sounds familiar right?

So every time in my program I’m calling an async method, I’m doing the equivalent of launching a new thread that is being run on a single-core CPU. Every time my program calls await on a future, it’s yielding this CPU and allowing a different thread to run. It’s kinda like a really weird form of cooperative multi-tasking.

Getting back to some code, consider this:

class MyClass {
  MyClass(this.name);
  String? name;

  void myMethod() {
    if (test()) {
      print('Condition was true. $name');
    } else {
      print('Condition was false. $name');
    }
  }

  bool test() {
    reset();
    return name != null;
  }

  void reset() async {
    name = null;
  }
}

void main() {
  MyClass('foo').myMethod();
}

If I run this code, it prints Condition was false. null. Because of those two sentences I quoted above, and because there is no use of await, reset() will be run synchronously (even though it’s marked as async), and so null will get assigned to name before test() returns false and the else-block is entered. But let’s change reset() to do an await before the assignment:

  void reset() async {
    await () async {}();
    name = null;
  }

I’m awaiting an empty anonymous async function. But it’s enough to push the rest of reset() into a task which is added to the event queue: the name = null is now going to be run in a new thread (I know “not really”, but the semantics are the same: it’s a new thread which is now waiting to be run on our single-core CPU). The test() method is the same as before and doesn’t care about waiting for reset(). If I run this, the assignment of name = null now happens after test() has returned, and so test() returns true, so the program prints out Condition was true. foo. No code has changed in myMethod. If I reversed those two lines of code in reset(), the assignment to name would happen synchronously.

In a nutshell, I have a data race between myMethod() reading name, and reset() writing name. What is needed is a good old mutex:

import 'package:mutex/mutex.dart';

class MyClass {
  MyClass(this.name);
  String? name;
  final _lock = Mutex(); // guards every read and write of name

  Future<void> myMethod() async {
    await _lock.acquire();
    try {
      if (test()) {
        print('Condition was true. $name');
      } else {
        print('Condition was false. $name');
      }
    } finally {
      _lock.release(); // released even if the body throws
    }
  }

  bool test() {
    reset(); // not awaited: reset() now queues up behind the lock
    return name != null;
  }

  void reset() async {
    await _lock.acquire();
    try {
      name = null;
    } finally {
      _lock.release();
    }
  }
}

void main() async {
  await MyClass('foo').myMethod();
}

Because myMethod acquires the lock before the if condition, the async reset has to wait until myMethod releases the lock before it can do the assignment to name.
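There is nothing magic inside such a lock. As a rough illustration, here is a dependency-free sketch of one; to be clear, SimpleLock and its run method are my own invention and are not how the mutex package is implemented. It relies on the fact that an async function runs synchronously up to its first await, so each caller joins the queue in call order.

```dart
import 'dart:async';

// A hypothetical, minimal async lock: each caller waits for the previous
// caller's critical section to finish before starting its own.
class SimpleLock {
  Future<void> _tail = Future<void>.value();

  Future<T> run<T>(Future<T> Function() criticalSection) async {
    // Everything up to the first await runs synchronously, so the queue
    // order is the call order.
    final previous = _tail;
    final done = Completer<void>();
    _tail = done.future;
    await previous;
    try {
      return await criticalSection();
    } finally {
      done.complete(); // let the next waiter in, even on error
    }
  }
}

Future<List<String>> runDemo() async {
  final lock = SimpleLock();
  final log = <String>[];

  Future<void> section(String label) async {
    log.add('enter $label');
    await Future<void>.delayed(Duration.zero); // yields mid-section
    log.add('exit $label');
  }

  await Future.wait([
    lock.run(() => section('A')),
    lock.run(() => section('B')),
  ]);
  return log;
}

Future<void> main() async {
  print(await runDemo()); // [enter A, exit A, enter B, exit B]
}
```

Without the lock, the two sections would interleave at the await: both would enter before either one exits.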

Coming up for air

How can any of this have any bearing on real life, normal non-pathological code?

In my app, let’s say the user taps a button. Some event handler I’ve set up starts running. That event handler makes an HTTP call to my server, which is async. So at this point the rest of the event handler goes into a new task that gets added to the event queue, freeing up the thread so that the UI can continue to be rendered and respond with low latency. Before the reply comes back from the server, the user taps on something else, and some other event handler starts running. If both event handlers are mutating the same state, I’ve quite likely got a race condition. It doesn’t matter that Dart/JavaScript is single threaded. That merely means I don’t have parallelism going on. I still have concurrency going on, and that needs to be managed.
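Here is a sketch of that scenario boiled down to a few lines. Everything in it is hypothetical: cart stands in for some shared app state, fakeServerCall for the HTTP request, and onAddTapped for the event handler.

```dart
// Shared mutable state, as some app-wide state object might hold it.
List<String> cart = [];

// Hypothetical stand-in for an HTTP call to the server.
Future<void> fakeServerCall() =>
    Future<void>.delayed(const Duration(milliseconds: 10));

// The event handler: read state, await the network, write state.
Future<void> onAddTapped(String item) async {
  final snapshot = cart; // read
  await fakeServerCall(); // yield: other handlers can run now
  cart = [...snapshot, item]; // write, based on a possibly stale snapshot
}

Future<List<String>> simulateTwoTaps() async {
  cart = [];
  // The second tap arrives before the first handler's reply comes back.
  await Future.wait([onAddTapped('apples'), onAddTapped('pears')]);
  return cart;
}

Future<void> main() async {
  print(await simulateTwoTaps()); // [pears]: the 'apples' update is lost
}
```

Both handlers took their snapshot of the empty cart before either one wrote to it, so the second write silently discards the first. No threads, no parallelism, and still a lost update.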

If none of this is news to you then I genuinely apologise for wasting your time: this could well just be a confusion in my head caused by not thinking and reading enough in the first place. But I’ve never really heard people who write single-page-apps, or write mobile apps, talking about locks, mutual exclusion, data races, or concurrency. So it’s possible it’s not just me who’s failed to realise all this.

Although the error message early on in this post caused me to start wondering about all of this, it was only when I was reading the docs for bloc that the penny started to drop. Having built one big(gish) thing in React, I can definitely appreciate the problem that BLoC is trying to solve: decoupling building the widget tree from any business logic around your state. Having spent some time working with this model, I quite like it.

Quite early on, it started to feel to me like the BLoC object is rather like an actor: it receives events from various different sources (the user, the network, timers etc), and needs to combine these events to come up with some consistent state that is then made available as the widget tree gets built. Actors are always single threaded things. They process one message completely, updating their own state as necessary, before going back to their mailbox and getting the next message. But in the docs I started coming across statements like:

By default, events will be processed concurrently.

“Huh?! What does that mean? This is single-threaded isn’t it?!” And so slowly I dug in, and started to learn, and remember. In the specific case of Flutter’s bloc, there’s a good blog post on this subject which shows it’s even more subtle than I anticipated. For bloc, I don’t need to explicitly use mutexes: there are other mechanisms available. Whether you’re using bloc, or Redux or something else, the key thing to think about is this: when handling one event, if your code awaits a future, does that allow the next event to be dequeued and start being processed? If it does, then you might be in need of some locks, or other changes.
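For contrast, the actor-style behaviour I had been expecting (one event processed to completion before the next is dequeued) can be sketched in plain Dart using a stream as the mailbox. This is not bloc’s API or implementation; CounterActor and its methods are hypothetical names. The sketch relies on await for not taking the next event from the stream until the loop body has finished, awaits included.

```dart
import 'dart:async';

// A hypothetical actor: events go into a mailbox and are handled strictly
// one at a time, even though each handler awaits part-way through.
class CounterActor {
  CounterActor() {
    _finished = _run();
  }

  final _mailbox = StreamController<int>();
  late final Future<void> _finished;
  int total = 0;

  Future<void> _run() async {
    // `await for` does not take the next event until this body completes.
    await for (final amount in _mailbox.stream) {
      final seen = total; // read
      await Future<void>.delayed(Duration.zero); // yield mid-handler
      total = seen + amount; // write
    }
  }

  void send(int amount) => _mailbox.add(amount);

  Future<void> close() async {
    await _mailbox.close();
    await _finished;
  }
}

Future<int> runDemo() async {
  final actor = CounterActor();
  actor.send(1);
  actor.send(2);
  actor.send(3);
  await actor.close();
  return actor.total;
}

Future<void> main() async {
  print(await runDemo()); // 6
}
```

The handler has the same read, await, write shape that loses updates when handlers overlap, yet nothing is lost here, because the next event is not dequeued while the current one is still being handled.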

In general, I don’t think you can avoid having to protect all mutable state in some way, because as soon as you await anything, you’ve yielded the CPU. So any event handler which reads some state, sets off some async action, awaits the result of that and then mutates state, has to be prepared for the possibility that everything mutable it read before the async action may have changed in the meantime. To be clear, I’m absolutely not suggesting that it’s necessary to block UI rendering whilst you wait for some network request to complete: it’s not necessary. But I am saying that the interleaving of suspended and partially complete event handlers can easily lead to data races and nonsensical state if you don’t take care to create regions of mutual exclusion where necessary.
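Mutual exclusion is not the only way to be “prepared” in this sense. Another option is to notice, after the await, that the world has moved on, and to discard the stale result. The following is a sketch of that idea with made-up names (SearchModel, fakeSearch); it models two overlapping search requests where the older one happens to come back last.

```dart
// Hypothetical stand-in for a network search with a given latency.
Future<String> fakeSearch(String query, Duration latency) async {
  await Future<void>.delayed(latency);
  return 'results for "$query"';
}

class SearchModel {
  int _latestRequest = 0;
  String results = '';

  Future<void> search(String query, Duration latency) async {
    final myRequest = ++_latestRequest; // remember which request this is
    final response = await fakeSearch(query, latency); // yield
    // After the await, check whether a newer request has started. If so,
    // this response is stale and must not overwrite the newer state.
    if (myRequest != _latestRequest) return;
    results = response;
  }
}

Future<String> runDemo() async {
  final model = SearchModel();
  await Future.wait([
    model.search('da', const Duration(milliseconds: 50)), // older, slower
    model.search('dart', const Duration(milliseconds: 10)), // newer, faster
  ]);
  return model.results;
}

Future<void> main() async {
  print(await runDemo()); // results for "dart"
}
```

Without the check after the await, the slow reply for the older query would arrive last and overwrite the results for the newer one. Which approach fits best depends on whether the handlers must all take effect in order (a lock or queue) or whether only the latest one matters (a check like this).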

Thus I don’t think async/await spares us from any of the dangers of concurrency. I’m not sure anyone really claimed it did; it’s more that I’ve just not seen them manifest before. At least as far as Dart and JavaScript are concerned, it merely limits parallelism. It’s probably not the case that a correctly written Dart/Flutter program would still be correct if every task really was run in parallel in its own thread on a multi-core CPU, but I don’t think it’s particularly wide of the mark. Although I accept this may just be my experiences/biases talking, I personally would rather see explicit rather than implicit management of threads, and ideally green-threads as in Go/Erlang etc. Still, having now learnt more, hopefully there’ll be fewer surprises to come.