Reactionary Visions

Sat, 17 Jan 2015

Free speech values

In the aftermath of the attacks on Charlie Hebdo in Paris, there has been some high quality thinking and writing. There's also been some really stupid things said, from the usual protagonists. It's an interesting facilitation that the internet now provides: as I no longer watch any news on TV (in fact I don't watch any news at all), nor subscribe to any newspaper, I'm used to reading articles from a wide range of sources. Equally, it's much easier for me to avoid opinion I disagree (right-wing press) with or trivialised dumbed-down reporting (e.g. BBC news). Because of this ease of reading what you want to (in both the good and bad sense), I thought a lot of the reaction was measured and sensible. Turns out I was just unaware of most of the reaction going on.

Anyway there seems to be virtually nothing left to say on this, so this post is really little more than a bunch of links to pieces I thought were thoughtful and well written.

I am an agnostic. I personally don't believe in any religion but I accept I can't prove that every religion is false and so I may be wrong. I tend to treat the beliefs of any religion as arbitrary and abstract ideas. Thus one place to start is the acknowledgement that the laws of any country or civilisation are as arbitrary and ad-hoc as the rules or teachings of any religion. They are just things that people choose to believe in, or follow, or not violate. In the UK, some of our law is based in Christianity (e.g. thou shalt not murder - though I've no idea whether ideas like that actually predate Christianity; I wouldn't be surprised if they do) though other parts of Christianity are not encoded in law (adultery is not illegal, for example).

Many have careless labelled these attacks as an attack of free speech and thus that the reaction is about defending free speech. As such, much has been written about how it's possible to defend the right Charlie Hebdo has to publish anything they want, even if it's offensive, even whilst criticising their choice of content.

Freedom of speech is, I believe, essential for any sort of democracy to work. This doesn't suggest that if you have freedom of speech then you have a democracy (I don't believe in the UK, or arguably anywhere in Europe or north America there is functioning democracy; merely systemic corporate dictatorships disguised by corrupt elected faux-representatives), but without freedom of speech, you certainly don't have any ability to transparently hold people to account and thus corruption and abuse will obviously take hold. But freedom of speech is a choice, it's not something axiomatic to existence; it's something many people have chosen to attach huge importance to and as such will defend. But just because there are widely agreed reasons to defend the concept of freedom of speech doesn't mean that it's self-evidently a "better" idea than not having freedom of speech. To judge something as "better" than something else requires all manner of discussion of the state of human existence. Consequently, criticising people for not holding freedom of speech in the same regard as we claim to do is no different from criticising people for their choice of clothing, or housing, or diet, or career, or religious views.

Of course in the UK, we don't have freedom of speech as an absolute concept nor does most of Europe. In the UK, "incitement to racial hatred" was established as an offence by the provisions of §§ 17-29 of the Public Order Act 1986, and The Criminal Justice and Public Order Act 1994 made publication of material that incited racial hatred an arrestable offence. It's not difficult to find stories of people getting arrested for saying things on twitter, the easiest is of course making bomb threats. UKIP (an extremist right wing political party in the UK) even managed to get police to visit a man who'd posted a series of tweets fact-checking UKIP policies. Thankfully, he wasn't arrested.

The USA appears to hold freedom of speech as a much more inviolable concept. For example:

The ACLU vigorously defends the right of neo-Nazis to march through a community filled with Holocaust survivors in Skokie, Illinois, but does not join the march; they instead vocally condemn the targeted ideas as grotesque while defending the right to express them.

But whilst the outpouring in Paris and the crowds gathered as a statement of unity were warmly defiant, it is somewhat likely that rather more than physical violence that was being defied, and more than freedom of speech defended by the crowd. As Brian Klug wrote:

Here is a thought experiment: Suppose that while the demonstrators stood solemnly at Place de la Republique the other night, holding up their pens and wearing their “je suis charlie” badges, a man stepped out in front brandishing a water pistol and wearing a badge that said “je suis cherif” (the first name of one of the two brothers who gunned down the Charlie Hebdo staff). Suppose he was carrying a placard with a cartoon depicting the editor of the magazine lying in a pool of blood, saying, “Well I’ll be a son of a gun!” or “You’ve really blown me away!” or some such witticism. How would the crowd have reacted? Would they have laughed? Would they have applauded this gesture as quintessentially French? Would they have seen this lone individual as a hero, standing up for liberty and freedom of speech? Or would they have been profoundly offended? And infuriated. And then what? Perhaps many of them would have denounced the offender, screaming imprecations at him. Some might have thrown their pens at him. One or two individuals — two brothers perhaps — might have raced towards him and (cheered on by the crowd) attacked him with their fists, smashing his head against the ground. All in the name of freedom of expression. He would have been lucky to get away with his life.

It seems that some things you can say and governments will try and protect you. It would appear in much of Europe that blaspheming Islam is legally OK. But, as noted above, saying other things will get you arrested. The French "comedian" Dieudonné was arrested just 48 hours after the march through Paris on charges of "defending terrorism". Whilst not arrested, the UK Liberal Democrat MP David ward tweeted "Je suis Palestinian" during the Paris marches and criticised the presence of the Israeli prime minister, Netanyahu, subsequently eliciting a complaint from the Israeli ambassador to Britain. Of course the "world leaders" who gathered in Paris have a wonderful record themselves on "protecting" free speech.

Charlie Hebdo did not practise satire by mocking select, specific targets, nor did they mock all major religions equally. It's been widely reported that they sacked a cartoonist for making an anti-Semitic remark and that:

Jyllands-Posten, the Danish newspaper that published caricatures of the Prophet in 2005, reportedly rejected cartoons mocking Christ because they would "provoke an outcry" and proudly declared it would "in no circumstances ... publish Holocaust cartoons".

But of course it comes down to the content of the publication. In this case the cartoons exist to ridicule, make fun of and offend members of one of the world's largest religions, Islam, by mocking their prophet Mohammed. As Amanda Taub writes:

Dalia Mogahed, the Director of Research at the Institute for Social Policy and Understanding, explained that Mohammed is a beloved figure to Muslims, and "it is a human impulse to want to protect what's sacred to you".

Mogahed compared the cartoons to the issue of flag-burning in the United States, noting that a majority of Americans favour a constitutional amendment to ban flag-burning for similar reasons: the flag is an important symbol of a national identity, and many Americans see flag-burning as an attack on that identity, or even on the country itself. That's not extremism or backwardness; it's about protecting something you cherish.

In any large group of people, there will be the vast majority of sound mind and thought, and a small minority who are not. This is just the fact that all over the earth, humans are not all the same: there is some variance in health, intelligence, and every other aspect of what a human is. Any large sampling of humans will show the same set of variations. So if you offend a huge group of people, you will offend tall people and short people, rich people and poor people, fat people and thin people, violent people and peaceful people. Unsurprisingly, it would appear that the background of these killers suggests there is little to do with Islam there, and more to do with the their upbringing, family, education and integration with society.

Thus even if you feel Charlie Hebdo's publications of the cartoons served some purpose (given their biased choice of target, their purpose does not seem to be an exercise in itself of freedom of speech), it should be obvious that by offending so many people, they were placing themselves in danger. The same is true of any sustained, systemic, deliberate offence to any of this planet's major religions, races, nationalities or any other grouping of humans which share values. So it becomes a balancing act between how much do you believe in the message you're publishing versus the risk you're putting yourself in. You can view the actions of Edward Snowden in this same context: he felt that the message he was delivering on the abuses of surveillance power carried out by governments across the world outweighed the significant danger he was putting himself in, and so both delivered the message and accepted the need to flee from his country, probably never to return, in fear of the consequences of his actions.

Thankfully, throughout history, there have been people who have chosen to put themselves in the path of great harm (often losing their lives as a result) in order to report, expose, document and publicise matters which the wider public needed to know. Governments, monarchies and empires have crumbled when faced with popular revolt.

So freedom of speech requires consideration. It is perfectly reasonable not to say something because you anticipate you won't enjoy the consequences. Most of us do not conduct our lives by going around saying anything and everything we want to our friends and family: if we did, we'd rapidly lose a lot of friends. The expression "biting your tongue" exists for a reason. Equally, it's perfectly reasonable for a news outlet to decide not to re-publish the Charlie Hebdo cartoons if they fear a violent response that they suspect the local police forces cannot prevent; not to mention just not wanting to offend so many people.

I view as daft the idea that people should never choose not to publish something out of fear. People absolutely should choose not to publish, if they feel the risk to the things they hold dear is not outweighed by the message they're delivering. Everything in life is a trade-off and every action has consequences. Whilst I agree with the right to free speech, that does not imply saying anything you like is free of consequences. If it were, it would require that words have no meaning, and subsequently all communication is void: if anything you say has no consequence then you can say nothing.

I am certainly not suggesting the murders were in any way justified, or that Islam or any other religion or collection of humans should be beyond criticism or even ridicule. At the end of the day, no human is perfect, and as such we can all benefit from a thorough dose of criticism once in a while. Every article I've linked to in this post repeats that such violence, regardless of the provocation, is abhorrent, and I agree with that: murder is never an acceptable response to any drawing, written or spoken word. But that doesn't mean that these events weren't predictable.

Finally then we get to the insanely idiotic response from the UK government. That MI5 should have more powers that they don't need (they probably just need more money), and that we must deny terrorists "safe space" to communicate online. Which means banning encryption, which means it's impossible to use the internet for anyone. The home secretary, Theresa May said:

We are determined that as far as possible there should be no safe spaces for terrorists to communicate. I would have thought that that should be a principle ... that could have been held by everybody, across all parties in this House of Commons.

So of course, if terrorists can't communicate in private then no one can. Quickly, we've immediately gone from lazy labelling of events as an attack on free speech to a knee jerk response of "free speech yes, but you certainly can't have free speech in private, because you might be a terrorist". Again, it's a trade-off. I doubt that having such restrictions on communication will make the earth or this country safer for anyone and of course the impossibility of a controlled study means it cannot be proven one way or another. No security service is ever going to be flawless and from time to time very horrible things will continue to happen. I think most people are aware of this and accept this; we're all going to die after all. The loss of civil liberties though is certainly far more worrying to me.

In theory, I would think these proposals so lunatic as to never see the light of day (it would be completely impossible to enforce for one thing - terrorists along with everyone else would learn to use stenography to encode their messages in pictures of cats, thus rendering their traffic no different to that of everyone else). Sadly, Labour have stated they don't believe their position to be that far away from the Conservatives, which is deeply worrying. Labour don't exactly have a great record on this area either given their previous ID card schemes and the introduction of detention-without-charge. What is needed is some transparency. We need an informed debate, with MI5 and GCHQ providing some independently verifiable facts and figures that demonstrate how they are being thwarted in what they're trying to do. We need to understand properly what the risk to us is, and most importantly we need to understand why these threats exist and what else we can do to make them decrease.

I've never seen it said that any UK Government policy in the last 15 years has made the UK less of a target for such attacks. Maybe we should look at that before we start subjecting ourselves to Orwellian control.

posted at: 17:35 | path: / | permanent link to this entry

Sun, 19 Oct 2014

Anonymity in public life

In an article published in the Guardian yesterday, author Kathleen Hale recounts how her first book got some negative reviews by reviewers on a book review website. One reviewer in particular upset her and Kathleen ends up figuring out the reviewer is using a false identity, finds out who the reviewer really is and confronts her. The piece doesn't read to me like some sort of valedictory "I outed a fraud" type piece (though there are some passages in there which are questionable in that direction) and equally there are several passages where Kathleen expresses deep embarrassment and regret for the course of action she took. This episode, and that article in particular has caused substantial reaction: currently 600 comments on the Guardian article plus several other blog posts. There's no shortage of opinion to be found on Twitter either, as you'd expect.

The course of action that Kathleen took seems to be fairly undisputed as far as I can find. There is some dispute from some of the other blog posts as to exactly what was tweeted and said by whom, and there is dispute over Kathleen's claim that there are factual inaccuracies made in a review of her book. It is not disputed that the reviewer was using a false identity and that the reviewer had at least public Twitter, Facebook, and Instagram accounts under the false identity. The false identity was also a real name (Blythe Harris), by which I mean a name which if you introduced yourself by that name, no one would think you're using a false identity. This is distinct from claiming to be Peter Rabbit, or Buzz Lightyear.

Many people have equated Kathleen's actions with stalking. My dictionary defines the verb to stalk as:

  1. to follow or approach (game, prey, etc.) stealthily and quietly
  2. to pursue persistently and, sometimes, attack (a person with whom one is obsessed, often a celebrity)
  3. , 4,... [not relevant]

The second item there certainly fits. The British legal approach, whilst it gives no strict definition gives examples and guidance:

....following a person, watching or spying on them or forcing contact with the victim through any means, including social media.

The effect of such behaviour is to curtail a victim's freedom, leaving them feeling that they constantly have to be careful. In many cases, the conduct might appear innocent (if it were to be taken in isolation), but when carried out repeatedly so as to amount to a course of conduct, it may then cause significant alarm, harassment or distress to the victim.

I'm glad it includes "social media" there. Some comments have suggested that stalking "in real life" is worse than online. This seems bizarre to me: as if through a computer you are not interacting with other human beings but merely with shiny pixels who have no emotional capacity. "In real life" is everything we know. Whilst we're alive we have no personal experience of anything other than "in real life".

So I'm fairly sold on the whole argument that Kathleen's behaviour towards this reviewer can be considered stalking and as such is reprehensible.

To me, the far more interesting issue is the use of anonymity, false identities and any realistic expectation we have of privacy on the internet. A number of people who claim to write book reviews on such sites have suggested that the behaviour of Kathleen is exactly why they write their reviews under false names. I think there's something of a contradiction going on here.

But let's work backwards. Firstly, Kathleen, through some social engineering (she requested from the book review site the address of the reviewer so that she could post her a copy of the book) got the address of the book reviewer. She then used a telephone directory and census results to identify who really lived there (or likely owned the land). Now the use of the telephone directory seems a bit odd to me: telephony directories map names to numbers (and maybe addresses). Yes, you could use it to map an address to a name but it's very inefficient: you're essentially searching through the whole directory looking for the address whilst the directory is sorted by name, not address. So unless it was a very small telephone directory, I don't really buy that. Using census results is far more creditable: they're public documents and when they're online, they do allow you to search by address. In the UK you can only get access to the raw census details 100 years after the census has been published which, to a high probability, rules it out as a means to tie an address to a person who's still alive. You can get statistics and aggregates from more recent census results but you can't get the raw data. I'm assuming that in the US there's no such restriction on access to raw census data. If there is then I don't understand how Kathleen really managed to get a name for the owner of the property.

Instead, in the UK, if you want to find out who owns some land, you can pay the land registry £3 and they'll tell you. Presumably there are means by which you can legally hide this; I'm sure the rich have figured this out - probably some method by which some fake company in a tax haven technically "owns" the land and as they're registered abroad, they don't have to divulge any further details about that company. So yes, you could argue the Land Registry is profiting from facilitating stalkers, but equally there are a bunch of legitimate reasons to need access to such data and I can't think of any sane way to ensure the use of such a service isn't abused. So from that I conclude that unless the owner is a millionaire, the owner of any land is public knowledge.

The use of social engineering to get the address in the first place is more interesting but very obvious. This sort of thing happens a lot and sometimes to horrifying consequences (e.g. the Australian DJs who phoned up a hospital, pretending to be the Queen and Prince of Wales, enquiring as to the health of the Duchess of Cambridge. The nurse fell for the hoax and put the call through. Three days later, the nurse committed suicide). As a species we are not good at taking the time to verify who we're talking to or why. Whilst (hopefully) most of us would hang up if our bank apparently rang us and then asked for our credit card details "for security" this is largely only because it's in the bank's interest (in terms of cost of insurance) to reduce fraud, so they've trained us as such. But in all sorts of other scenarios we implicitly trust people we've no real reason to. A simple example: ticket inspectors on public transport. They may be wearing the uniform, but it could be faked. With their travel-card readers they could be seeing who has the expensive yearly travel cards, scanning the unique numbers from them and then using them to program up fraudulent cards. The crypto on those things is notoriously weak. Has anyone ever requested some means to verify the identity of a ticket inspector? And even if you could, how do you know they're not crooked regardless?

So phoning someone up, impersonating someone else, or pretending to have valid reasons to request the information you're requesting is always likely to work. It might be illegal in some cases, but it's certainly human nature to try to be helpful and if you're given a plausible justification, on what basis could you refuse the request unless it's contrary to some sort of company policy? In this case, if you're concerned about anonymity, wouldn't you be concerned about this possibility, and make use of an anonymous mail box?

Article 8 of the European Convention on Human Rights guarantees an individual's right to respect for privacy and family life, including correspondence. Is privacy the same as anonymity? No, definitely not:

In conflating anonymity and privacy, we have failed to see an important factual difference between them: under the condition of privacy, we have knowledge of a person’s identity, but not of an associated personal fact; whereas under the condition of anonymity, we have knowledge of a personal fact, but not of the associated person’s identity

The vast violations of our lives by state surveillance as revealed by Snowdon over the last year demonstrates the whole-scale collation of everything we do online and off by our governments. This is both being able to observe an action and identify the individual who caused it (thus we have no hope of performing any action anonymously), and being able to observe an individual and know the actions they take (thus no privacy). I can't work out whether the ECHR has anything to say on a right to anonymity; I get the sense that it doesn't try to protect that. So that's basically saying: "the state shouldn't record your every move (as that's an invasion of privacy), but moves that we're interested in, we can know who did them". Of course, we now know they're recording everything anyway.

We also know that computer systems can always be hacked into - there is no real security anywhere. Given a skilled and sufficiently funded adversary, any computer system connected in any way to the internet can be hacked into. Why? Because humans wrote the software that runs on those computers and humans are incapable of writing bug-free software. Look at all the large scale data breaches in recent history. Nothing is secure.

So we have laws that seem to try and protect privacy, but they're violated by our own governments, and in any case, we have countless examples of our inability to store any information securely. So is there really any hope to be able to exist with anonymity on the internet?

As ever, it depends who your adversary is. If your adversary is a government (either your own or some foreign government) then no, you have no hope. If it's a previous partner of yours who has no particular computer training, then yes, you're probably going to have a reasonable chance of being anonymous for a while. But you need to read up on this and think hard: it's not a trivial undertaking. There are some good guides as to how to do this, but:

All writers - whether writing under their own names or not - should be aware of the risks they may incur by hitting 'publish'.

What is the effect of hitting "publish"? It's to put more data points out there which may lead people to be able to identify you. The fewer data points out there, the better. So coming back to our book reviewer, if you want to review books anonymously, and if your justification for acting anonymously is to avoid being stalked by authors who don't like your reviews, then why put so many data points out there? Why have the Facebook page, the Instagram profile with the faked photos, the Twitter account? Why give your real postal address to the book review club knowing they're going to post books to it and might conceivably give your address out to other people?

The social media accounts in particular I find most odd. If you want to review books then review books. Build your following, your reputation and cachet on the quality of your reviews. If I'm looking at a book review I really don't care where you went on holiday, what your tweets are, or how many pets you have. Putting that information out there undermines your entire justification for being anonymous: if you want to be anonymous (i.e. you don't want people to find out who you are) then why are you putting so much unnecessary information out there that may allow people to figure out who you are?

Equally, use a name that clearly communicates to me you're trying to be anonymous: call yourself TheBookReviewer53, DostoyevskyLover or OrwellWasRight. Doing so doesn't lessen the validity of your opinions on your chosen subject and is more honest with people reading your reviews: it's overtly saying "I have reasons to want to exist anonymously on the internet". It reveals nothing more about your real identity either: regardless of the obvious fictitious-ness of your online persona, if you can be found, you can be found.

Researchers show that four data points about a person’s location can identify that person with 95% accuracy. FOUR. You think you can tweet anonymously from your phone? You think apps like Whisper allow you to act anonymously? As with pretty much everything related to the internet and computing, unless you've spent the last 20 years of your life working with computers, studying computers and thinking very hard about threat models and what data you're putting out there, and are utterly paranoid, you basically haven't got a chance. Do you turn off wifi on your phone when you leave the house? You should. You trust that USB pen drive you're transferring documents on? You shouldn't.

Finally and most obviously, any attempt at anonymity clearly doesn't insulate you from the law. As members of various hacking groups such as lulzsec found out, you always can be found out by law enforcement agencies. Yes, you might be able to make it difficult for a poorly funded person to come after you for libel (which is really just an indictment of the disgusting relationship between justice and money) but it's quite a risk to take. If you wouldn't put it in print with your real name attached, you're placing an awful lot of trust on your ability to maintain your anonymity against an adversary you probably don't know as well as you need to.

posted at: 16:00 | path: / | permanent link to this entry

Sun, 05 Oct 2014

Programming in the real world

One of the things that annoys me about Object Oriented Programming is how it's often suggested that it models the "real world". Frequently tutorials will start with creating an object modelling a chair, and through inheritance you'll be able to build up composable aspects of chairs: different numbers of legs, different colours, different designs. Sometimes they use tables rather than chairs. This is lovely, but it actually has everything to do with data modelling through inheritance, decomposition, abstraction and encapsulation, and almost nothing to do with Object Orientation: the key is that these chairs have no modifying methods on them. If they have any methods at all then they'll be for things like getting the number of legs or the colour, or volume or something - something that is basically fixed once the object is instantiated. At this point in such tutorials I'd probably claim this is not actually programming yet: all that's been achieved so far is that we've assigned some semantics to some numbers held in memory and we can write some numbers in memory. Programming is when we manipulate numbers: that involves reading and writing numbers.

The problem then is that Object Orientation immediately stops being about modelling the "real world" as soon as we can modify memory. If we think about how we actually would go about getting a chair made for us, it could go a bit like this:

  1. Go see your local carpenter,
  2. Have a discussion with them about the style and type of chair you'd like,
  3. They make the chair and give it to you in return for payment,
  4. You take the chair home,
  5. You decide you don't like the colour so you take it to your garage and repaint it yourself.

It should be clear that the inanimate object (the chair) is the odd one out here. Everything else is done by actors that have their own state, mainly act asynchronously, and can communicate with other actors through protocols - protocols that do not involve sharing mutable state (e.g. if I say something to you, that speech is immutable; you can't change what I've said (though you could choose to mishear me!)). At no point is any state of any actor actually exposed to another actor: I may share with you what I'm currently thinking, and you can try to influence me, but we don't exactly need a mutex around memory in my brain because YOU'RE NOT GETTING IN THERE!

If you tried modelling this sort of thing through Object Orientation without actors then you'd end up with your own thread doing all the work: it'd be you, it'd be the carpenter and it'd be the chair, maybe all at once. If your carpenter is in fact a growing business with a receptionist, a design team and a billing department your thread would be playing those roles too and would probably have to use locks to avoid unexpected interactions with other threads doing the same commissioning-receptioning-designing-constructing-delivery-repainting dance. And all the time, whilst you're doing the carpentry yourself, you'd could easily have your own thoughts, feelings, aspirations and regrets all on the same stack for your carpenter-alias to mess with.

Thus Object Orientation causes multiple personality disorder.

So in my view, the way Object Orientation gets introduced tends to be more like "useful tools for modelling data". But the OO approach to manipulating that data goes wrong as soon as you try to model the animated real world. Firstly it has nothing to say about separating out threads to self-contained actors (but try this in a language or on a platform without green-threads, or without the ability to preempt threads and you can quickly hit pain), and secondly even if you do have actors, OO encourages the sharing of mutable data rather than passing around either immutable data or copies of data. Yes, good programming discipline can result in sane designs and a successful result, but it's not a core aspect of the OOP mantra.

So, OOP has nothing good to say on manipulating data at all - it either says nothing or it encourages silly ideas like using locks. The data modelling bits are fine, but I think they're a broader concept beyond the confines of OOP. What else does OOP get you? An arbitrary restriction on the receiver of any method. That's about it. It's thanks to this restriction that writing combinators like cons on a list library in an OO language is really painful.

This week Erik Meijer wrote an article called The Curse of the Excluded Middle: "Mostly functional" programming does not work. After an introduction, we get onto The Problem, which (paraphrasing) is that languages that are mainly imperative but offer some features from pure functional languages are not as safe as pure functional languages.

The first three examples, from C# are certainly surprising to me (I barely know any C# at all though). The first two problems come from trying to compose side-effecting stuff with laziness. In the first case it's not clear that the problem is with the IO operation (printing things out) or actually with the laziness, but more the odd behaviour of the Where operator (presumably the implementation of Where doesn't know that a Cartesian product isn't necessary, but surely any normal monadic/list-comprehension implementation wouldn't have this problem?). The second case is certainly the terrifying composition of laziness with throwing exceptions and thus the exception having the potential to pop out anywhere where the lazy expression gets forced. However, if you know the Select operator is lazy, it's not really that surprising. It's arguably piss-poor language design that there's nothing there to help you, but C# doesn't have checked exceptions; apparently programmers don't like having to deal with errors so you reap what you sow. The third case is how C# has a nice using feature which binds a resource to a lexical scope. But if you construct a closure capturing the resource and then send it out of that lexical scope then using goes wrong (it will still discard the resource even though there's a reference to it within the closure which remains in-scope). This is certainly piss-poor language design: if the closure captures stuff from your lexical scope and you're not reference counting (or equivalent) your lexical scope then YOU'VE DONE IT WRONG. This is as bad as in C allocating stuff on your stack and then returning pointers to it.

Next he moves on somewhat tangentially to the point that if object creation is an observable action then you can't optimise it out. I'm not sure anyone outside a pure functional language specialist would ever want object creation to be optimised out, but the point is that if your constructor has side effects or can in any other way be observed then you can't have your language runtime do memoization of object creation. Doing side effects in object constructors has long been discouraged: I first read that back in the Effective Java book about a decade ago and I'm sure it wasn't exactly a ground-breaking piece of advice then.

So far then we have that side effects which are untracked have the potential to be bad: whether it's printing things out, or throwing exceptions, or discarding resources early, or preventing compiler optimisations. But next I feel the article goes a bit wrong. He first moves onto how channels in C⍵ can store state so they're not pure either, thus bad. And then goes onto how in Erlang you have the same problem as you're just modelling mutable state in actors:

Note how this Erlang actor basically encodes an object with dynamic method dispatch using the pattern-matching, message-sending, and recursion primitives of the language, which you may happily leverage to implement mutable references, sabotaging the fact that the Erlang language does not natively expose mutable state.

This is wrong: you cannot implement mutable references in Erlang. Data is immutable in Erlang so if you send some value out of an actor, you are sending that value. Not a reference to a value or variable. Even if you create a closure and send that out of the actor, the closure is capturing those values as they exist at that point in time. If you have received a value sent to you from an actor, you may use it to create other values, but doing so does not affect the "original", and similarly, the actor itself can continue to modify its own state, but it does not affect the values it sent to you. Yes, you can use Erlang actors to model objects. But an actor's own modifications of its state cannot be observed as side effects on values you've previously retrieved from that actor, and vice versa.

The reference you have to an actor is a process identifier (also immutable) which does not present any information itself about the state of the actor. Through that, you can send messages to an actor and test whether or not the actor is still alive, but that is all. And in any case, where has the sudden objection to mutable state come from? State is just a catamorphism on prior inputs. State is not the problem: unconstrained side effects are the problem. Certainly sharing mutable state is a problem (and you could argue that mutating shared state is a side effect and that it should be tracked statically), but Erlang does not allow for that.

He may have been better off going for an example of opening an file, sending the file handle to another process and then closing the file handle before it's been used (i.e. the same as the third C# example). Except:

  1. All file operations can return an error anyway so handling errors in such code is completely normal;
  2. In Erlang a file handle is normally an actor itself, so what you're doing is passing around a process identifier. Sending messages to a dead process (once the file is closed) is a normal activity and you can detect if the process has died in normal ways;
  3. If you bypass such normal file handling for performance reasons and open the file in "raw" mode then Erlang has a light form of object capabilities in which only the process that opened the file is allowed to use the file handle, so again the use of the file handle would error predictably;
  4. The language doesn't have the same C# feature for discarding resources once you return out of a lexical scope. Consequently closing a file is an explicit operation and given the asynchronous concurrent mindset one develops when working in Erlang, it's very likely you'll realise how odd it is to be closing a file handle whilst there's some closure out there which may not have been run yet.

Beyond this, he introduces the Haskell type system and explains that it captures side effects statically. As a result, by bowing to the demands of the type checker, it offers you a proof that if such effects occur, your program will handle them: exceptions will not go uncaught, IO operations are only permitted where the semantics lead to expected outcomes, resources are not used after they're discarded and the compiler can use all these proofs to do all manner of optimisations to your program.

These proofs can certainly be very valuable (though they are no substitute for disciplined, high quality design and careful implementation). Obviously, they don't capture everything though. Particularly relevant for concurrent and distributed programs, they don't capture sufficient side effects to allow for a proof of the absence of deadlocks. Haskell standard libraries contain channels and semaphores which can easily be used to sporadically end up with a deadlock between processes. A deadlock is definitely a side effect: the effect is the program probably stops working. The cause is an insufficient use of locks to control the scheduler (be it scheduling of OS threads or language runtime scheduling of green threads).

More broadly, the proof a type checker offers is that the specification you've provided (type signatures) is not violated by its inferences about your code. Until the type checker allows "and makes progress" as part of a specification, Haskell itself is no safer than any other language that claims to be "mostly functional".

posted at: 10:22 | path: / | permanent link to this entry

Sat, 27 Sep 2014

Reading highlights this week

I remember when I first went to University, coming across people who were both clearly extremely expert in their fields, from whom I wanted to learn, but were also religious, and how this utterly baffled me. At that point I would cheerfully describe myself as an avid atheist. My ignorance and naivety was somewhat extensive.

Over a decade later I like to think I have a more nuanced view. The most recent war in Gaza led, obviously, to vast amounts of suffering but some excellent articles on the subject (this one by Hadley Freeman in particular) helped me see perspectives more clearly and articulated how crucial it is to be precise with criticism: are you criticising a religion, a people, a government, a policy or something else? Nothing is ever black-and-white and it seems increasingly important to anticipate the consequences of an ill-thought-through comment or reaction. A good example of that is George Galloway's comments this week in the debate about this country once again getting involved in Iraq. On the face of it, and certainly without being remotely well-enough informed to evaluate the accuracy of his claims, if his claims on the size and makeup of ISIS/ISIL are true then there seems little likelihood that the bombing campaigns being discussed will be effective, and quite likely counter-productive. But all of that got lost due his description of Iraqis as quiescent. The way in which that description was seized upon by other MPs and the resultant media storm resulted in the over-shadowing not just of the rest of his contribution to the debate, but also of other important aspects of the debate, such as the resignation of Rushanara Ali (Labour's Shadow Minister for Education), citing once again the lack of a credible long-term plan for the region and our involvement.

Addressing the broader and somewhat more abstract issue is this enlightening article by Karen Armstrong. Again, I'm not claiming to be expert in the area, merely I found the article very educative. It had barely occurred to me that the western world's separation of the secular from the sacred was firstly such a recent occurrence, and secondly that it arose from a specific set of circumstances. There is no implicit reason why separation of state from church is an inevitable or even likely happenstance (to me, this reminds me of the question "if humans evolved from monkeys, then why can't we find monkeys still evolving into humans today?", to which the answer is "the circumstances are not right for that to occur"). The fact that the English word "religion" can't really be translated accurately into other languages (especially not languages that predate English such as Greek or Latin; as historically faith is all encompassing of life, not merely a private affair as we treat it today in the west) starts to show quite how odd the separation of secular from sacred in the modern west really is.

More interesting still is the observation that in the west, belonging to a Nation has in some ways subsumed the role of belonging to a Religion, only apparently with more positive overtones: we consider it almost reprehensible to die for your religion, but honourable to die for your nation. It would seem the concept of even belonging to a nation and having any sense of greater community outside your immediate surroundings only came about with the increased ability of governments to engage with (or intrude upon) their citizens. Before that point, presumably with church attendance widespread and frequent, one's interaction with "the wider world" was through the representative of the church. This would seem to explain a lot about why governments of the past sought the blessing of their nation's church for particular courses of action: maybe the church was seen as the bridge between the government (or monarchy) and the people. The whole article is worth a read.

posted at: 15:08 | path: / | permanent link to this entry

Sat, 20 Sep 2014

Concurrency, Actors, Locks and mailboxes

Having been part of the original team that wrote RabbitMQ (indeed I wrote the very first prototype, back in the summer of 2006), and having worked full time with Erlang since 2009 until fairly recently, it's been interesting doing some work in Go recently.

Go's a language I currently have mixed feelings for. In some ways I like it - it's simple, it doesn't take too long to learn, the syntax is clean, the tool chain is superb (compilation is quick), performance is very good, it's very easy to drop into C whenever you really need to. It also has one very nice high end feature: first class channels - you can pass channels over channels which is pretty powerful. But equally, it's statically typed and has a terrible type system (i.e. no generics. Personally, I don't feel like the proofs offered by type checkers are worth much to me so I'd much rather have no static type checking than one that is this primitive and brain dead), it's not extensible properly (e.g. you can't create your own data structures which work with the range) and worst of all, there's no pattern matching. The lack of pattern matching is particularly horrific given Go's "best practise" of returning tuples from functions (except they're not really tuples - another mistake in the language), the right most of which indicates an error or success. But you can't pattern match on the assignment so you end up with endless if-statements that explicitly check the error for nil. There are other irritations which I've found, particularly related to its locks package (i.e. non-re-entrant; can't upgrade read to write; waiting for a write blocks all attempts to gain reads. Yes, I know I'm free to implement my own locks package if I want to).

Go also doesn't push you to using actors - you have to re-implement all that yourself if you want to. In a recent project, I started off with some locks and within about three days found it utterly impossible to reason about which go-routines can hold which locks and whether or not there's any deadlock potential. Inevitably, there was. So I ripped all the locking code out and wrote my own actor loops.

This was quite interesting as here I could now be more flexible than Erlang. I think most people think that actors mean "only one thread/process/routine can read and write its state" - there is a key concept of owning that state and regulating access to it. However, what I found is that I actually only needed to limit modifying the state to a single go-routine: each actor-receive-loop routine would take a write lock on its state whenever it needs to modify its own state, but it's perfectly reasonable to have anyone read the state, provided they take a read lock before doing so. The fact we can share pointers in Go makes this possible, whereas it's impossible to do this in Erlang (well, not quite - if you use ets then you can do it, which is exactly what we do in RabbitMQ in the rabbit_msg_store - but it's certainly not pretty!). So now we can have concurrent reads and no need to pass read-requests over a channel/mailbox. This seems pretty nice to me.

Recently I was reading a paper and it suggested that:

In message passing systems, processes interact exclusively by sending and receiving messages and they do not have access to shared memory.

Firstly, on a very technical note, they do have access to shared memory - the mailbox or queue is exactly that. The key reason why it leads to more composable systems is that when you hold the lock to write into a mailbox, you can never do anything other than write into that mailbox - you can never try to acquire multiple locks, so you can't deadlock in this way. And that's even assuming you're using locks for mailboxes - queues make lovely structures for lock-free concurrent access.

Secondly, as I suggest above, it appears to be safe to allow multiple concurrent readers of an actor's state, provided modifications to the state are done atomically by the actor thread - though more care has to be taken now to ensure updates are consistent - you have to make sure you update all the state you need to change in one go under a write lock (the sort of transactional semantics you end up needing to ensure makes me heavily think of STM). Whilst I would probably still call such a system a "message passing system" I can certainly imagine others would disagree and at a minimum it's some sort of hybrid (you could argue that the side effect of releasing the write lock when you've finished modifying the state is to publish an immutable copy of the state to any and all subscribers that want it - except without all that overhead. When viewed in these terms, it makes more intuitive sense that it's safe - provided of course that you don't do anything blocking whilst you're holding a state read-lock). This design also seems to get a fair bit trickier once you get to distributed systems and the need to have proxy objects representing the state and interface of a remote actor. By comparison, in Erlang a reference to an Actor is an immutable process identifier of Pid which is easy to send around and reason about.

But mainly I was thinking about the pattern of data flow: a mailbox allows multiple writers to send data to a single reader (a gather operation, maybe). The actor loop allows the exact opposite: a single reader of the mailbox can then affect multiple things (a scatter) - either by sending out messages to many other actors (in essence, a push action), or by (as I suggest above) modifying state which can be concurrently read by many other actors (correspondingly, a pull action). In my mind's eye, I see a sort of concertina effect as all these messages are pushed into a mailbox, and then slowly the effects of each message spread out again. In some ways it seems slightly odd how powerful this is, but in other ways it makes perfect sense: if you consider a finite state machine then your mailbox is just the stream of events coming into it and you have your little automaton updating the state with some operation combining the current state with the current message. It is the very fact that the next state is dependent on the current state and the current message that requires mutual exclusion around modifying the state. And of course by ensuring that that mutual exclusion lock is (implicitly) held in absence of any other locks that makes actor systems so much easier to reason about and understand - any deadlocks that occur are at the protocol level and, if you model your protocols between actors properly, can be determined statically (though I'm not aware that anyone actually does this - false positives may abound).

This then made makes me think about how, once all actors have done their initialisation and reached the core actor loop, the entire system is purely event driven. When looked at like this, are we really sure actors are enough? Are there not other forms of expression that capture the relation between events as inputs, with state, and an output more cleanly? In particular I'm thinking of things like Join calculus and Functional Reactive Programming. Given that actors are apparently becoming rather more mainstream these days, I wonder if that really means they're only part of the solution: sure I can write large distributed systems that scale, perform well, don't deadlock or livelock and are exceedingly robust. But I can I write them with less code and cleaner semantics?

posted at: 15:02 | path: / | permanent link to this entry