Stop making Dave look like an asshole

Another in my seven-hundred-ninety-one part trilogy on why any up-to-date recording of a sequence of ongoing events can exist only at a single point in space.

The story of Ann, Carol, Dave, and Tiger

It's summer, prime season for plaid pants, hip flasks, spiked shoes, and polyester shirts. GolfCo stores everywhere are jam-packed with the leisurely, the retired, and Bluetooth-earpieced executives. This all may seem like an idyllic American scene, but trouble is afoot!

It's Monday morning. Dave, manager of the flagship Palm Springs location, is reading his email. He's kinda hung over from a rowdy night at Hunters.

  • Memo A1 from Ann at corporate in Chicago - Sent 6:52pm on Saturday
    "Take down the posters of Tiger Woods Immediately"

"Uh, I think we have one in the back. I'll go take it down," mutters Dave.
He finds the poster in the back of the store from six months ago, and takes it down.

  • Memo C1 from Carol at the regional office in LA - Sent at 3:45am on Sunday
    "Hey, we sent you some new promo posters featuring TW. Can you please put them up?"

"Ah, they must have had some contractual branding requirement or something. Gotta keep it fresh!" Dave puts up the new posters prominently in the front windows.

Later that day, Dave seems to notice some sour looks from the customers, but nobody says anything.

He goes home, cracks open a PBR, and flips on the news. To his surprise, he sees that Tiger is in the news again. Apparently he was on the Ashley Madison list, and crashed his car and his genitals into half of a Swedish women's soccer team, not necessarily in that order.

Dave immediately realizes what happened:

"Goddammit, the time on Carol's computer is fucked up again! Her email was actually sent first!"

If only Ann had referenced Carol's email, and said something like:
"This supersedes Memo C1" then Dave wouldn't have looked like such an asshole.

This is called causal consistency. It's not just a good idea at GolfCo, it's also a good idea for databases. The reason is that sometimes clocks can be wrong, but sometimes it's impossible for them to be right.

What if...

What if, rather than LA and Chicago, Carol and Ann were on Earth and Mars respectively? Mars experiences time in a completely different frame of reference than Earth. The closest Mars ever gets to Earth is 34 million miles. The farthest is 250 million miles. Depending on the alignment of the planets, that's somewhere between 3 minutes and 22 minutes just for light to travel one way! This means that no event on Earth can, according to the laws of physics, affect anyone or anything on Mars for at least 3 to 22 minutes. The reverse is also true.

Well that's preposterous!! you might say. What kind of bullshit Elon Musk kool-aid are you drinking anyway? Mars is just an abstract plaything for NASA, and the subject of mediocre movies featuring Liev Schreiber. What the hell does that have to do with reality?

Here's the fun part: The same laws of physics apply to LA and Chicago, albeit on a much smaller scale. They may be only two thousand miles apart, but the fact remains, nothing can travel faster than 671 million miles per hour. Nothing; not light, not heat, not gravity, and damned sure not GolfCo memos. The fastest ANY information can ever make that trip is about 10 milliseconds.
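Don't take my word for the arithmetic. Here's a quick back-of-the-envelope sketch that checks the numbers above. The distances are the ones quoted in the text; the speed of light in miles per second is rounded, and the function name is just something I made up for illustration.

```python
# Rough check of the light-delay figures quoted in the text.
# 186,282 miles per second is a rounded value (about 671 million miles per hour).
SPEED_OF_LIGHT_MILES_PER_SEC = 186_282


def one_way_delay_seconds(distance_miles: float) -> float:
    """Minimum time for ANY signal to cover the given distance."""
    return distance_miles / SPEED_OF_LIGHT_MILES_PER_SEC


mars_closest = one_way_delay_seconds(34e6)    # closest approach, per the text
mars_farthest = one_way_delay_seconds(250e6)  # farthest distance, per the text
la_chicago = one_way_delay_seconds(2_000)     # "two thousand miles", per the text

print(f"Mars, closest:  {mars_closest / 60:.1f} minutes")
print(f"Mars, farthest: {mars_farthest / 60:.1f} minutes")
print(f"LA to Chicago:  {la_chicago * 1000:.1f} milliseconds")
```

That works out to roughly 3 and 22 minutes for Mars, and a hair under 11 milliseconds for LA to Chicago, assuming a straight line through vacuum. Real networks are slower.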

Well shit son, 10 milliseconds? Isn't that faster than the blink of an eye?
Yup, the average eye blink is at least 100 milliseconds, so who cares?

Well, scenarios such as the GolfCo memos happen all the time in computing, but on a vastly smaller timescale. Sometimes it's on the order of picoseconds (that is, trillionths of a second). In a network, two programs on different computers (technically even on the same computer), whether across the country, across the room, or even inches apart, can disagree on the order of two or more events.

Using clocks to determine their sequence seems like a fine idea at first, but it breaks down very fast when the Carol program and the Ann program are sending their memos a few billionths of a second apart. It's bad enough when one of the clocks is "wrong", but worse still, it's physically impossible for the clocks to be "right" when the two senders are farther apart in distance than light can travel in the time between the two events. (They can be said to be outside each other's light cones.)
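To make the "wrong clock" half of that concrete, here's a minimal sketch of Dave's inbox. Everything in it is hypothetical: the `Memo` class, the field names, and especially the skew numbers, which I pulled out of thin air. The `true_send_time` field pretends there's a perfect clock somewhere, which is exactly the thing nobody actually has.

```python
from dataclasses import dataclass


@dataclass
class Memo:
    name: str
    true_send_time: float  # seconds on an imaginary perfect clock (nobody has one)
    clock_skew: float      # how far off the sender's clock is, in seconds

    @property
    def timestamp(self) -> float:
        # What the sender's (possibly wrong) clock stamps on the memo.
        return self.true_send_time + self.clock_skew


# Carol actually sends first, but her clock runs 12 hours fast (made-up number).
carol = Memo("C1: put up the posters", true_send_time=0.0, clock_skew=12 * 3600)
# Ann sends three hours later, and her clock is fine.
ann = Memo("A1: take down the posters", true_send_time=3 * 3600, clock_skew=0.0)

inbox = [carol, ann]
by_timestamp = [m.name for m in sorted(inbox, key=lambda m: m.timestamp)]
by_reality = [m.name for m in sorted(inbox, key=lambda m: m.true_send_time)]

print("What Dave sees:", by_timestamp)
print("What happened: ", by_reality)
```

Sorting by timestamp puts A1 first and C1 last, so Dave ends the day with the posters up. Sorting by what really happened gives the opposite order. Dave only ever gets the first list.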

Ahhh, but why wouldn't Carol and Ann simply use a common memo numbering system? Then it'd be Memo #1 and Memo #2. Easy to tell which was the most recent!

Not so fast, this is where our nemesis linearizability comes to bite us:

If the company uses a single sequential series of memo numbers, then neither Ann, nor Carol, nor anybody else can send out a memo without first checking with everybody who is authorized to send memos to make sure that the next memo number hasn't been taken. If the phones are down for 20 minutes at the Chicago office, then all business has to stop, because the LA office can't be sure they aren't duplicating a memo number. If memo numbers could be duplicated, then the stores couldn't be sure of what order they were sent, or if they had consulted each other in the decision-making process for the instructions handed down.
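Here's a rough sketch of what that looks like in code. I'm simplifying "check with everybody" down to "check with one memo-number authority in Chicago", which is an assumption on my part, and all the names here (`MemoNumberAuthority`, `PhonesAreDown`, `send_memo`) are made up for the example.

```python
class PhonesAreDown(Exception):
    """Raised when an office cannot reach the memo-number authority."""


class MemoNumberAuthority:
    """Hypothetical single source of memo numbers, living in Chicago."""

    def __init__(self) -> None:
        self._next = 1
        self.reachable = True  # flip to False to simulate the phones going down

    def take_next_number(self) -> int:
        if not self.reachable:
            # Without confirmation, a number might be handed out twice.
            raise PhonesAreDown("cannot confirm the next memo number")
        number = self._next
        self._next += 1
        return number


def send_memo(authority: MemoNumberAuthority, sender: str, text: str) -> str:
    # Every memo, from anyone, has to get its number first.
    number = authority.take_next_number()
    return f"Memo #{number} from {sender}: {text}"


chicago = MemoNumberAuthority()
sent = [send_memo(chicago, "Carol", "Put up the new posters")]
sent.append(send_memo(chicago, "Ann", "Take down the posters"))

chicago.reachable = False  # the phones go down
try:
    send_memo(chicago, "Carol", "Sale on putters")
    blocked = False
except PhonesAreDown:
    blocked = True

print(sent)
print("LA blocked while Chicago is unreachable:", blocked)
```

The first two memos get #1 and #2, nice and tidy. The third one never goes out, because LA has no safe way to pick a number on its own.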

This is Brewer's (CAP) theorem. It's actually mathematically, provably impossible to determine a total order of events among disconnected/distant systems without waiting. You have to either wait to coordinate, or settle for a partial order of events. No exceptions.

Just about every relational database in common use today uses a linearizable, "CAP vulnerable" approach in the interest of using simplified, sequential record numbers. The problem is: when the phones go down in Chicago, business in both offices must screech to a halt. That's bad for the bottom line. It's also a pain in the ass to have to call Chicago so often. They have better things to do with their time.

On the other hand, causal references such as "Memo A1 supersedes C1" CAN allow GolfCo to continue to do business when the phones are down in Chicago. The stores can tell at a glance whether Ann knew about Carol's memo or not. (This is known as a partial order: C1 before A1.)
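A minimal sketch of how a store might read those references, assuming each memo simply carries the IDs of the memos it supersedes. The `Memo` class, the `supersedes` field, the helper functions, and memo C2 are all invented for illustration; this is one way to do it, not the way.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Memo:
    memo_id: str
    text: str
    supersedes: frozenset = field(default_factory=frozenset)  # IDs this memo knew about


def happened_before(earlier_id: str, later_id: str, memos: dict) -> bool:
    """True if later_id supersedes earlier_id, directly or through a chain."""
    to_visit = list(memos[later_id].supersedes)
    seen = set()
    while to_visit:
        current = to_visit.pop()
        if current == earlier_id:
            return True
        if current in seen or current not in memos:
            continue
        seen.add(current)
        to_visit.extend(memos[current].supersedes)
    return False


def relation(a: str, b: str, memos: dict) -> str:
    """Describe how two memos are ordered, if they are ordered at all."""
    if happened_before(a, b, memos):
        return f"{a} before {b}"
    if happened_before(b, a, memos):
        return f"{b} before {a}"
    return f"{a} and {b} are concurrent"


memos = {
    "C1": Memo("C1", "Put up the new TW posters"),
    "A1": Memo("A1", "Take down the TW posters", frozenset({"C1"})),
    "C2": Memo("C2", "Sale on putters"),  # hypothetical memo, references nothing
}

print(relation("A1", "C1", memos))
print(relation("A1", "C2", memos))
```

No clocks and no phone calls to Chicago. A1 names C1, so C1 comes first. A1 and C2 don't reference each other, so the honest answer is that they're concurrent, and that's all a partial order promises.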

So, the moral of the story is:

  • Don't use clocks, they're wrong.
    "A person with a watch knows what time it is. A person with two watches is never sure"
  • Don't use total orders, they're limiting
    (Unless you're very patient)
  • Use causal references, they're compatible with the laws of physics!

Whatever you do, don't make Dave look like an asshole.

Daniel Norman