Pseudo-Random Musings

I’ve just read about a thing called the Dice-O-Matic. The gist is that the operator of GamesByEmail.com requires a lot of random numbers between one and six inclusive to feed his collection of online dice-games. And inevitably, people have complained that the numbers he’s used are insufficiently random.

And maybe they were, once. Originally, GamesByEmail used the pseudo-random number generator built into whatever the games are written in. Once ‘seeded’ with a starting number, such an algorithm will spit out a string of numbers which will have all the same properties as random numbers, except that if you know the seed, they’re totally reproducible (although still essentially unpredictable, much like the digits of π*). They’re generally seeded from a high-resolution timer, so this should never be a problem. They also repeat if you run them for long enough, so you should re-seed periodically. In theory, this should be fine, but you have to be very careful not to accidentally bias the selection.
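To make the seeding and bias points concrete, here's a minimal Python sketch. It isn't GamesByEmail's code (I have no idea what theirs looked like): the names roll_dice, naive_die and fair_die are made up for illustration, and Python's random module stands in for "whatever the games are written in". The first part shows that the same seed always gives the same rolls. The second shows one classic way to bias a die by accident: squashing a byte (0 to 255) onto six faces with a remainder, which favours the low faces because 256 isn't a multiple of 6.

```python
import random
from collections import Counter


def roll_dice(seed, count):
    """Roll `count` dice from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


# Same seed in, same "random" rolls out.
print(roll_dice(12345, 10) == roll_dice(12345, 10))  # True


def naive_die(byte):
    """Map a byte (0..255) onto a die face using the remainder. Biased."""
    return byte % 6 + 1


# Feed every possible byte through once and count the faces.
# 256 = 6 * 42 + 4, so faces 1-4 turn up 43 times and faces 5-6 only 42.
naive_counts = Counter(naive_die(b) for b in range(256))
print(sorted(naive_counts.items()))


def fair_die(rng):
    """Draw bytes until one lands below 252 (a multiple of 6), then map it."""
    while True:
        byte = rng.randrange(256)
        if byte < 252:
            return byte % 6 + 1


print([fair_die(random.Random(1)) for _ in range(3)])
```

Throwing away the four leftover byte values (252 to 255) and drawing again is the usual fix, at the cost of an occasional extra draw.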

Unfortunately, it’s very difficult to tell if your numbers are random enough or not. For example, some episodes of the dreary logical fallacy roadshow that is Deal Or No Deal used an Excel spreadsheet to randomise the assignment of 22 sums of money to 22 boxes – for which there are probably more sequences than there are grains of sand in the world – and the seeding was bad enough that only twelve of them arose in over forty shows. The producers switched over to drawing lots by hand. You can experience something similar for yourself: whether by accident or design, the Concentration mini-game in Super Mario Brothers III only ever shows players eight out of a possible 58 billion permutations of cards.
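For a sense of scale on the Deal Or No Deal boxes, and of how a poor seed shrinks it, here's a rough Python sketch. It does not reproduce Excel's generator or the show's actual spreadsheet. The 12-value seed is my own stand-in for "bad seeding", and shuffled_boxes is a made-up helper.

```python
import math
import random

BOXES = 22

# Number of distinct ways to assign 22 sums to 22 boxes: 22 factorial.
possible_orderings = math.factorial(BOXES)
print(f"{possible_orderings:.3e} possible orderings")


def shuffled_boxes(seed):
    """Return the box order produced by a generator started from `seed`."""
    rng = random.Random(seed)
    boxes = list(range(1, BOXES + 1))
    rng.shuffle(boxes)
    return tuple(boxes)


# Assumption for illustration: the seed can only ever take 12 values
# (say, a very coarse clock reading), however many shows are recorded.
SEED_VALUES = 12
shows = [shuffled_boxes(show % SEED_VALUES) for show in range(40)]
print(len(set(shows)), "distinct arrangements in", len(shows), "shows")
```

However good the shuffle itself is, it can never produce more arrangements than there are seeds to start it from.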

xkcd's 'Random' comic illustrates the difference between actual randomness and unpredictability, which is far more useful.

Deal Or No Deal needs one random draw a day. You can do that with paper and a hat. It's no problem. Gaming websites need loads of random numbers coming in very fast, so you need to somehow automate it, and in a way your customers can trust. Some websites have their randomness externally vetted. But that's (presumably) expensive, so (I infer) GamesByEmail switched to using random.org for their random numbers.

Random.org link to their own story of a quiz show failing to randomise, this time costing them $100,000 in prize money (not that it brought anyone any happiness), and solve the problem of generating random numbers by means of four cheap radio antennae in Dublin, tuned into nothing in particular. The waveform of the white noise between radio stations is recorded, and the least significant bit (the last digit in binary; 0 for even numbers and 1 for odd) is kept. Then, the stream of bits is chunked into pairs, so 01001101 would become 01 00 11 01. 00 and 11 would be discarded as insufficiently random, and the first digits of the remaining pairs would be kept, so 01001101 gives two zeros. They throw away about 97% of the radio data, keeping only the most unpredictable bits possible. Your TV does a similar thing in reverse, when it blocks out random data and replaces it with a blue screen, while foolishly allowing Deal Or No Deal through unimpeded.

It's as near to pure randomness as you'll get without invoking quantum theory (which states that some events in the universe are totally random, and indeed you can buy modules for your computer to generate random numbers in this way).

Of course, people still complain about the numbers from random.org. Of course they do. Random numbers, by their very nature, don't look random. People believe in winning streaks, lucky socks, and prayer for exactly this reason.
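The pairing trick described above for the radio bits is usually credited to John von Neumann. Here's a small Python sketch of just that step, assuming the bits arrive as a list of 0s and 1s and that each bit is independent of the last. von_neumann_extract and biased_bits are my own names. This is not random.org's code, and whatever else they do to the data isn't shown.

```python
import random


def von_neumann_extract(bits):
    """Pair up bits; drop 00 and 11; keep the first bit of 01 and 10."""
    kept = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            kept.append(first)
    return kept


# The example from the text: 01 00 11 01 leaves two zeros.
print(von_neumann_extract([0, 1, 0, 0, 1, 1, 0, 1]))  # [0, 0]


def biased_bits(count, p_one, seed):
    """Made-up lopsided source: each bit is 1 with probability p_one."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_one else 0 for _ in range(count)]


raw = biased_bits(100_000, p_one=0.8, seed=2024)
clean = von_neumann_extract(raw)
print("share of ones before:", sum(raw) / len(raw))
print("share of ones after: ", sum(clean) / len(clean))
print("fraction of bits kept:", len(clean) / len(raw))
```

With a source that gives 1 about 80% of the time, the surviving bits come out close to half ones and half zeros, but only around 16% of the raw bits survive. The more lopsided the source, the more pairs get thrown away.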
If I recall correctly, ball 44 was well known for a time in the National Lottery because it came up more than the others in the first few weeks, even though actually there were several sets of balls in use.

Partly this is because humans have evolved to be shit-hot at spotting patterns, because in the wild that can stop us being killed. Natural selection favours the caveman who won't eat the same berries that Ug, Thag and Og ate right before they died. In fact, generally people will eschew the berries after just one person dies. That's a good plan for surviving in the wild, but it does make us spot patterns where none exist.

Try it. Have random.org roll 16 virtual dice for you. I did it, and the sequence started 1155. That doesn't look random. It had a 123 in it too. And there was only one 4. People tend to think numbers are random if they're uniform: if I shuffled the numbers 1--6 into a random order (say, 341625), people would rather believe that was the result of six dice rolls than 115561, the first six that random.org gave me -- but really the odds of getting one of every number in six rolls are less than 2% (6!/6^6 = 720/46,656, or about 1.5%).

If you encourage people to spot patterns, they can be relied upon to do so, regardless of whether the patterns exist. B F Skinner demonstrated this in pigeons in 1947. Pigeons were put in cages and fed periodically, "with no reference whatsoever to the bird's behaviour". At least six out of eight of them became totally convinced that they could cause food to be delivered by repeating some arbitrary motion such as turning anticlockwise.

This has been replicated with humans, perhaps most famously by Derren Brown in Trick Or Treat, proving that Channel Four cater for both ends of the intellectual spectrum. Five guests were put in a room full of toys and instructed to accumulate 100 points to win a prize. In fact the points counter was controlled by two fish swimming around at random in another room (i.e., a poisson distribution).
At the end of the game, four of the five guests were totally convinced they'd figured out a sure-fire way to score points. The other guest was Doctor Who. This may or may not be significant.

Random.org solved this problem by running constant statistical tests on their numbers. The numbers are expected to pass these tests most of the time -- but not too often, or else that would be suspicious.

GamesByEmail.com felt they needed something a bit more accessible to the kind of person who plays dice-games on the internet, so they built the brilliantly terrifying "Dice-O-Matic Mark II". It is, in their words, "a 7 foot tall, 104 pound, dice-eating monster, capable of generating 1.3 million rolls a day". It is literally a massive machine full of dice, which scoops them up, flashes them past a camera which notes down what numbers they show, and then flings them onto a ramp, whence they bounce back into the "pure seething violence" of the hopper full of dice ready to go round again. It runs about 90 minutes a day, and you can tell when it's running from two rooms away. (It also uses some image processing which I found interesting because that's what I do. If you want to read about it, visit GamesByEmail's page.)
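Going back to random.org's statistical tests for a moment: as a flavour of what such a test can look like, here's a sketch of a chi-squared frequency check on a batch of dice rolls, in Python. It's one textbook test, not a description of what random.org actually run. chi_squared and looks_fair are names I've invented, and 11.07 is the standard 5% cut-off for five degrees of freedom.

```python
from collections import Counter

# Chi-squared critical value for 5 degrees of freedom at the 5% level.
CUTOFF_5_PERCENT = 11.07


def chi_squared(rolls):
    """Measure how far the face counts stray from an even six-way split."""
    counts = Counter(rolls)
    expected = len(rolls) / 6
    return sum((counts[face] - expected) ** 2 / expected for face in range(1, 7))


def looks_fair(rolls):
    """True if the counts are within the 5% cut-off for a fair die."""
    return chi_squared(rolls) <= CUTOFF_5_PERCENT


# Two hand-made batches of 600 rolls: one perfectly even, one loaded towards 6.
even_rolls = [1, 2, 3, 4, 5, 6] * 100
loaded_rolls = [6] * 350 + [1, 2, 3, 4, 5] * 50

print(chi_squared(even_rolls), looks_fair(even_rolls))
print(chi_squared(loaded_rolls), looks_fair(loaded_rolls))
```

A perfectly good source will still fail this check about one time in twenty, which is the point made above: never failing would be as suspicious as failing constantly.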

Ironically, I suppose, the Dice-O-Matic is technically less random than the random.org numbers were: dice and coins never have an exactly even chance of landing on any given side (coins in particular are usually biased), although you can correct for this by taking pairs of tosses, just as random.org do with their binary data. It's a great PR move, though. After all, nobody can say it's not a realistic simulation of dice: it is dice. But it neatly demonstrates the problem faced by people like lottery organisers: their job is to provide people with something people are practically designed not to be able to see. This may be why GamesByEmail add:

There is no doubt that I will still receive complaints about the rolls, but now I can honestly say I have done all that I can possibly do: the rolls you get are exactly as random as those you would get throwing by hand. As I promised earlier, if you donate to the site and are unhappy about the rolls, let me know and I will pull a die out of the machine, melt it flat and mail it to you, as an object lesson to the other dice.

*Probably. It has never been proven that π behaves in this way.