the imitation game has been won


…and it turns out it never even mattered.

in 1950, Alan Turing proposed an "imitation game" to test whether a machine could be told apart from a person. he was, notably, not the first person to ask if machines could be intelligent — René Descartes had raised that question all the way back in 1637, before any sort of modern computer had even been built — but his take on the idea is the one we most often think of today. Turing proposed a game in which an interrogator, communicating only through written messages, tries to tell whether the party on the other end is a computer or a person.

the first cases of computers producing convincing conversational output can be traced all the way back to ELIZA in 1966. of course, ELIZA worked by staying very limited: it mostly turned the user's own statements back at them within a narrow script, a trick many chatbots would keep using to seem more convincing for over 50 years afterward. these chatbots could convince a fool, but they would never convince someone who wasn't one.

but with the development of large language models in the late 2010s and early 2020s, things changed. perhaps the sharpest turning point was the release of GPT-3.5 and its chat interface, ChatGPT. a computer could now, without a doubt, generate text indistinguishable from a human's.

truly, this is a monumental occasion, is it not? this is what we had all been waiting for. the moment that a machine could write like us, surely that would mean it must be intelligent like us too! we've finally created a truly sapient machine, a general artificial intelligence, right?

…right?


unfortunately, it seems that centuries of speculation were simply wrong. from philosophy to science fiction, every assumption we made missed the mark. these parrots have an answer for everything; but they show no signs of true intelligence.

remember how a large language model works. it's just a statistical model: given everything written so far, it assigns a probability to every possible next token and picks from the most likely ones, over and over. it's less like thinking, and more like memorizing every book in the library and repeating things you read. this method works incredibly well at mimicking the look and feel of text, nearly indistinguishable from how a human would write. but its greatest weakness is an inability to learn the way a person does.
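to make that concrete, here's a toy sketch of that loop. every word and probability below is invented purely for illustration; a real model computes these numbers with a huge neural network over tokens, not a hand-written table. but the shape of the process is the same: look up how likely each next word is, pick one, repeat.

```python
import random

# toy "language model": for each current word, an invented probability
# table over possible next words. a real model computes probabilities
# like these from the entire preceding text; the values here are made up.
NEXT_WORD_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the":     {"giraffe": 0.5, "library": 0.3, "machine": 0.2},
    "a":       {"giraffe": 0.4, "machine": 0.6},
    "giraffe": {"spoke": 0.2, "ran": 0.8},
    "library": {"closed": 1.0},
    "machine": {"spoke": 0.7, "ran": 0.3},
    "spoke":   {"<end>": 1.0},
    "ran":     {"<end>": 1.0},
    "closed":  {"<end>": 1.0},
}

def generate(max_words: int = 10) -> list[str]:
    """generate text by repeatedly sampling a likely next word."""
    word = "<start>"
    output = []
    for _ in range(max_words):
        probs = NEXT_WORD_PROBS[word]
        # sample the next word in proportion to its probability
        word = random.choices(list(probs), weights=list(probs.values()))[0]
        if word == "<end>":
            break
        output.append(word)
    return output

print(" ".join(generate()))  # e.g. "the giraffe ran"
```

it produces strings that look locally plausible, with no idea what a giraffe is. that's the whole trick, just at an enormously larger scale.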

people are really good at learning. any time you read something, hear something, or see something, you're also processing it, thinking about it, and probably remembering some of it for later. but once you close out a conversation with a large language model like ChatGPT, it forgets all of that. it only seems to remember things because the last few messages are fed back in with every request, a short buffer and nothing more.
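here's a rough sketch of why it still looks like memory. `call_model` below is a placeholder, not any real library's API; the point is that the only "memory" is a list of recent messages the program re-sends on every single turn, trimmed to a fixed size.

```python
# sketch of how a chat interface fakes memory. the model itself keeps
# nothing between calls; the program just replays the recent history.
MAX_MESSAGES = 20  # the short buffer: only this many recent messages fit

def call_model(messages: list[dict]) -> str:
    """placeholder for a real language-model call; it sees only `messages`."""
    raise NotImplementedError

def chat() -> None:
    history: list[dict] = []
    while True:
        user_text = input("you: ")
        history.append({"role": "user", "content": user_text})
        # keep only the last MAX_MESSAGES entries. anything older is
        # simply gone, which is why long conversations get "forgotten".
        history = history[-MAX_MESSAGES:]
        reply = call_model(history)
        history.append({"role": "assistant", "content": reply})
        print("bot:", reply)
```

close the program and `history` is gone; nothing the model "learned" from you survives the session.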

this also means that anything that isn't well-represented in training data will present a challenge to these systems. this "spotless giraffe" effect — where these systems shy away from the unusual and unknown — is especially obvious when compared to people, who are generally drawn to what's new and different.

truly, to be a person is to describe a spotless giraffe.