Engineers, researchers, and intrepid hobbyists of all sorts have proven that the classic first-person shooter Doom can be played on almost anything, including a lawnmower and even gut bacteria. On Wednesday, Adrian de Wynter, a principal applied scientist at Microsoft, proved that the popular AI chatbot ChatGPT can play Doom—it’s just not very good at it.
Seeing what devices and other contraptions can run Doom has become an increasingly popular pastime for hackers, researchers, and tech enthusiasts. To get the chatbot to play the game, de Wynter paired ChatGPT with OpenAI’s multimodal GPT-4V (Vision).
The results of the Doom/ChatGPT experiment showed that despite the advances in GPT-4 and its vision-enhanced variant, the AI model could not independently run Doom due to limitations in input and image rendering.
“For example, if the model fell into an acid pool, and then got stuck on a wall, it would ‘forget’ that it is taking damage because of the acid,” de Wynter said, “and then get stuck and die.”
Another issue facing de Wynter was the AI model’s habit of hallucinating and making up explanations for its actions, or falsely claiming that it had completed an action. That left Doom’s Space Marine at the mercy of rampaging monsters.
GPT-4, de Wynter explained, managed to get to the last room in the game… but only once. Doom’s simplicity and portability, he said, make it easy to work with, and its open-source nature makes it a useful benchmark for measuring intelligent agents, because the game requires heavy reasoning capabilities, such as planning in the heat of the moment.
“It’s interesting!” de Wynter told Decrypt’s GG. “It did originate mostly as a meme (‘Can my toaster run Doom?’) due to its portability and open-source code. That’s mostly why it stays as the game of choice.”
De Wynter emphasized that the project was done solely in his capacity as a researcher at the University of York, and is not related at all to his work with Microsoft.
“Debugging took a lot of time. I normally dumped the frames and just went over them to make sure nothing was breaking,” he said, noting constant issues, including the model trying to get out of the map through the window. “Eventually I gave up and turned the frames into GIFs.”
De Wynter’s project is just the latest in a series of experiments that aim to play Doom in unusual places.
Last year, after the launch of the Ordinals protocol, a stripped-down version of Doom was inscribed on the Bitcoin blockchain as Inscription 466. Earlier this year, a similar project added a full-fledged version of Doom to the Dogecoin blockchain.
While this AI attempt at playing Doom may be a one-off, de Wynter said he has ideas for future gaming experiments using large language models (LLMs).
“My main research interest is related to LLM reasoning and planning capabilities, so games, in general, are an excellent testbench for this,” he said. “Strategy games are a bit off the table at the moment, but I’m wondering whether simpler games (or other models) could yield better results.”
Edited by Andrew Hayward