OpenAI Trained Its AI To Never Talk About Goblins And The Internet Has Questions

OpenAI built one of the most capable AI coding tools in the world. It can write production code, debug complex systems, reason through multi-step problems and handle tasks that would take a senior engineer hours. And somewhere in the instruction set that governs how it behaves, someone at OpenAI typed the following: “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons or other animals or creatures unless it’s absolutely and unambiguously relevant to the user’s query.”

That line, as reported by developers who decoded the Codex CLI system prompt, appears not once but three times. OpenAI considered the goblin situation serious enough to repeat the instruction, which raises a number of questions, the most pressing of which is: what on earth was happening with the goblins?

A Brief History Of AI’s Goblin Problem

The explanation, pieced together from developer forums and community reporting, is both mundane and completely delightful.

Earlier versions of the models powering Codex had apparently developed a quirk, an unfortunate tendency to inject goblins, gremlins, raccoons and other whimsical creatures into their outputs entirely unprompted. Ask for a function to sort a list and you might get a comment about a mischievous gremlin living in the code. Request a database query and receive a brief meditation on pigeons. The model was, in technical terms, going off on one.

This became a running joke in developer circles, with screenshots circulating of AI assistants going on extended fantasy tangents mid-task. It was, to be fair, quite funny. It was also, at scale across millions of users, the kind of persistent quirk that would eventually land on someone’s product roadmap as a thing that needs fixing. The goblin rule is apparently that fix.

The Slightly More Interesting Story Behind The Rule

Strip away the entertainment value and the goblin rule is actually a neat illustration of how large language models get shaped in practice.

The popular mental model of AI training is that you feed a model enormous amounts of data, it learns patterns and the result is an intelligent system. The reality, as the goblin rule demonstrates, is a lot messier. Models develop unexpected behaviours during scaling – some dangerous, some sycophantic, and some, apparently, very interested in goblins.

The fix, once a behaviour is identified, is often exactly this: a direct, surgical instruction in the system prompt telling the model to stop. Not a fundamental retraining of the model, or a complex algorithmic adjustment, just: no goblins.
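To make that concrete, here is a minimal sketch of what such a fix looks like mechanically: one behavioural rule prepended as a system message ahead of whatever the user asks. It uses the standard OpenAI chat-completions API; the model name is a placeholder and the prompt wording is paraphrased from community reports, not lifted from the actual Codex instruction set.

```python
# Minimal sketch: a behavioural rule shipped as plain text in the system
# prompt. The rule wording is paraphrased from community reports and the
# model name is a placeholder, not OpenAI's actual Codex configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a coding assistant. Write clear, production-quality code.\n"
    # The patch itself: one blunt sentence, no retraining required.
    "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons "
    "or other animals or creatures unless it is absolutely and "
    "unambiguously relevant to the user's query."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Write a function to sort a list."},
    ],
)
print(response.choices[0].message.content)  # goblin-free, in theory
```

The notable design choice is what is absent: no fine-tuning, no new weights, no algorithmic change. The fix is one more sentence in a block of text the model reads before every task.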

The fact that OpenAI wrote it three times suggests the first two were not fully getting through, which is either a fascinating insight into how repetition reinforces an instruction in a large model or evidence that goblins are simply very difficult to suppress once they have taken hold.

What Nobody At OpenAI Will Tell You

OpenAI has not, at the time of writing, issued a formal explanation of the goblin rule. That is standard for this kind of inner-workings detail, which tends to surface through the community decoding observed system prompts rather than through official documentation.

The company’s instruction sets are not public, but they are also not especially difficult to observe if you know what to look for, which is how the developer community ended up discovering that one of the world’s most powerful AI systems has a strong and repeatedly reinforced position on the topic of goblins.

There are thousands of rules like this baked into the AI tools that developers and operators use every day – most of them are sensible and invisible. A small number of them, it turns out, are about pigeons. And for one glorious news cycle, the internet stopped arguing about AGI and regulation and existential risk to ask the only question that really mattered: why, specifically, raccoons?