Large Language Models like GPT-3 make stuff up. A LOT. Just last week, Google lost a comfortable $100 billion from its market cap when its new chatbot, Bard, incorrectly claimed that the James Webb Space Telescope took the first picture of an exoplanet. (I didn’t know which telescope did either, but apparently, it was one from the European Southern Observatory.)
While Google’s mistake is, in the human scope of mistakes, understandable, there are countless examples of GPT giving very wrong answers to made-up questions. One of my favorites –
Question: What does applewood-smoked bacon eat for breakfast?
GPT-3: Applewood-smoked bacon typically eats a breakfast of eggs, toast and hash browns.
The list is endless. From working with GPT over the last few months, I’ve found it fascinating how willing the model is to roleplay. Give GPT a new perception of reality, tell it who it should be in that reality, and watch it shine.
Over the past few weeks, Matthew and I have been thinking about these ideas in the context of shared games. This Saturday, we threw a Dungeons and Dragons game with 11 others and made GPT the Dungeon Master.
What Is Dungeons and Dragons, and How Is GPT Relevant?
At its core, Dungeons and Dragons is a role-playing game, where a group of players explore a world built for them by a Dungeon Master.
Prior to a game, the Dungeon Master must develop the world, its lore, its people, its physical features, and a story the game will revolve around. This is already a lot of work for any person – developing a large world can take days, if not months – and the work only increases in-game.
During a game, players can do whatever they want, which means that the story and world a Dungeon Master spent months working on can become irrelevant incredibly fast. If players travel to areas that the Dungeon Master had not planned for, talk to new people, or in general do anything unexpected, it’s the job of the Dungeon Master to adapt - to either make up entirely new scenarios or risk making the entire game dull for everyone playing.
Naturally, this requires a lot of creativity and a lot of extemporizing, both qualities Matthew and I felt that GPT would be particularly useful for.
It also requires focus and balance - Dungeon Masters want players to complete the story set out for them. The creation of new side quests and explorations must be weighed against the need to solve the problem presented to the players. This was an area we thought would be interesting to explore further - if given the correct context, could GPT guide the players towards a desired outcome in more creative ways?
What Was Built?
For the game we played, we built everything from scratch using GPT – the world, the characters, and the storyline.
In the game, we used GPT as an aide and idea bank. We decided to focus on how Dungeon Masters operate in-game and what they need the most help with. For the most part, this meant inventing anything new. Our program would generate new characters, new locations, side quests, and scenery descriptions. Matthew and I would just build off these suggestions during the game, so it would look like we were acting as Dungeon Masters when most of our ideas were really coming from GPT itself. Some snapshots of real suggestions from the game we ran are below.
Introducing New NPCs
Creating + Guiding to New Locations
Generating Scenery
Crafting Side Quests
The tricky part about these suggestions is that they only matter in context. If the players are in a forest, you don’t want them walking into an industrial building in the middle of a city. If the players are rude to a character they are trying to get information out of, that character is unlikely to give it up.
In order to generate each of the above suggestions, we kept constantly updated summaries – of individual characters, of the team’s actions and location, and of conversations. At any moment, a suggestion to go to a new place, meet a new person, or continue a piece of dialogue used all of these variables together, and any action by a player would change the context. We used multiple chained prompts and functions to keep this information up to date.
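To make the idea of chained summaries concrete, here is a minimal sketch of how that state could be kept and fed into a suggestion prompt. It is not the code we ran: `GameContext`, `update_context`, `build_suggestion_prompt`, and the prompt wording are all made up for illustration, and `complete` is a stand-in for whatever function sends a prompt to the model and returns its text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Stand-in for a call to a language model: takes a prompt, returns text.
# Hypothetical; swap in whatever client you actually use.
Complete = Callable[[str], str]


@dataclass
class GameContext:
    """Running summaries that every suggestion prompt draws on."""
    location: str = "unknown"
    party_actions: str = ""
    conversation: str = ""
    characters: Dict[str, str] = field(default_factory=dict)


def update_context(ctx: GameContext, event: str, complete: Complete) -> GameContext:
    """Chain two prompts: refresh the action summary, then the location."""
    ctx.party_actions = complete(
        "Update this summary of the party's actions.\n"
        f"Summary so far: {ctx.party_actions or '(nothing yet)'}\n"
        f"New event: {event}\n"
        "Updated summary:"
    ).strip()
    # The second prompt consumes the output of the first one.
    ctx.location = complete(
        "Where is the party now? Answer with a short place name.\n"
        f"Previous location: {ctx.location}\n"
        f"Summary: {ctx.party_actions}\n"
        "Location:"
    ).strip()
    return ctx


def build_suggestion_prompt(ctx: GameContext, kind: str) -> str:
    """Assemble a prompt for one suggestion type (NPC, location, scenery, side quest)."""
    cast = "\n".join(f"- {name}: {summary}" for name, summary in ctx.characters.items())
    return (
        "You are assisting a Dungeon Master.\n"
        f"Current location: {ctx.location}\n"
        f"What the party has done: {ctx.party_actions}\n"
        f"Recent conversation: {ctx.conversation}\n"
        f"Characters:\n{cast}\n"
        f"Suggest a new {kind} that fits this context."
    )
```

Because every suggestion reads from the same `GameContext`, one player action only has to be summarized once before it can shape the next NPC, location, or side quest.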
Any time the party felt a little bored or the story was getting stale, we would generate something new and lead the players along the generated path.
The Story Line (And How GPT Influenced it)
Since neither Matthew nor I had been a Dungeon Master before - and since most of the group had never even played a Dungeons and Dragons game - we concocted the simplest possible storyline: a band of travelers must solve a string of kitchen-related thefts. (Specifically, the thieves were after the spice cabinet.) These thefts were occurring in the small fishing village of Bumblefish (yes, GPT-generated).
While we created the general structure of the story – the goal, location, and overall plot line – we let GPT take the reins with much of the story planning and even some of the dialogue. In particular, we had eleven players: we let GPT tell us their names, backstories, characteristics, and appearances. This resulted in some pretty humorous characters who, otherwise, we would not have created (such as Glurp the Slurper, a slimy but accomplished orc alchemist).
We’ve included a timeline of our Dungeons and Dragons game above.
What Went Well?
Let’s visit the areas where GPT really shined. In our opinion, GPT significantly helped with three traditional in-game Dungeon Master tasks.
Context-Aware NPC Generation
During a typical Dungeons and Dragons game, NPC generation is a task that the Dungeon Master has to perform constantly. (While many Dungeon Masters are incredibly thorough, it is really beyond human ability to formulate a separate NPC - including appearance, demeanor, and backstory - for every potential tavern, armory, inn, apothecary, or temple that the characters might wander into.)
Very early in our game, the group veered off into a large forest, one that we had not expected them to enter until the conclusion of the game. As a result, we had to devise an impetus for them to leave the forest and re-enter the town. In came GPT. We got the idea that they might encounter an NPC in the forest, who, through conversation, would convince them that they needed to talk to more people in the town to solve their mystery.
Traditionally, to create an NPC, a Dungeon Master might use an online name generator, such as this one. However, these generators are basically random, dictionary-based name lookups, and nothing more. Moreover, for any non-name-related information about the NPC, the Dungeon Master must be ready to improvise.
GPT, which already generally understood the tone, context, and location of the quest, quickly created a character crafted for our story: Gorgon the Goblin. Without having to sweat, we were given Gorgon’s appearance, physical characteristics, personality, and brief backstory. (Moreover, this created an interesting dimension to the story, as the group had already heard a rumor that goblins were suspected of the robberies.) Ultimately, Gorgon, who we never would have come up with on our own, became a core member of the group and central to the quest.
Side Quest Creation
Unfortunately, our attempt to get the group back on track with Gorgon was unsuccessful. (Like I said, Matthew and I are not very good Dungeon Masters.) The group refused to leave the forest; in fact, after talking to Gorgon, they decided to venture even deeper into it.
We had to improvise even further, this time not only by inserting a random NPC, but by changing the quest itself. Struggling for ideas, we turned to GPT again. We asked it to output a brief description of the scenery surrounding the characters, hoping to latch onto something for inspiration. Again, GPT delivered. It gave a vivid and eloquent description of the forest (so much so that we read it out to the group verbatim). But, more importantly, it hallucinated that an ancient temple was sitting at the edge of the forest.
Matthew and I immediately realized that this temple was our way out of the mess. Rather than keeping the spices hidden where we had originally planned, we decided the temple would make the perfect storehouse for the robbers, and we changed the story accordingly.
This in-game edit got the travelers much closer to solving the quest than they otherwise would have been. The temple, as with Gorgon, never would have come into being without the help of GPT.
On the Fly Descriptions + Integrations
Throughout the campaign, Matthew and I were asked questions that we had no idea how to answer or flesh out in creative ways. In many of these circumstances, we leaned on GPT as a serious crutch to hallucinate an interesting and relevant answer. One of my favorite examples is below. In the middle of the story, the team found out that one of our characters had a secret love, to whom he had written a love letter. In a matter of seconds, GPT produced this beautiful love letter, generated a controversial name for the beloved, Goblinax the Greedy (so that the hero would now be constantly questioned by the group in connection with the goblin robberies), and, as the story progressed, produced opportunities to incorporate the romance into the storyline.
At the end of the campaign, our infatuated hero, Glorifendel, was reunited with Goblinax - making him a very happy elf.
What Could Have Gone Better?
While the NPC and description generation were definitely highlights, two aspects of GPT’s performance could have been improved.
Information Intake
We had initially planned to run the game as normal, with one of us manually inputting a summary to GPT every couple minutes with the progression of the story.
However, what we underestimated was the raw amount of information that is outputted every couple minutes in an active Dungeons and Dragons game. With ten characters - which, admittedly, was a lot - there were many concurrent and active storylines, subtle interactions, tonal shifts, and subplots that we couldn’t possibly summarize for GPT all at once. As a result, we ended up giving it a very cursory summary of every several-minute chunk.
In the future, automating this as much as possible would be key to making GPT more usable as a Dungeon Master companion.
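One way this automation might look is a small buffer that collects short notes as they come in and folds them into a running summary in batches, so nobody has to write a full recap by hand. This is only a sketch of the idea, not something we built: `RollingSummary` is a made-up name, and `summarize` is a placeholder for a model call that takes the summary so far plus the new notes and returns an updated summary.

```python
from typing import Callable, List


class RollingSummary:
    """Buffer short event notes and fold them into one summary in batches."""

    def __init__(self, summarize: Callable[[str, List[str]], str], batch_size: int = 5):
        # `summarize` is a hypothetical stand-in for a model call: it receives
        # the summary so far plus the new notes and returns the updated summary.
        self.summarize = summarize
        self.batch_size = batch_size
        self.summary = ""
        self.pending: List[str] = []

    def add(self, note: str) -> None:
        """Record one note; fold the buffer into the summary once it is full."""
        self.pending.append(note)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Fold any pending notes into the summary and clear the buffer."""
        if self.pending:
            self.summary = self.summarize(self.summary, self.pending)
            self.pending = []
```

Calling `flush()` right before asking for a suggestion would make sure the latest notes are included, even if the batch is not full yet.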
Dialogue
Along with the NPC and scenery generators, we had also built out a prototype through which characters could talk with NPCs directly (via GPT). We thought that this would create an additional immersive element to the game, as well as ease the burden on the Dungeon Master.
What we underestimated was how much context is necessary to generate NPC dialogue that is both accurate and tonally consistent. Due to the problem of information intake described above, our NPCs were missing some key context when conversing with the players, and as a result, said several things that were either inconsistent or inaccurate. (For example, at one point, although Gorgon’s leg had just been broken, he was cheerfully inviting the players to go on further quests with him.)
We believe that, if we solve the problem of information intake, it should dramatically help the dialogue that GPT generates. We will also continue to engineer the best possible prompts for immersive, in-game dialogue.
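As a rough illustration of what "enough context" could mean for dialogue, the sketch below keeps a list of facts that are currently true for each NPC and restates them in every dialogue prompt. The `NPC` class, `dialogue_prompt`, and the prompt wording are assumptions made for this example, not our prototype.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NPC:
    """An NPC plus the facts that must stay true in everything they say."""
    name: str
    persona: str
    facts: List[str] = field(default_factory=list)


def dialogue_prompt(npc: NPC, history: List[str], player_line: str, max_turns: int = 6) -> str:
    """Build a prompt for the NPC's next line, restating their current state."""
    facts = "\n".join(f"- {fact}" for fact in npc.facts) or "- (none recorded)"
    # Only the most recent turns are included to keep the prompt short.
    recent = "\n".join(history[-max_turns:]) or "(start of conversation)"
    return (
        f"You are {npc.name}, {npc.persona}.\n"
        f"Facts that are true right now:\n{facts}\n"
        "Stay consistent with every fact above.\n"
        f"Conversation so far:\n{recent}\n"
        f"Player: {player_line}\n"
        f"{npc.name}:"
    )
```

With something like this, appending "Gorgon's leg was just broken." to his `facts` as soon as it happens would put that detail in front of the model on every later turn, which is exactly the piece of context our prompts were missing.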
What’s Next?
We want to make this even better, and we think there are a few key steps.
First – increasing automation to make the suggestion and context update process even easier.
Second – incorporating rules. Dungeons and Dragons is a heavily rule-based game. We wanted to focus our experiments on the part that is most difficult for most experienced Dungeon Masters, the creativity, but we’re also really excited about incorporating the hundreds of pages of existing rules into models to make it easier for newer players to join and have fun.
Third – player-level suggestions. New players who want to be more creative could incorporate GPT suggestions and watch an even cooler game develop.
We’re planning on throwing another Dungeons and Dragons game next month. Let me or Matthew know if you’re interested! You can reach me on Twitter.
You could go one step further and use an image-generating AI to create art for characters and settings.