Dancing with Myself
AI-generated NPCs currently are extremely powerful, very easy to create, and have almost no use whatsoever. How very 2020s.
The Metaverse. A digital frontier. I tried to picture clusters of information as they moved through the computer. What did they look like? Avatars, poorly rendered? Was the level design seamless? I kept dreaming of a world I thought I'd never see. And then, one day I got in. And then I left again, because it was boring.
— Kevin Flynn, “Tron” (not really)
#pragma startup blog.post.ai.character.design
As gaming technology continues to evolve, so does the types of characters we interact with in our favorite titles. The latest development in this trend is the use of GPT-powered NPCs, which stands for Generative Pre-trained Transformer. GPT-powered NPCs are a type of artificial intelligence that can be used to create more realistic and lifelike characters in games. By taking advantage of the latest advances in machine learning and natural language processing, these NPCs are able to interact with players in a more natural way, allowing them to express emotions, make decisions, and respond to player choi…
(seizes control of the article from the AI)
AI can generate text all day. Interesting text may be another matter. As in blog articles churned out to sell you something or trick ad networks, so too in video game NPCs printed at scale to populate an endless virtual reality full of people trying to sell you something or trick ad networks.
But an actually valid question remains: can AI-generated NPC design be used to, if not completely automate, at least make possible the creation of MMORPG NPCs that can use GPT-style full text interaction to deliver the illusion of in-world characters that respond to the players around them and react to a changing world?
That sounds like an absolutely insane challenge. I should know. I tried to do it 10 years ago. And failed miserably.
Every Villain Needs An Origin Story
In 2013, I was hired by Portalarium to work on Shroud of the Avatar, a Kickstartered RPG that advertised being a spiritual successor of the Ultima games from the 1980s/1990s. For some reason known only to God, I was almost immediately handed responsibility for the NPC conversation system (I think I made the mistake of saying “I have some ideas for this” within earshot of other people) and that became my life for the next 5 years or so. But first, since this is the era where everything is horrible, a necessary disclaimer!
Disclaimer: Since I signed a fairly restrictive NDA as part of leaving Portalarium, I really can’t go into detail about the many, MANY things that angered you, personally, about the project. The only reason I’m writing this much is because it’s arguable how enforceable an NDA is from a company that no longer exists, but I’d still rather not get into it, whatever value of “it” may be. Thanks, and keep being really, really mad at me personally over my part in making a video game 10 years ago!
Richard Garriott (aka Lord British) had some very specific ideas about how NPCs should react. Specifically, they should be functionally identical to other players - you click on them, a window opens up, and you talk to them, in a natural full text interface, and they respond, exactly as if you were talking to another player. They then ask you to do something, and you may or may not agree to do it, and at some point in the future you may or may not do that thing, and the NPC will recognize that whenever you see them again and tell you “Nice job”.
Remember, as you read this, I agreed to implement this. Just in case you were at all in any doubt about my sanity.
As development progressed, and the first iteration of this system found its way into the hands of the eagerly awaiting players about six months later, some things became very, very obvious.
- This wasn’t going to work.
- Creating an NPC with full-text responses was very time intensive. The less time spent, the less convincing the NPC was.
- This really wasn’t going to work, at all.
- The Kickstarter promised that the shipped game, including all NPC dialog, would be localized in most major European languages. You know, just in case this task wasn’t already completely off the rails.
- it’s not working
- The first “scene”, or area in the game, Owl’s Head, was developed in about 6 months. Two months of this was spent scripting the first demo NPC. The second scene, Kingsport, was developed in about 2 months. Further scenes would follow on a monthly cadence, sometimes 2 or 3 a month. Each scene would have anywhere from 25 to 50 NPCs.
- help
- Conversations had to, as part of their structure, embed a “quest-like” structure. I refer to it as this because for the first years of development, Garriott was vehement that the game would have no quests whatsoever. Players, on the other hand, understandably expected an RPG to have, you know, a quest. Or seven. Or more. Or a lot more. I seriously, and continuously, referred to the system internally as “not-a-quest”.
- make it stop
- The less time devoted to each NPC, the harder it was for players to figure out how the hell to make them work. Without a roadmap of the designer’s mind, it was impossible to tell if an NPC wasn’t responding to the next step in a not-a-quest because a flag was missed somewhere in the NPC’s state system, the player didn’t perfectly guess what keyword would trigger that step of responding to a not-a-quest, or if some other unrelated bug, say, loaded the fragment of an NPC state from a nearby other player. As development continued, this got constantly worse and compounded upon itself.
- i seriously cannot drink enough alcohol
- By the time the game was declared content-complete in 2018, everything had combined into an unholy mess that absolutely no one could understand, including me and I wrote the damnable thing. The reviews were predictably scathing.
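For the morbidly curious, here is a minimal sketch of the failure mode described above: a keyword-matching NPC whose not-a-quest steps are gated on flags. This is emphatically not the Shroud of the Avatar code. The `KeywordNPC` class, its rule format, and the guard's dialog are all hypothetical, invented purely for illustration. The thing to notice is that a wrong keyword and a missing flag produce exactly the same shrug.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordNPC:
    """Hypothetical keyword-matching NPC, for illustration only."""
    name: str
    # Each rule: (keyword, flag required before it fires, flag it sets, reply)
    rules: list = field(default_factory=list)
    flags: set = field(default_factory=set)

    def respond(self, player_text: str) -> str:
        cleaned = player_text.lower().replace("?", "").replace(".", "")
        words = set(cleaned.split())
        for keyword, required, sets, reply in self.rules:
            if keyword in words and (required is None or required in self.flags):
                if sets:
                    self.flags.add(sets)
                return reply
        # Same fallback whether the keyword was wrong or a flag was missing,
        # so the player cannot tell which of the two went wrong.
        return "I don't know anything about that."


guard = KeywordNPC(
    name="Guard",
    rules=[
        ("bandits", None, "knows_bandits",
         "Bandits took the east road. Bring me their banner."),
        ("banner", "knows_bandits", "quest_done", "Nice job."),
    ],
)

print(guard.respond("I found a banner"))           # flag not set yet
print(guard.respond("Any trouble with thieves?"))  # wrong keyword
print(guard.respond("Tell me about the bandits"))  # sets the flag
print(guard.respond("Here is the banner"))         # now it works
```

The first two calls return the identical fallback line for two unrelated reasons. Multiply that by thousands of NPCs and you have a fair picture of the support forum.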
Shortly after release, I had started drawing up plans to convert all the dialog in the game to a more standard RPG dialog tree system so that the game would be actually, you know, playable. Like every other task with this game, this would have been herculean to the point of impossibility given that there were already thousands of NPCs implemented (for better or worse) and at any rate thanks to low to nonexistent sales almost the entire content team, including me, was laid off very shortly thereafter.
What did we learn from this, class?
- Never, ever volunteer.
- Never believe me when I tell you I can do some impossible weird thing.
- Never work on a Kickstarter project, ever. Trust me on this. Really.
- Oh, you mean NPC dialog. Uh, don’t reinvent the wheel, don’t let scope go wildly off the rails, remember that whatever you do you’re going to have to do hundreds of times if not thousands, and Keep It Simple, Stupid.
But wait! AI is here to save the day!
Ironically, thanks to the Metaverse, another project with unclear goals and an impossible scope, all of these problems may have been solved, thanks to GPT, the god of AI text memes. Several startups (with financial backing 10x or more beyond the budget of my last project) have appeared in the past year to put the promise of AI generated NPCs in your hot, grubby developer paws. The furthest along seems to be Inworld.ai which has APIs for Unreal and Unity available already and quite a bit of buzz.
Inworld uses advanced AI to build generative characters whose personalities, thoughts, memories and behaviors are designed to mimic the deeply social nature of human interaction. This area of the tech industry has exploded this year with generative AI art projects that can churn out painted portraits of people or chat programs such as ChatGPT.
“I’m dead focused on how we build this into a new form of expression, a new way of creation, with these characters,” said Kylan Gibbs, chief product officer and cofounder of Inworld AI, in an interview with GamesBeat. “How do we create a completely new medium for interaction? That’s what I’m excited about.”
I put it through its paces and here’s what I found.
- Interaction with a full text interface is seamless and convincing, courtesy of GPT magic. It really works. It made me cry. Trauma does that. If I had access to this technology 10 years ago, I would be 4% less bitter than I am today.
- Because GPT requires computing power well over what’s currently available on consumer PCs, it requires a cloud computing component. Which means, if you implement this in your game, every line of text a user enters, on every game you ship, will cost a few cents. Now even your single player RPG can have live user support costs!
- There’s very little interaction possible between NPCs (inworld.ai is working actively on this, it seems, so this may change rapidly). Mostly, it’s one-on-one interaction between the player and one NPC.
- There’s no way to have the NPC acquire knowledge of the game world on the fly. You can script the NPC to have knowledge of various things manually, but simply implementing a quest using this system appears to be impossible. Further, there’s no state, or memory, so every interaction is a one-off.
- As is a “feature” of GPT in general — you can’t control what it says, you can only apply guideposts and hope the user doesn’t figure out how to turn your wizard into Heinrich Himmler.
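To put the "few cents per line" point in perspective, here is a back-of-envelope sketch. Every number in it is an assumption made up for illustration (player count, chattiness, and a 2-cent price per line), not a figure from Inworld or anyone else's price sheet. The `monthly_chat_cost_cents` helper is likewise hypothetical.

```python
def monthly_chat_cost_cents(players: int,
                            lines_per_player_per_day: int,
                            cost_per_line_cents: int,
                            days: int = 30) -> int:
    """Total cloud cost in cents for one month of NPC chatter.

    All inputs are illustrative assumptions, not real pricing.
    Working in whole cents keeps the arithmetic exact.
    """
    return players * lines_per_player_per_day * cost_per_line_cents * days


# Assumed: 10,000 active players, 20 lines each per day, 2 cents per line.
cents = monthly_chat_cost_cents(players=10_000,
                                lines_per_player_per_day=20,
                                cost_per_line_cents=2)
print(f"${cents / 100:,.2f} per month")  # $120,000.00 per month
```

Plug in your own guesses. The shape of the problem does not change: the bill scales with how much your players like talking, which is an odd thing to hope they do less of.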
Essentially, at their core, these are still chatbots. Extremely convincing, well implemented chatbots. If you’d like to experiment, here’s my take at one: Lum the Mad, the character from TSR’s Greyhawk and the namesake for my Ultima Online character (and from that, my blogging persona for the past 20 years and dear god I’m old, no wonder my back hurts). You can talk to the NPC via a web interface, I assume in multiple languages, and it will be suitably evil, unless you manage to get into a discussion about Frasier reruns or something. To its credit, it is pretty difficult to knock off track.
So, essentially, it’s a chatbot. A very well implemented chatbot, feature rich and with lots of relevant text derived from about 15 minutes of guidance. But it’s a chatbot, not an NPC. It can’t be relied on for anything gameplay related, just a diversion you dance around with while bored. (I implemented a far less functional chatbot in Shroud as an experiment early in its development. I think a few NPCs in Owl’s Head still have that functionality active, so you can have long meandering conversations about nothing whatsoever and get amusing nonsensical screenshots to show your friends.)
If you need to put 50 chatbots in your game quickly (and don’t mind that they are essentially islands unto themselves that don’t affect the world around them and that any time someone talks to one you’ll be billed), this level of technology is perfect. If you were looking for AI-driven NPCs that could be used for quest support or plot development, to quote Lum the Mad, we have been seeking revenge for this transgression for many years, but it remains elusive.
OK, Smartypants, so what’s next?
The thing is, we are actually tantalizingly close to AI-supported NPC generation that’s useful in a gaming context. Here’s what needs to happen:
- The ability to build a text model for each and every NPC and have them resident on a client’s PC (ideal) or a self-contained affordable cloud server (less so). This is essentially a function of Moore’s Law at this point - right now, those models require a level of computing support beyond the reach of almost all consumer-level PCs, but that will inevitably change.
- API support for a gaming framework that gives access to a state machine (aka NPC memory) and also allows for less technical users to create and tweak in-game NPCs. Unity has some promising candidates with ORK Framework and Quest Machine - adding AI-enriched dialogue support to these tools would be a godsend for independent RPG developers.
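As a rough sketch of what that second item might look like: the game owns a small state machine and a conversation history (the NPC's "memory"), and the text model is only asked to phrase a reply given that state. Everything here is hypothetical. `StatefulNPC`, `QuestState`, the event names, and `fake_generate` are invented for illustration and are not the API of Inworld, ORK Framework, or Quest Machine. `fake_generate` is a stand-in where a real model call would go.

```python
from enum import Enum, auto
from typing import Callable


class QuestState(Enum):
    NOT_OFFERED = auto()
    OFFERED = auto()
    ACCEPTED = auto()
    COMPLETED = auto()


# Allowed transitions, keyed by (current state, game event).
# The game engine fires events; the text model never changes state itself.
TRANSITIONS = {
    (QuestState.NOT_OFFERED, "player_greets"): QuestState.OFFERED,
    (QuestState.OFFERED, "player_accepts"): QuestState.ACCEPTED,
    (QuestState.ACCEPTED, "item_delivered"): QuestState.COMPLETED,
}


class StatefulNPC:
    def __init__(self, name: str, persona: str,
                 generate: Callable[[str], str]):
        self.name = name
        self.persona = persona
        self.generate = generate      # hypothetical text-model callback
        self.state = QuestState.NOT_OFFERED
        self.history = []             # (player_text, reply) pairs

    def handle_event(self, event: str) -> None:
        # Unknown or out-of-order events leave the state untouched.
        self.state = TRANSITIONS.get((self.state, event), self.state)

    def build_prompt(self, player_text: str) -> str:
        recent = "\n".join(f"Player: {p}\n{self.name}: {r}"
                           for p, r in self.history[-3:])
        return (f"You are {self.name}. {self.persona}\n"
                f"Quest state: {self.state.name}\n"
                f"{recent}\nPlayer: {player_text}\n{self.name}:")

    def talk(self, player_text: str) -> str:
        reply = self.generate(self.build_prompt(player_text))
        self.history.append((player_text, reply))
        return reply


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; just reports the prompt size."""
    return f"[reply based on {len(prompt)} characters of prompt]"


npc = StatefulNPC("Lum", "A suitably evil wizard.", fake_generate)
npc.handle_event("player_greets")
npc.handle_event("player_accepts")
npc.handle_event("item_delivered")
print(npc.state.name)
print(npc.talk("I brought what you asked for."))
```

The design choice worth noting is that quest progress lives in plain game code, where it can be saved, debugged, and localized, while the model only supplies the wording. Whether a model reliably honors the state it is handed is a separate question this sketch does not answer.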
It’s easy to see both of these happening within a year or two. And at that point, you really will talk to NPCs in your game of choice with a free-form text interface, and when not giving you the next quest they’ll probably try to sell you bitcoins, because everything continues to be horrible.