I tried Nvidia’s AI NPC tech and showed it to a game dev
Nvidia is one of the world’s biggest AI advocates, and its new ACE AI NPC technology could offer a glimpse into the future of your favorite AAA gaming titles.
Nvidia is one of the leaders in the AI space: its chips power OpenAI’s ChatGPT, and it has quietly been iterating on and integrating new AI features like DLSS 3 into its graphics cards and gaming laptops.
As part of this push, the company has teamed up with Inworld on a demo, first revealed at GDC 2024, that shows how in-game NPCs powered by generative AI tools and speech could function in a AAA game of the future. Nvidia has already worked with companies like Ubisoft to deliver similar tech demos.
Named Covert Protocol, the demo is a Hitman-meets-Cyberpunk hybrid in which your character is tasked with obtaining the room number of a key figure named Martin Laine. To do this, you have to coax that information out of one of the demo’s three NPCs by gathering contextual clues and applying a little social engineering.
Talking to Nvidia’s AI NPCs
First, we encounter Tae, a bellboy who remains courteous but is clearly bored with his job. Quizzing Tae reveals that the hotel is understaffed, causing check-in delays for guests. While this works for the mission, the text-to-speech model Nvidia uses for its NPCs can sometimes come across as stilted, somewhat breaking the illusion that you’re talking to a real character.
Attempting to push the boundaries, I quizzed Tae about his life and aspirations, which, according to him, amount to becoming a bartender who mixes his own signature drinks. Asking an AI for a killer cocktail recommendation could have been neat. However, no matter how much I questioned Tae, he never gave up any details about the ingredients he uses to make them.
It’s possible that Tae’s initial biography, written in the Inworld character creator, didn’t include any details about his aspirations beyond wanting to be a bartender. The limitations of the tech were on show here: while Tae is dynamic, he steers the conversation away from topics he isn’t “allowed” to talk about.
I then played a small trick on the AI and suggested that he post his cocktail-making sessions on TikTok, something from our real world, which he acknowledged. This betrays the futuristic Cyberpunk feel of the demo and is likely something developers could tweak in Inworld’s character generation sheets.
When quizzed on who owns TikTok, however, Tae dodged the question entirely, showing that guardrails are in place to stop the AI from engaging with queries that could be deemed political.
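Neither Nvidia nor Inworld spells out how these guardrails are wired up in the demo, but the behavior suggests each character carries an authored profile: a biography that bounds what they know, plus a set of off-limits topics. Purely as an illustration, here’s a minimal sketch of that idea in Python; every field name and function below is a hypothetical stand-in, not Inworld’s or ACE’s actual API.

```python
# Hypothetical sketch of a per-character profile plus a topic guardrail.
# All names below are illustrative assumptions, not Inworld's real API.

CHARACTER_SHEET = {
    "name": "Tae",
    "role": "hotel bellboy",
    "mood": "courteous but bored",
    "biography": "Dreams of quitting to become a bartender with signature drinks.",
    # Topics the writer wants deflected rather than answered.
    "blocked_keywords": ["who owns", "election", "politics"],
}

DEFLECTION = "Sorry, that's not really something I can get into. Anything else?"

def npc_reply(player_line: str, sheet: dict) -> str:
    """Deflect guarded topics; otherwise hand the line to the dialogue model."""
    lowered = player_line.lower()
    if any(keyword in lowered for keyword in sheet["blocked_keywords"]):
        return DEFLECTION
    # A real pipeline would prompt a generative model with the biography
    # as grounding context; that call is stubbed out here.
    return f"({sheet['name']} improvises an answer in a {sheet['mood']} tone.)"

print(npc_reply("So, who owns TikTok these days?", CHARACTER_SHEET))
# -> Sorry, that's not really something I can get into. Anything else?
```

The shape of the problem is the point here: anything not written into the sheet, like Tae’s cocktail recipes, simply isn’t there for the model to draw on.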
There’s no denying that what’s on show here is impressive. However, there’s an overall artificial feeling in the line delivery that the demo never shakes, especially with certain NPCs. One such instance is Sophia, the stern front-of-house representative at the hotel. Sophia’s rigid responses might suit her role in the tech demo, but the AI-generated line delivery needs significant work to be believable.
Sweet little lies
To test how much your voice inputs matter, I spoke to Diego, a hotshot tech entrepreneur frustrated that his hotel room wasn’t ready, and tried to push the boundaries of the tech.
To see just how far I could go with the NPC, I told him that his room wasn’t ready because someone had left a stinky surprise while cleaning it. It took some convincing. I also told him that the conference he was attending at the hotel had been canceled due to an incident of potent flatulence at the venue.
His responses were, by default, standoffish and guarded, demanding to know who had told me all this. I pointed to Tae the bellboy and front-of-house Sophia. He quickly got up to talk to Sophia and, while discussing what they were going to do next, blurted out his partner’s room number. Mission accomplished.
While this was a humorous way of completing the demo’s objective in the moment, it’s also entirely unbelievable. Would you believe a total stranger offering wild explanations about your hotel room not being ready, and a conference you’re supposed to attend being canceled over a fart? No.
No emotional stakes
One recurring theme throughout my experience testing Nvidia’s AI NPCs is that each one I talked to seemed locked into a single emotional state, no matter how incendiary I got with them. I questioned Tae’s life aspiration of being a bartender quite harshly, and he responded in the same tone as always: tired and pretty nonchalant. Diego did exclaim that my tall tales were ridiculous, but he never responded in an emotive manner.
When I asked an Nvidia representative running the demo about this lack of emotional range, they responded that it comes down to how game developers choose to deploy the tech, and that this was just a simple example of ACE’s possibilities. But emotion is at the core of many modern gaming experiences, and without that facet, a fictionalized world could feel pretty lifeless.
Ex-AAA game developer weighs in on Nvidia ACE demo
I spoke to Oliver Clarke-Smith, director of the narrative-driven Paradise Killer and the upcoming Promise Mascot Agency, and a designer on the Sony hit Until Dawn, about what he thinks of Nvidia’s AI NPC tech. I shared the Covert Protocol demo footage, as well as my experiences with it.
“I think the technology solves a content production problem but doesn’t fulfill the need to have quality content,” Clarke-Smith said, adding that many of the most beloved games share common writing quirks or charms, which the Nvidia ACE demo “totally lacks”.
The Kaizen Game Works founder continued: “The Covert Protocol demo is full of mismatched delivery and facial animation. This can be improved of course, but there will always be something that happens that will break the illusion,” noting Tae’s odd intonation in the footage I had shared.
Clarke-Smith drew a comparison to Watch Dogs: Legion, which also used a form of AI to generate NPCs your character could control.
“Expanding [AI-generated content] out to entire AI-generated conversations is going to remove what little seasoning there is in the AAA porridge.”
Driving down budgets, but at what cost?
“When I worked on Until Dawn, it became very apparent that the cinematic interactive drama genre has a shelf life dictated by increased asset cost,” Clarke-Smith said, before detailing how carefully every facet of a narrative-driven game is considered.
“Every second of mocap has a cost. Speccing out every branch and scene has a cost. You only get Hollywood actors for a certain amount of time. The mocap clean-up and hooking up in-game have a cost. This all limits the amount of branching you can do, and limits the choices the player can make.”
But with AI tools like Nvidia ACE, that could become a thing of the past. Clarke-Smith stated that implementing the possibilities AI could introduce into a game like Until Dawn as hand-authored content would be “unthinkable” because “the cost and logistics [of the game’s development] would be enormous.”
Having worked on a mega-budget game like Until Dawn, he further noted that traditionally created content should be valued: “Until Dawn was loved because it was hand-authored. The campy dialogue, the sometimes weird editing, the sometimes not-quite-right facial expressions; these all made Until Dawn charming.” He acknowledged the role AI could play in such games, but said it’s currently missing the special sauce that makes great art: humanity.
“An AI-driven interactive drama would give you amazing possibilities which is an intoxicating idea – but it’d have none of the charm, so what’s the point?”
Clarke-Smith concluded: “The ability to create endless amounts of content is not the solution to the insane cost of high-end games production” and “creating lots of bland content isn’t going to suddenly increase the market”.