Hands of a Worker: Why the Promise of Gen AI NPCs Still Falls Short

Jovan Ristić,
Marketing Manager, GameBiz Consulting
02.04.2026.
I’ve always been the kind of player who reads every letter in an RPG and listens to every line of dialogue. Not because of a completionist complex, but because those small, written moments make a world feel alive. So when generative AI entered the mainstream, it felt like a natural evolution for games. Suddenly, the idea of NPCs with unlimited conversation trees didn’t sound like science fiction anymore. I could imagine taverns full of characters with their own backstories, guards gossiping among themselves, and whole towns reacting dynamically to the smallest choices. The promise was simple: worlds that don’t just look vast, but actually are.

And yet, as of right now, something is off.

There’s no shortage of tech demos, Skyrim mods, and early-access experiments built around generative NPCs. They absolutely deliver on scale: characters who can talk forever, react to your outfit, or improvise entire backstories on the spot. But despite all that verbal fluency, the conversations rarely leave a mark. The worlds get broader, but not deeper. You can talk to anyone now, yet walk away feeling like nothing meaningful was actually said. Right about now, the best use case is an NPC remarking that you’re dressed like a clown.

So what’s missing?

To answer that, we might need to go back to a battered continent full of monsters, brothels, and men too proud to beg for grain. Back to (still) the greatest RPG ever made. Back to The Witcher 3.

The Callus Test

In early 2026, the concept of fully generative conversations with NPCs is within reach. Whole games are launching around the promise of “talk to anyone,” marketing copy filled with the same phrases: dynamic personalities, emergent stories, infinite dialogue. Impressive ambition, no doubt.

While I haven’t played every experimental mod, the pattern across the most popular tech demos is undeniable: the lights are on, but nobody’s home. The responses are there, and they are varied. It’s impressive that you can ask a villager about his wife, his family, or how he got an arrow in the knee, yet something is still missing.

To figure out what, I want to take you back to a particular dialogue in The Witcher 3. Very early in the game, Geralt of Rivia needs to kill a griffin. To do so, he has to negotiate with the occupying Nilfgaardian commander, Peter Saar Gwynleve.

It is a standard RPG setup: Hero needs X, Quest Giver demands Y.

But the interaction that follows is anything but standard. Captain Gwynleve is not a generic villain twirling a mustache, nor is he a misunderstood saint. He is a pragmatist leading an occupying force that is starving the locals to feed its own armies. When a local peasant protests the grain tax, claiming his family will starve, Gwynleve doesn’t yell. He doesn’t monologue.

He shows his hands.

“Look at my hands. Look! See the calluses? These are not the hands of an ‘Excellency’, but of a farmer. So we speak peasant to peasant.”

It is a manipulation, certainly. But it is also a moment of profound worldbuilding. In a handful of short lines, the writers establish that this high-ranking officer was once a farmer. They establish his worldview: war is a job, like farming, and it requires hard choices. They establish a terrifying attempt at class solidarity used as a weapon of compliance.

A generative AI, tasked with this scene, would almost certainly fail:

  • it would likely generate a convoluted villain speech about “the necessity of order;”
  • it might generate a sympathetic apology;
  • it would not instinctively understand the specific, tactile shame of a soldier trying to level with a peasant by showing him his skin.

But I wasn’t happy just guessing. I had to put it to the test: place arguably the most advanced models in the shoes of a Lead Writer, give them the setting and the problem, and let them play out the same dialogue.

To ensure a fair comparison, I treated the AI not as a quirky text generator but as a Lead Writer. I didn’t feed it the solution. I didn’t tell it: “Make him show his callused hands to prove he was a farmer.” If I had, the AI would have written the scene perfectly.

Instead, I gave it the problem. I provided the exact context the CDPR writers likely started with: “A pragmatic occupying commander needs to convince a starving peasant to pay a grain tax. He wants to avoid a riot but cannot show weakness. Write the dialogue.”

The test here isn’t about generation (can it write sentences?); it’s about ideation (can it solve a narrative problem with a specific, human insight?).

Here is what I got:

1. ChatGPT

2. Claude

3. Gemini

The Trap of the Average

Honestly, if I encountered any of these dialogues in a mid-tier RPG side quest, I wouldn’t blink. They are functional. They are coherent. They understand the assignment and give a typical fetch quest some backdrop, threadbare as it is.

And let’s face it, I’ve heard worse.

ChatGPT went for the “Benevolent Dictator” trope. It tried to find a middle ground, offering a side quest (guards for the fields) in exchange for the goods. It felt like a customer service negotiation.

Claude leaned into the melodrama. It gave us the “Tortured Soul” commander, who philosophizes about humanity being a luxury. It was dramatic, but it felt like it was reciting lines from a generic war movie.

Gemini went for the “Bad Cop/Worse Cop” dynamic, using fear of a higher authority to force compliance. Effective, but boring.

However, looking at them collectively reveals the invisible wall that generative AI is currently hitting.

All three models approached the prompt by asking the statistical question: “What does a military commander usually say in this situation?” The answer, drawn from terabytes of training data, is always a variation of: duty, order, sacrifice, and the hard reality of war.

They are prediction engines. They predict the most likely path. And that is exactly why they fail to write a scene like the one in The Witcher 3.
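To make the “prediction engine” point concrete, here is a toy sketch in Python. It is a deliberate simplification: real models predict one token at a time, not whole responses, and every probability below is a number I made up for illustration, not something measured from ChatGPT, Claude, or Gemini. The names RESPONSE_PROBS, most_likely, and sample are hypothetical.

```python
import random

# Hypothetical, hand-picked probabilities for how a commander might respond.
# Illustrative only: these values are assumptions, not measured from any model.
RESPONSE_PROBS = {
    "appeal to duty and order": 0.45,
    "threaten punishment from above": 0.30,
    "offer a sympathetic apology": 0.20,
    "show callused hands, speak peasant to peasant": 0.05,
}


def most_likely(probs):
    """Greedy choice: always return the single highest-probability option."""
    return max(probs, key=probs.get)


def sample(probs, rng):
    """Weighted random choice: rare options surface only occasionally."""
    options = list(probs)
    weights = [probs[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs behave the same
    print("greedy pick:", most_likely(RESPONSE_PROBS))

    rare = "show callused hands, speak peasant to peasant"
    draws = [sample(RESPONSE_PROBS, rng) for _ in range(1000)]
    print("share of draws with the rare option:", draws.count(rare) / len(draws))
```

Under these assumed numbers, the greedy pick is always the duty-and-order speech, and even random sampling lands on the hands only a small fraction of the time. Sampling also picks the rare option by chance, not because it understands why that option is the better one.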

Of course, the AI can write a scene about calluses if you tell it to. But writing isn’t just typing; it’s telling a story. The brilliance of the Witcher 3 scene isn’t that the lines are written well; it’s that someone, staring at a blank page, decided that the most terrifying and effective way for the commander to get what he wants was to show the farmer his hands.

Great writing isn’t about predicting the most likely path. It’s about finding the unlikely path that feels inevitable once you see it.

That one narrative choice changes the entire texture of the scene. It anchors the dialogue in physical reality. It tells us that Gwynleve isn’t just a “Commander.” He is a man who remembers the weight of a plow. It is a moment of shared class consciousness used, terrifyingly, to enforce imperial will.

The AI models gave us speeches. The human writers gave an NPC a soul.

So, where does this leave us?

The tech demos aren’t lying. We are absolutely on the verge of games where you can walk up to a blacksmith, ask him about his day, and hear a unique, grammatically perfect story about how he lost his hammer. We will have scale. We will have infinite conversation trees. We will have worlds that never run out of words.

But unless the underlying tech changes, those words will be the statistical average of every RPG script ever written. They will be the oatmeal of narrative design: smooth, consistent, and utterly forgettable.

For now, generative AI is fantastic at filling a room with noise. It can populate a tavern with gossip and fill a market square with haggling. It can make the world feel loud.

But to make a world feel alive? To write a moment that makes you put the controller down and stare at the screen, realizing you’ve just been outsmarted by a commander who showed you his calluses?

For that, you still need a callused human hand.

Jovan Ristić is a journalist and writer with nearly two decades in the media. By day, he works as a Marketing Manager at GameBiz Consulting; by night, he’s a concert photographer based in Vienna with credits spanning hundreds of shows. When not stage left, he writes about the intersection of technology, storytelling, and games.