<!DOCTYPE html PUBLIC ""
"">
<html><head><meta charset="UTF-8" /><title>Voice acting considered harmful</title><link rel="stylesheet" type="text/css" href="css/default.css" /><link rel="stylesheet" type="text/css" href="css/highlight.css" /><script type="text/javascript" src="js/highlight.min.js"></script><script type="text/javascript" src="js/jquery.min.js"></script><script type="text/javascript" src="js/page_effects.js"></script><script>hljs.initHighlightingOnLoad();</script></head><body><div id="header"><h2>Generated by <a href="https://github.com/weavejester/codox">Codox</a></h2><h1><a href="index.html"><span class="project-title"><span class="project-name">The-great-game</span> <span class="project-version">0.1.2-SNAPSHOT</span></span></a></h1></div><div class="sidebar primary"><h3 class="no-link"><span class="inner">Project</span></h3><ul class="index-link"><li class="depth-1 "><a href="index.html"><div class="inner">Index</div></a></li></ul><h3 class="no-link"><span class="inner">Topics</span></h3><ul><li class="depth-1 "><a href="Baking-the-world.html"><div class="inner"><span>Baking the world</span></div></a></li><li class="depth-1 "><a href="Canonical-dictionary.html"><div class="inner"><span>A Canonical dictionary for this documentation</span></div></a></li><li class="depth-1 "><a href="Dynamic-consequences.html"><div class="inner"><span>On the consequences of a dynamic game environment for storytelling</span></div></a></li><li class="depth-1 "><a href="Game_Play.html"><div class="inner"><span>Game Play</span></div></a></li><li class="depth-1 "><a href="Gossip_scripted_plot_and_Johnny_Silverhand.html"><div class="inner"><span>Gossip, scripted plot, and Johnny Silverhand</span></div></a></li><li class="depth-1 "><a href="Organic_Quests.html"><div class="inner"><span>Organic Quests</span></div></a></li><li class="depth-1 "><a href="Pathmaking.html"><div class="inner"><span>Pathmaking</span></div></a></li><li class="depth-1 "><a href="Populating-a-game-world.html"><div 
class="inner"><span>Populating a game world</span></div></a></li><li class="depth-1 "><a href="Roadmap.html"><div class="inner"><span>Roadmap</span></div></a></li><li class="depth-1 "><a href="Settling-a-game-world.html"><div class="inner"><span>Settling a game world</span></div></a></li><li class="depth-1 "><a href="Simulation-layers.html"><div class="inner"><span>Simulation layers</span></div></a></li><li class="depth-1 "><a href="The-spread-of-knowledge-in-a-large-game-world.html"><div class="inner"><span>The spread of knowledge in a large game world</span></div></a></li><li class="depth-1 "><a href="Uncanny_dialogue.html"><div class="inner"><span>The Uncanny Valley, and dynamically generated dialogue</span></div></a></li><li class="depth-1 current"><a href="Voice-acting-considered-harmful.html"><div class="inner"><span>Voice acting considered harmful</span></div></a></li><li class="depth-1 "><a href="building_on_microworld.html"><div class="inner"><span>Building on Microworld</span></div></a></li><li class="depth-1 "><a href="economy.html"><div class="inner"><span>Game world economy</span></div></a></li><li class="depth-1 "><a href="intro.html"><div class="inner"><span>Introduction to the-great-game</span></div></a></li><li class="depth-1 "><a href="modelling_trading_cost_and_risk.html"><div class="inner"><span>Modelling trading cost and risk</span></div></a></li><li class="depth-1 "><a href="naming-of-characters.html"><div class="inner"><span>Naming of Characters</span></div></a></li><li class="depth-1 "><a href="on-dying.html"><div class="inner"><span>On Dying</span></div></a></li><li class="depth-1 "><a href="sandbox.html"><div class="inner"><span>Sandbox</span></div></a></li><li class="depth-1 "><a href="sexual-dimorphism.html"><div class="inner"><span>Sexual dimorphism</span></div></a></li></ul><h3 class="no-link"><span class="inner">Namespaces</span></h3><ul><li class="depth-1"><div class="no-link"><div class="inner"><span class="tree"><span 
class="top"></span><span class="bottom"></span></span><span>cc</span></div></div></li><li class="depth-2"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>journeyman</span></div></div></li><li class="depth-3"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>the-great-game</span></div></div></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>agent</span></div></div></li><li class="depth-5"><a href="cc.journeyman.the-great-game.agent.agent.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>agent</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>buildings</span></div></div></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.buildings.module.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>module</span></div></a></li><li class="depth-5"><a href="cc.journeyman.the-great-game.buildings.rectangular.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>rectangular</span></div></a></li><li class="depth-4 branch"><a href="cc.journeyman.the-great-game.cloverage.html"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>cloverage</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>gossip</span></div></div></li><li class="depth-5 branch"><a 
href="cc.journeyman.the-great-game.gossip.gossip.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>gossip</span></div></a></li><li class="depth-5"><a href="cc.journeyman.the-great-game.gossip.news-items.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>news-items</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>holdings</span></div></div></li><li class="depth-5"><a href="cc.journeyman.the-great-game.holdings.holding.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>holding</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>location</span></div></div></li><li class="depth-5"><a href="cc.journeyman.the-great-game.location.location.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>location</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree" style="top: -52px;"><span class="top" style="height: 61px;"></span><span class="bottom"></span></span><span>merchants</span></div></div></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.merchants.markets.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>markets</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.merchants.merchant-utils.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>merchant-utils</span></div></a></li><li class="depth-5 branch"><a 
href="cc.journeyman.the-great-game.merchants.merchants.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>merchants</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.merchants.planning.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>planning</span></div></a></li><li class="depth-5"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>strategies</span></div></div></li><li class="depth-6"><a href="cc.journeyman.the-great-game.merchants.strategies.simple.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>simple</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree" style="top: -207px;"><span class="top" style="height: 216px;"></span><span class="bottom"></span></span><span>objects</span></div></div></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.objects.container.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>container</span></div></a></li><li class="depth-5"><a href="cc.journeyman.the-great-game.objects.game-object.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>game-object</span></div></a></li><li class="depth-4 branch"><a href="cc.journeyman.the-great-game.playroom.html"><div class="inner"><span class="tree" style="top: -83px;"><span class="top" style="height: 92px;"></span><span class="bottom"></span></span><span>playroom</span></div></a></li><li class="depth-4 branch"><a href="cc.journeyman.the-great-game.time.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>time</span></div></a></li><li class="depth-4 branch"><a 
href="cc.journeyman.the-great-game.utils.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>utils</span></div></a></li><li class="depth-4"><div class="no-link"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>world</span></div></div></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.world.heightmap.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>heightmap</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.world.location.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>location</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.world.mw.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>mw</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.world.routes.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>routes</span></div></a></li><li class="depth-5 branch"><a href="cc.journeyman.the-great-game.world.run.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>run</span></div></a></li><li class="depth-5"><a href="cc.journeyman.the-great-game.world.world.html"><div class="inner"><span class="tree"><span class="top"></span><span class="bottom"></span></span><span>world</span></div></a></li></ul></div><div class="document" id="content"><div class="doc"><div class="markdown"><h1><a href="#voice-acting-considered-harmful" name="voice-acting-considered-harmful"></a>Voice acting considered harmful</h1>
<h4><a href="#wednesday-25-february-2015" name="wednesday-25-february-2015"></a>Wednesday, 25 February 2015</h4>
<p><img src="https://3.bp.blogspot.com/-ZI90HLjEcuo/VO4f-yXP3sI/AAAAAAAAZt4/C0hQ7hScWyM/s1600/witcher_conversation.jpg" alt="The Witcher: Conversation with Kalkstein" /></p>
<p>A long, long time ago, I can still remember when… we played (and wrote) adventure games where the user typed at a command line, and the system printed back at them. A Read-Eval-Print loop in the classic Lisp sense, and I wrote my adventure games in Lisp. I used the same opportunistic parser whether the developer was building the game ("Create a new room north of here called dungeon-3"), the player was playing the game ("Pick up the rusty sword and go north"), or the player was talking to a non-player character ("Say to the wizard 'can you tell me the way to the castle?'"). Of course, the parser didn't understand English. It worked on trees of words, in which terminal nodes were actions and branching nodes were key words, and it had the property that any word it didn't recognise at that point in the sentence was a noise word and could be ignored. Add a few special hacks (for example, "the", "a", or "an" was an indicator that what came next was probably a noun phrase, and thus that if there was more than one sword in the player's immediate environment, the one that was wanted was the one tagged with the adjective "rusty"), and you ended up with a parser that most of the time convincingly interpreted most of what the player threw at it.</p>
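<p>The word-tree idea described above can be sketched in a few lines. This is an illustrative reconstruction in Python rather than Lisp, and it is not the original parser: the names <code>TREE</code>, <code>parse</code> and <code>noun_phrase</code>, and the tiny vocabulary, are assumptions invented for the example.</p>
<pre><code class="python"># Illustrative sketch only: every name here is hypothetical,
# not taken from the original parser.

ARTICLES = {"the", "a", "an"}


def noun_phrase(words):
    """Treat the words after the first article as adjectives followed by a noun."""
    for i, word in enumerate(words):
        if word in ARTICLES:
            phrase = words[i + 1:]
            if phrase:
                return {"noun": phrase[-1], "adjectives": phrase[:-1]}
    return None


# Branching nodes are dicts keyed by key word; terminal nodes are actions
# (callables) that receive whatever words follow the point where the walk ended.
TREE = {
    "pick": {"up": lambda rest: ("take", noun_phrase(rest))},
    "go": {"north": lambda rest: ("move", "north")},
}


def parse(tree, sentence):
    """Walk the tree word by word, ignoring words not recognised at the current node."""
    node = tree
    words = sentence.lower().split()
    for i, word in enumerate(words):
        if word in node:
            node = node[word]
            if callable(node):
                return node(words[i + 1:])
        # Otherwise the word is noise at this point in the sentence: skip it.
    return None  # No action reached: the "sorry, I don't understand" case.
</code></pre>
<p>With this sketch, <code>parse(TREE, "Please pick up the rusty sword")</code> skips "please" as noise and returns <code>("take", {"noun": "sword", "adjectives": ["rusty"]})</code>, leaving the game to choose the sword tagged "rusty". It makes no attempt at compound commands such as "and go north"; a fuller parser would need to split those first.</p>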
<p>Text adventures fell into desuetude partly because they weren't graphic, but mainly because people didn't find typing natural, or became dissatisfied with the repertoire of their parsers. Trying to find exactly the right combination of tokens to persuade the game to carry out some simple action is not fun, it's just frustrating, and it turned people off. Which is a shame, because just at the time when people were abandoning text adventures we were beginning to have command parsers which were actually pretty good. Mine, I think, were good - you could have a pretty natural conversation with them, and in building mode, when it hit a "sorry, I don't understand" point, it allowed you to input a path of keywords and a Lisp function so that in future it would understand.</p>
<p>So much, so <a href="http://www.csee.umbc.edu/courses/331/papers/eliza.html">Eliza</a>.</p>
<p>Modern role-playing games - the evolutionary successors of those high and far off text adventures - don't have text input. Instead, at each stage in a conversation, the user is offered a choice of three or four canned responses, and can pick one; very often what the player's character actually says then differs from the text the user has chosen, often with differences of nuance which the user feels (s)he didn't intend. And the non-player character's response is similarly canned. Indeed, the vast majority of non-player characters in most games have a repertoire, if one may call it that, of only one sentence. Others will have one shallow conversational tree, addressing one minor quest or plot-point.</p>
<p>If you want to talk to them about anything else - well, you just can't.</p>
<p>Only a very few key non-player characters will have a large repertoire of conversational trees, relevant to all parts of the plot. And even those trees are not deep. You soon exhaust them; the character's ability to simulate real agency just isn't there.</p>
<p>I first wrote about the limiting effects of voice acting in <a href="../../2008/02/the-witcher-story-telling-of-high-order.html">my review of the original Witcher game</a>, back in 2008; things haven't got better.</p>
<h2><a href="#on-phones-speaking" name="on-phones-speaking"></a>On phones: speaking</h2>
<p>In my pocket I carry a phone. It's not big: 127 x 64.9 x 8.6 mm. A small thing.</p>
<p>When I first used Android phones for navigation, I used to delight in their pronunciation of Scots placenames - pronouncing them phonetically, as spelled, and as though their spelling were modern English. What's delightful about Scots placenames is that they are linguistically and orthographically so varied - their components may be Brythonic, Goidelic, Anglian, Norn, French, English, or even Latin; and very frequently they combine elements of more than one language (Benlaw Hill, anyone? Derrywoodwachy?).</p>
<p>Yes, gentle reader, this does seem a long way from game design; be patient, I'm getting there. But I'm going to digress even further first…</p>
<p>There have been orthographic changes, and pronunciation changes consequent on orthographic changes. For example, medieval Scots used the letter <a href="http://en.wikipedia.org/wiki/Yogh">Yogh</a> (ȝ), which isn't present in the English alphabet. So when Edinburgh printers in the early modern period bought type for their printing presses from England, there was no Yogh in the font. So they substituted Zed. So we get names like Dalȝiel, Kirkgunȝeon, Menȝies, Cockenȝie. How do you pronounce them?</p>
<p>The letter that looks like a "z" is pronounced rather like a "y"; so</p>
<ul>
<li>Deeyell</li>
<li>Kirkgunyeon</li>
<li>Mingis</li>
</ul>
<p>and… drumroll…</p>
<ul>
<li>Cockenzie.</li>
</ul>
<p>What happened?</p>
<p>Well, Dalȝiel and Menȝies are personal names, and people are protective of their own names. Kirkgunȝeon is a small, unimportant place, and all the locals know how it is pronounced. Scots folk are, after all, used to Scots orthography and its peculiarities. So those names haven't changed.</p>
<p>But at Cockenȝie, another small, unimportant place, a coal-fired power station was built. The power station was built by people (mostly) from England, who didn't know about Yogh or the peculiarities of Scots orthography - and were possibly too arrogant to care. So they called it Cockenzie. And as there were so many more of them and they had so much higher status than the locals, their name stuck, and nowadays even local people mostly say Cockenzie, as though it were spelled with a Zed. Because, of course, it is spelled with a Zed. Because, as any British schoolchild knows, there's no Yogh in the alphabet.</p>
<p>Except, of course, when there is.</p>
<p>Another more interesting example of the same thing is <a href="http://www.journeyman.cc/placenames/place?id=153">Kirkcudbright</a>. It's a town built around the kirk (church) of Saint Cuthbert. So how does it come to have a "d" in it? And why is it pronounced Kirkoobry? Well, the venerable Cuthbert pronounced his name in a way which would be represented in modern English as Coothbrecht, but he spelled it Cuðbrecht. See that ‘ð’? That's not a "d", it's an Eth. Because Cuðbrecht was Anglian, and the Anglian alphabet had <a href="http://en.wikipedia.org/wiki/Eth">Eth</a>; it's pronounced as a soft "th", and Icelandic still has it (as well as Thorn, þ, a hard "th" sound). Medieval scribes didn't know about Eth, so in copying out ð they wrote the more familiar "d". The local people, however, mostly couldn't read, so the pronunciation of the name didn't change with the change in spelling (although the pronunciation, too, has drifted a little with time).</p>
<p>So, in brief, pronouncing Scots placenames is hard, and there are a lot of curious rules, and consequently it's not surprising that five years ago, listening to Android's pronunciation of Scots placenames was really funny.</p>
<p>But what's really curious is that now it isn't. Now, it rarely makes a mistake. Now, Android can do text to speech on unusual and perverse orthography, and get it right better than 95% of the time - and manage a reasonably natural speaking voice while doing so. On a small, low power machine which fits in my pocket.</p>
<h2><a href="#on-phones-listening" name="on-phones-listening"></a>On phones: listening</h2>
<p>But navigation is not all I can do with my phone. I can also dictate. By which I don't mean I can make a voice recording, play it back later and type what I hear, although, of course, I can. I mean I can dictate, for example, an email, and see it in text on my phone before I send it. It quickly learned my peculiarities of diction, and it now needs very little correction. On a small, low power machine which fits in my pocket.</p>
<h2><a href="#and-breathe" name="and-breathe"></a>And breathe</h2>
<p>Right, so where am I going with all this? Well, we interact with modern computer role playing games through very restricted, entirely scripted dialogues. Why do we do so? Why, on our modern machines with huge amounts of store, do our non-player characters - and worse still, our player character, our own avatar - have such restricted repertoires?</p>
<p>Because they are voice acted. Don't get me wrong, voice acting makes a game far more engaging. But for voice acting to work, the people doing the acting have to know not only the full range of sentences that their character is going to speak, but also roughly how they feel (angry? sad? excited?) when they say it. Ten years ago, voice acting was probably the only way you could have got this immediacy into games, because ten years ago, text-to-speech systems were pretty crude - think of Stephen Hawking's voice synthesiser. But now, Edinburgh University's <a href="http://www.cstr.ed.ac.uk/projects/festival/morevoices.html">open source synthesiser</a> is pretty good, and comes with twenty-four voices (and seeing it's open source, you can of course add your own). Speech to text was probably better ten years ago - think of <a href="http://en.wikipedia.org/wiki/Dragon_NaturallySpeaking">Dragon Naturally Speaking</a> - but it was proprietary software, and used a fair proportion of a machine's horsepower. Now there's (among others) Carnegie Mellon's open source <a href="http://cmusphinx.sourceforge.net/">Sphinx</a> engine, which can quickly adapt to your voice.</p>
<p>So, we have text-to-speech engines which can generate from samples of many different voices, and speech to text engines which can easily be tuned to your particular voice. There's even a program called <a href="http://www.voiceattack.com/">Voice Attack</a>, built on top of Microsoft's proprietary speech to text engine, which already allows you to <a href="https://www.youtube.com/watch?v=8dnJ--pSjdE">control games with speech</a>. Where does that take us?</p>
<p>Well, we already know how to make sophisticated natural language parsers for text, given moderately limited domains - we don't need full natural language comprehension here.</p>
<h2><a href="#you-may-think-its-a-long-way-down-the-road-to-the-chemist" name="you-may-think-its-a-long-way-down-the-road-to-the-chemist"></a>You may think it's a long way down the road to the chemist</h2>
<p>There are things one needs to know in a game world. For example: "I need a sword, where's the nearest swordsmith?" In a real quasi-medieval world, certainly every soldier would be able to tell you, and everyone from the swordsmith's town or village. Very celebrated swordsmiths would be known more widely.</p>
<p>And the thing is, the game engine knows where the nearest swordsmith is. It knows what potion will heal what wound, and what herbs and what tincture to use to make it. It knows which meats are good to eat, and which inns have rooms free. It knows good campsites. It knows where there be dragons. It knows where the treasure is hid. It knows - as far as the game and its plot are concerned - everything.</p>
<p>So to make an in-game Siri - an omniscient companion you could ask anything of - would be easy. Trivial. It also wouldn't add verisimilitude to the game. But to model which non-player characters know what is not that much harder. Local people know what's where in their locality. Merchants know the prices in nearby markets. They, and minstrels, know the game-world's news - major events that affect the plot. Apothecaries, alchemists and witches know the properties of herbs and minerals.</p>
<p>And to model which non-player characters are friendly, and willing to answer your every question; which neutral or busy, and liable to answer tersely; and which actively hostile, and likely, if they answer at all, to deliberately mislead - that's not very much harder.</p>
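<p>As a rough illustration of those two ideas, here is a minimal Python sketch in which the engine holds every fact, each character knows only some topics, and disposition shapes the reply. Everything in it is hypothetical: <code>WORLD_FACTS</code>, <code>Npc</code>, <code>answer</code>, the three dispositions as plain strings, and the canned phrasings are assumptions for the example, not part of any existing game engine.</p>
<pre><code class="python"># Illustrative sketch only: the data model and all names are assumptions.
from dataclasses import dataclass, field

# What the game engine knows: in this toy world, where to find each service.
WORLD_FACTS = {
    "swordsmith": "by the east gate",
    "apothecary": "behind the market square",
    "inn": "on the harbour road",
}


@dataclass
class Npc:
    name: str
    disposition: str  # "friendly", "neutral" or "hostile"
    topics: set = field(default_factory=set)  # the subset of facts this character knows


def answer(npc, topic):
    """Reply to a question about a topic, filtered by knowledge and disposition."""
    if topic not in npc.topics or topic not in WORLD_FACTS:
        return "I couldn't tell you."
    truth = WORLD_FACTS[topic]
    if npc.disposition == "friendly":
        return f"The {topic}? You'll find it {truth}. Safe travels."
    if npc.disposition == "neutral":
        return truth.capitalize() + "."  # terse, but true
    # Hostile: deliberately mislead by pointing at some other place.
    decoys = sorted(place for name, place in WORLD_FACTS.items() if name != topic)
    return f"The {topic}? You'll find it {decoys[0]}."


soldier = Npc("Soldier", "friendly", {"swordsmith"})
bandit = Npc("Bandit", "hostile", {"swordsmith"})
</code></pre>
<p>Here <code>answer(soldier, "swordsmith")</code> gives the true location with some friendly padding, <code>answer(bandit, "swordsmith")</code> sends the player to the market square instead, and <code>answer(soldier, "inn")</code> falls back to "I couldn't tell you." because soldiers, in this toy world, don't know about inns. A real game would generate the wording rather than use fixed templates.</p>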
<p>I'm not arguing that voice acting, and scripted dialogue trees, should be done away with altogether. They still have a use, as cutscenes do, to advance plot. And I'm not suggesting that we use voice to control the player character's movements and actions - I'm not suggesting that we should say "run north; attack the troll with the rusty sword". Keyboards and mice may be awkward ways to control action, but they're better than that. But I am suggesting that one should be able to talk to any (supposedly sentient) character in the game, and have them talk reasonably sensibly back. Just as one can already go off piste physically when wandering an open world, a full voice interaction system would allow one to go off piste in conversation - to leave the limited, constrained pre-scripted interaction of the voice-acted dialogue tree. And that has got to make our worlds, and our interactions with them, richer, more surprising, more engaging.</p>
<p>A hybrid system needn't be hard to achieve, needn't be jarring in use. You can record the phonemes of your voice actor's voice, so that the same character will have roughly the same voice - the same timbre, the same vowel sounds, the same characteristics of pronunciation - whether in a voice acted dialogue or in a generated one.</p>
<p>We don't need to let voice acting limit the repertoires of our characters any more. And we shouldn't.</p></div></div></div></body></html>