Who Speaks Here? Ourselves or Our Machines? The anxieties of generative AI’s influence

(Photo by Maria Teneva via Unsplash)

Nearly every college writing textbook begins by reassuring incoming first-years that they already know how to write, usually in an embarrassing, cross-generational, trying-to-be-cool way: You write persuasive arguments when you create your Tinder profile or when you tell the group chat how much you love the latest Marvel film. As I begin another semester of teaching writing, though, I wonder if that opening gambit, designed to build confidence in developing writers, is as true as it once was. The writing we all do each day is increasingly filtered, interrupted, and conditioned by a host of technologies that did not exist ten years ago. Some of these intercessions are minor, offering shortcuts to speed communication or corrections to improve it. The newest text-based generative AI technologies do more, writing surprisingly competent prose with minimal prompting.

These new technologies have captured our cultural conversation since ChatGPT seemed to come out of nowhere about a year ago. Writers have wondered about copyright infringement when companies use their work to train generative AI. Teachers have struggled to redesign their courses now that they have no foolproof way to tell the difference between a human writer and a computer in submitted work. Economists have seemed equally excited and panicked by the number of jobs these tools might one day swallow. These concerns are important, but they treat writing and speech as though they were identical to the tools that we use to share and record them: a technology that has now been surpassed by another, better tool, just as the keyboard replaced the ballpoint pen. But generative AI’s incorporation into the things we read and write every day is larger than an intellectual property, education, or employment problem. Writing and language are as much a mechanism for thinking as they are for communicating those thoughts. As these technologies become responsible for more of the language we encounter each day, they will reshape our language and condition the way we conceive of the world around us. They already have.

Like so many seemingly new problems, the challenge we face with generative AI is a variation of an older one. Four years before Alan Turing argued that a machine could be considered intelligent when it could convincingly imitate a human speaker—what we now call the Turing Test—George Orwell argued that imitation was damaging the English language and people’s ability to think using it. In his 1946 essay, “Politics and the English Language,” Orwell argues that a decline in the quality of written and spoken English is attributable to a set of “bad habits which spread by imitation.” Writers and speakers, he explains, have outsourced the construction of their sentences to words, phrases, and images that are already familiar to them. “[M]odern writing at its worst,” he writes, “consists in gumming together long strips of words which have already been set in order by someone else, and making the results presentable by sheer humbug.” Though appealingly easy to use, these ready-made phrases obscure or even alter their authors’ meaning, hiding it even from the authors themselves.1

More than seventy years later, computer scientists and engineers attempting to get a computer to speak like a human have adopted Orwell’s bad habits as best practices. Generative AI programs create natural-sounding language with algorithms that identify the word most likely to appear next in comprehensible prose, place it there, and repeat the process. Using massive datasets of text from digital libraries and the public web, these systems rely on trial and error to set and refine the parameters that guide text generation. Eventually, after months or more, the system settles on an algorithm that can generate intelligible prose. These algorithms are massive, especially if they are any good. The latest generative AI systems are quickly approaching two trillion parameters. Though quantifiable, these parameters are developed largely inside the system and are extremely difficult to review comprehensively, except by assessing the quality of the system’s output. When those outputs are intelligible, the generative AI, you might say, has finally learned to gum together familiar words through a process that, to the user and the engineers alike, is “sheer humbug.”

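The loop itself is simple enough to sketch. Below is a deliberately tiny toy in Python, with an invented corpus and word choices of my own, that counts which words tend to follow which and then generates text by repeatedly picking a likely next word. It illustrates only the “place it there, and repeat the process” procedure; real generative AI systems replace these simple counts with neural networks holding billions or trillions of learned parameters.

```python
# A toy sketch of the "predict the next word, append it, repeat" loop.
# The corpus below is invented for illustration; real systems learn from
# massive datasets rather than a single sentence.
from collections import Counter, defaultdict
import random

corpus = (
    "the model writes the next word and then the model writes "
    "the next sentence and then the model repeats the process"
).split()

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def generate(seed_word, length=10):
    """Repeatedly pick a likely next word and append it."""
    words = [seed_word]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:
            break  # no known continuation for this word; stop
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

# One possible output: "the model writes the next word and then the model repeats"
print(generate("the"))
```

Scaled up by many orders of magnitude, and with the counting replaced by those learned parameters, the basic shape remains the same: predict a likely next word, append it, repeat.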

Understanding the process, however, is not necessary for marveling at the output, as any non-expert who has entered a prompt into a generative AI system already knows. When I first experimented with ChatGPT, I entered an assignment from one of my first-year writing classes to see what it would spit out. The resulting essay was not perfect, but it was a text that seemed to “respond” to the prompt. When I asked it to analyze a short poem, it demonstrated admirable grammar and sentence structure but little insight. When I went to talk about it with friends, I found myself struggling to describe the experience. I could not settle on nouns and verbs that captured what had happened. I was not really sure what had happened. Did it analyze a poem poorly, or did it output text that looked like a poor analysis of a poem? Did it respond to an assignment prompt, or did it generate something that seemed to me like a response to the assignment?

In their now widely read paper on the potential dangers of large language models and natural language processing, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?,” Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell offered an answer that I found and still find unsatisfying. Their paper argues, quite forcefully, that the texts created by these technologies do not contain meaning and should not be read as though they do, no matter how automatically our human reading brains want to find it there. The conditions for meaning do not exist. Probability, not intention, guides the words which appear on the screen. I should not say ChatGPT analyzed the poem poorly. Instead, because these tools are “haphazardly stitching together linguistic forms” (617), I should say something like “I interpreted the text to be a poor analysis of a poem.”2

This emphasis on the role of the interpreter, though, fades as the authors of “Stochastic Parrots” argue for more carefully constructed datasets to limit biases and other negative outcomes from generative AI. Here researchers have a clear role: when building language models, they are made responsible for the biases that might appear in the output. The existing datasets, though, are cast as a problem with no source; their flaws are not treated as intentional. This may be a rhetorical choice meant to reach the people already working on these technologies, but it also undercuts the account of intention the authors present elsewhere in the article. If we encountered a parrot squawking offensive language, we might admit it does not know what it is saying, but that would not mean it does not represent bias. And, more importantly, we would not let the parrot’s owners off the hook so easily. The bird had to have heard the language somewhere; someone gave it an offensive dataset.

By emphasizing that these generative AIs lack intelligence and that they are incapable of meaning or thought, discussions about generative AI frequently reify its artificiality, that is, the part of the term that distances the actions of the computer from the actions of the engineers who built the system. That these machines do not think seems true enough. That the systems cannot mean what they say, equally so. But that only makes the need to understand their speech and language more urgent, because their language is not accidental even if it is randomly generated.

The language of generative AI is haphazard only on the surface. The specific word choices are impossible to predict, but the machine-learning algorithms still determine the outcome. Those algorithms exist because, like any other computer program, someone built them. Someone selected the texts to scrape to train the AI, someone looked at the output and said, this is or is not comprehensible text, and, perhaps most importantly, someone added parameters when they noticed undesirable outputs. Graphics Processing Units have done a significant amount of the computation here, but only to expedite work set in motion by people who desired it done. The people who do this work and the investors who fund it, just like the datasets they feed into the machines, are not perfect or perfectly representative. Even if there were a generative AI firm with representative demographic diversity across a number of different categories, the firm would still be made up of engineers, information scientists, venture capitalists, design thinkers, and so on.

Each of these types of people carries with it a view of what language should be, a set of expectations about how one should speak and, because language and thought are intertwined, a set of expectations about the kinds of ideas one might think. The linguist James Paul Gee called these expectations Discourses. Discourses, always with a capital D, are the means by which we show we belong in the spaces we occupy. For humans, Discourses can encompass more than just the language we use, but in the realm of text-based generative AI we can focus on the way they manifest in language. Gee argues that Discourses are inescapable, ideological, and difficult to critique from within. The Discourses you are a part of determine the way you will speak and carry yourself in a particular situation. Corporate jargon might serve as a useful example. It is widely despised outside the office, and yet the words and phrases many have claimed to hate for decades are inescapable. You might find yourself circling back to an action item with your boss later today. The language persists—even if we personally detest it—because objecting to it risks revealing oneself as an outsider (“You seem to have lost focus on the business-critical objectives here”). More insidiously, the Discourse also conditions the way we can respond to it. When an employee is asked to distill their individual experience into key learnings for future action, the language has snuffed out both beauty and any opportunity for critique. Individual experience is made instrumental to the larger ideology of the Discourse, in this case profit.

There is no single Discourse that generative AI firms operate within, but regardless of the exact number or types of Discourses, the language used in those spaces is subject to the rules of Discourse. That means that in those spaces, Discourse shapes what the computer scientists, engineers, and investors working on generative AI count as comprehensible language. Because they operate within that Discourse and use it to build their own identities, senses of self, and livelihoods, critique is in short supply. Even among the groups critiquing it, like the authors of “Stochastic Parrots,” we can see a reluctance to abandon certain ideas about artificial intelligence or its future prospects.

It is little wonder, then, that many of these people have become convinced that a particular generative AI chatbot has gained or is about to gain sentience, or that many outsiders find these claims absurd on their face. Because the chatbot generates text that mimics the way computer scientists, engineers, and other AI developers often speak to each other, it passes their ad hoc Turing tests. The system seems to care about the things this group of people cares about because it talks about them in the ways that they do. Its messages come across a screen just like the messages from the people they love and care about in the world. These interactions are how we would recognize a kindred spirit in our everyday life or the right candidate for a job opening. The right kind of language is how we declare who does or does not belong here. But in the case of AI, that effect is a reflection of the conditioning from the people who trained the machine, telling it when it got something wrong or right and praising or scolding it accordingly.

Companies have brought in, and will continue to bring in, outsiders to help them revise their large language models. The new perspectives will help them see beyond their own biased language practices. That process will mean generative AI unsettles or offends users less often. AI will one day be able to generate the kinds of texts we think we want. Generative AI will expand to other languages. Eventually, these systems will be able to generate prose we will interpret as a decent—if not good—analysis of fiction or poetry. But even if we can “perfect” generative AI, the problems with its integration into our lives will remain. Not because generative AI will infringe on our copyrights, aid cheating on our assignments, and eliminate jobs—though it will—but because it will interfere with and limit our ability to think outside of it.

As these tools become more integrated into our routines, in both the hardware and software we use to create and communicate, their effects on the way we use language will grow and, counterintuitively, become less visible. Autocomplete features will become even more intrusive than they already are. Suggestions for the next word, sentence, or paragraph will pop up more frequently, demanding to be read and interrupting our thoughts before we have a chance to finish thinking them. These AI tools will do our “reading” for us too, helping make quick work of that “Things to Read” folder so many of us researchers have, highlighting and summarizing what they deem the main ideas of texts that others, probably also with the help of generative AI, have written. As we continue to use these tools, we will anticipate their preferences, habituating ourselves to the rhythm of our preferred word processor’s generative AI as it cowrites with us. These interventions, however, can only ever be with words or phrases that are statistically likely, or, to put it in Orwell’s terms, already familiar to us. The suggestions these tools offer, while random, can only ever make our prose look more like what already exists, bending our ideas along with the prose that carries them.

I worry about this with experienced writers as much as I do with developing ones, but the effect is much clearer among writers integrating the tools into their writing practices from very early on. In my experience, developing writers need less, not more, of what this technology has to offer. These writers are adept at mimicking the language of their teachers and bosses, but not at expressing their ideas in words they have not heard before. These tools offer exactly the wrong kind of intervention, a supercharged version of thesaurusitis. These technologies excel at interrupting the exact moment when a writer needs to struggle to find the word or phrase that will wrestle their ideas out of their head and into the world. Worse, the generative AI, in the guise of a helpful editor, can trample over the productively uncertain places in a draft with a confident-sounding, familiar phrase that seems like what the writer was grasping for all along. When writers ask themselves what they really mean by a word or phrase, generative AI editors can be there to give five, ten, or a hundred potential options. Never mind whether any of those options actually reflects what they wanted to say. The options will be full of familiarity and confidence. They will sound good, and they will promise to be theirs. But familiar phrases and images produce familiar ideas and thoughts. And, then, the effects become causes.

The marketing around these technologies will continue to insist that they are miraculous little machines, helpful tools that allow us to accomplish more of what we want to do. Flawed-but-perfectible calculators for language. As they work their way into more of our speech, we will struggle to say the things we want to say. Then, we will struggle to think of things we want to say. And, then, we will not really be sure who is doing the saying. As artificial intelligence becomes a marketing buzzword and a surefire way to attract investment, we will continue to see the term thrown around. But the term, like lots of abstractions, means different things to different people, and it is more aspiration than reality. Computers have not become intelligent, and their capabilities are not separate from the humans who made them. If we integrate these tools too fully into our writing, we will, to paraphrase Orwell, have gone some distance toward turning ourselves into machines.

1 George Orwell, “Politics and the English Language.” https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/politics-and-the-english-language/

2 Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” https://dl.acm.org/doi/10.1145/3442188.3445922