Kinect-ing Together Writing and Gesture Through NUI Technologies

Kevin Brock, University of South Carolina
David M. Rieder, North Carolina State University

Introduction

This article is meant to accomplish two interrelated goals: 1) introduce and explain the technical and writerly implications of emBody(dekaaz) {, a digital humanities writing project co-developed by Kevin Brock and David M. Rieder for the sixty-third annual Conference on College Composition and Communication; 2) identify some of the ways in which the use of Microsoft’s Kinect sensor can be leveraged by digital writers and writing theorists as a platform for new forms of writing with natural user interface (NUI) technologies. As we explain in the course of this article, NUI technologies, of which the Kinect is a powerful and popular example, challenge us to fundamentally rethink some of our core assumptions about writing and reading. Working from our experiences developing emBody(dekaaz) {, we focus on the following four reconsiderations. First, writing is an executable script or score. Its end goal is not representational. It is not meant to communicate a stable, static meaning to a reader. Rather, like John Cage’s 4'33", which Liz Kotz argues was the inspiration for a generation of postwar event scores associated with Conceptualism, as well as the kind of late-sixties happenings about which Geoffrey Sirc has written, writing is meant to make something happen. Of course, the script or score can be read, as scholars of Critical Code Studies (CCS) have demonstrated, but the ultimate purpose of executable writing is generative.

Figure 1. A page from John Cage’s 4'33", courtesy of the New York Public Library’s John Cage Unbound living archive.

Second, and related to the first point, reading has been transformed into an epideictic experience. The letters and words on a page or screen are recognizable and readable, but this readability is less important than the experience of encountering them. Put differently, the letters and words displayed on the wall of the convention center during CCCC 2012 are meant to engage participants in the moment, in a real-time happening of sorts, during which, if we're lucky, we might find ourselves, as Gilles Deleuze once described it, in the middle of Spinoza.[1] The reason this discovery happens, and the reason it can happen, is that the letters and words have broken free of the traditional print-based alphabetic mandate to which we've grown all too accustomed. As Richard Lanham has described the transformation of text in digital media, the alphabet can now think. It is no longer merely a dumb terminal for speech. (In fact, it never was, but emerging technologies enable us to recognize and emphasize this fact in new and hypermediate ways.)

The third rethink that we experienced developing emBody(dekaaz) { is related to the surface or substrate of writing (as well as of reading). As a point of comparison, the surface of print-based alphabetic writing is conventionally a neutral backdrop. It is a flat, forgettable substrate. As readers, we are not supposed to notice it; we are supposed to look past and through it and perceive only the content transmitted through its form. Working with the Microsoft Kinect, we found the surface of writing transformed dramatically in two ways. First, it is no longer a single planar (i.e., flat, two-dimensional) surface, and neither is it static or atemporal. In the Kinect’s depth data, the writing space is revealed as a dynamic, three-dimensional, volumetric zone that travels through time. From a writer’s point of view, the space in front of the Kinect is the basis for a rheology of Heraclitean flow and change.[2] Moreover, the conventional notion of depth in front of the Kinect is transformed into an abstract conceptualization of space. As Manuel De Landa might argue, we are not working with three dimensions of metric depth; rather, we are working with a manifold comprising three or more degrees of freedom, plus the additional degrees derived from time. Second, depending on how a digital writer scripts the four-dimensional zone, the reader is an active participant, a co-author of the unfolding, emerging multimodal and transmodal text. Sarah J. Arroyo would call the space that we have described a choric structure of possibilities. As participants experience the feedback of their movements in real time, they are implicated in a writing surface that extends beyond the wall on which the output of the project was displayed. They are in the middle of the writing.

The fourth and final point at which we arrived is that writing was and will eventually (again) be more than an alphabetic technology for the transmission of spoken language. Extending linguist Roy Harris’ reminder that the origins of writing were not alphabetic, we dare say that the future(s) of writing do not have to, and in fact will not, be exclusively alphabetic, either. A broader definition of writing is necessary in order to recognize how and why projects like ours are writing; relatedly, a broader definition would empower digital writers to reconcile their technological interests with their disciplinary training. At CCCC, reactions to our project were positive, but most participants valued it as art, i.e., as something other than (and thus not really) writing. The project celebrated writing, but its varied components (the software programming, the Kinect sensor, the four-dimensional space described above, and the cables) are not conventional writing technologies per se. Considering how committed our field is to alphabetic writing, we understand this kind of reaction. In fact, we celebrate it for the commitments that it implies to the power and privilege of literacy in a nation where far too many children and adults are struggling to learn how to read and write. However, the alphabet is one technology among numerous other technologies; so, we’d like to be able to call ourselves writers, and our activities writing, even when not focused solely or even primarily on alphabetic text.

Drawing Letters

Our project helped us look afresh at the alphabet, at one of the core technologies of what is conventionally called writing. Inspired by Harris’ critiques of alphabetism, our project enabled us to change the way we think about the alphabet in two ways. First, we stopped to notice the letters of the alphabet are a series of line drawings, and that the origins of the term writing are more broadly related to line-making practices. Before modern historians reduced the broader history of writing to an evolutionary set of discoveries leading toward the alphabet, writing was a term that included drawing, etching, tracing, scratching and scraping. Harris explains that, in Ancient Greek, the verb γράφειν originally meant in Homer engrave, scratch, scrape (X). In Ancient Egyptian, a single word (sesh, transliterated as sS) denoted both writing and drawing, and Harris states that this broader, inclusive definition of writing survived in English into the sixteenth century. Harris concludes that

The late restriction of such words to designate alphabetic writing hardly warrants the narrow perspective adopted by those historians of the subject who take for granted that graphic signs count as writing only when used for purposes which alphabetic writing was later to fulfill. (29-30)

What we find especially interesting about Harris’ point is how it can be turned toward the future in which we found ourselves while working with the Kinect. For us, writing must be more broadly defined in order for our field to expand with the technological futures in which we find ourselves. If we focus exclusively on alphabetic writing, we will find ourselves increasingly boxed out of future forms of communication that could be valued as writerly.

The second way in which our assumptions about the alphabet were changed relates to representation. Conventionally, writing is described as a representational technology. Each letter represents a distinct sound, and a sequence of letters represents the same sequence of sounds that, together, comprises a word. The problem with this description is that it falls apart under scrutiny: writing is not always or only representation. The simple truth is that speech is not a sequential set of distinct sounds; this is an assumption born of the influence of literacy. Linguist Michael Studdert-Kennedy states this truth plainly when he writes, “We do not normally speak phoneme by phoneme, syllable by syllable, or even word by word [... but rather] any isolable articulatory or acoustic segment arises as a vector of forces from more than one linguistic segment, while any particular linguistic segment distributes its forces over several articulatory and acoustic segments” (68).

If speech is not a sequence of discrete and distinct atomistic sounds, then what, precisely, does writing represent? Harris concludes that writing is like a score that captures enough of the structure of the acoustical flow of speech to be useful, but capturing the general structure of speech is not the same thing as representing speech.

These two points help us pry apart the naturalized association of speech and writing in order to recognize that what we're really dealing with is an analog and continuous flow of speech and a particular practice of making lines (i.e., writing). Over time, the two converge in a form with which we're all too familiar (i.e., the alphabet). When we step back from this particular form of writing, expanding our understanding of what it can be in digital media, we realize that writing is the use of a technology of lines (circuits, cables, etc.) to capture some aspect of the analog flow of life, which is what we did with the Kinect. In the case of the Kinect, writing is a way of inscribing into the four-dimensional flow of space-time a novel, participatory choragraphy.

With the above points in mind, what follows from this introduction is a series of topics in which we expand on various aspects of each point at which we arrived as part of our work on this project.

The Nature of Writing

While it does not always get defined as such, writing has long been recognized as an activity and not simply the graphical representation of spoken language; this recognition has been addressed extensively by both artists and scholars (from compositionists to sociologists to archaeologists). Such a quality is significant in that the term activity suggests a meaningful exertion that is both physical and mental. La Monte Young, in his conceptual art piece Composition 1960 #10 to Bob Morris, suggests the deceptively difficult nature of writing—that is, inscribing meaning into and through action—in its single-sentence edict: Draw a straight line and follow it (114). One who attempts to follow this instruction finds rather quickly that the act is, and means, much more than it initially might appear to.

The activity of writing, of marking shapes to signify something else, as a way to construct meaning, dates back to at least 30,000 B.C., with simple shapes that nonetheless have complex and significant social and cultural values for author and reader alike (Leroi-Gourhan 191). Kenneth Burke has defined humanity by its ability to act symbolically, that is, in a way charged with meaning: for Burke, humanity (man) is “the symbol-using (symbol-making, symbol-misusing) animal” (16). In the first chapter of Of Grammatology, Jacques Derrida points to physical activity as natural writing, as an extension of the divine: “the good and natural [writing] is the divine inscription in the heart and the soul; the perverse and artful is technique, exiled in the exteriority of the body” (17). This notion of the self as a being written through the signification it creates, and through the signification imparted to it by other forces, has become a crucial component of writing studies. In essence, we exist only through the inscriptive acts (i.e., acts of writing) in which we engage or are engaged.

Figure 2. A screenshot of one author’s portrait converted into ASCII text art—in essence, an act of being-written. A similar image exists in plain text format, as does a version in HTML. All portraits made with HasciiCam.
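The conversion pictured in Figure 2 can be illustrated with a short sketch. It is not HasciiCam's actual method, which this article does not describe; it simply assumes a grayscale image supplied as rows of brightness values from 0 to 255, and the character ramp, function name, and sample values are hypothetical choices made for illustration.

```python
# Characters ordered from darkest (dense ink) to lightest (empty space).
# This ramp is an illustrative assumption, not HasciiCam's actual palette.
RAMP = "@#*+-. "


def to_ascii(gray_rows):
    """Map each 0-255 brightness value to one character from RAMP."""
    lines = []
    for row in gray_rows:
        chars = []
        for value in row:
            # Scale 0..255 onto an index 0..len(RAMP)-1.
            index = value * (len(RAMP) - 1) // 255
            chars.append(RAMP[index])
        lines.append("".join(chars))
    return "\n".join(lines)


# A tiny hypothetical "image": dark on the left, bright on the right,
# with the last row reversed.
sample = [
    [0, 85, 170, 255],
    [0, 85, 170, 255],
    [255, 170, 85, 0],
]
print(to_ascii(sample))
```

Even at this scale, the body in front of the camera is rewritten as alphabetic marks whose job is to be looked at rather than read through.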

However, despite an increasingly broadened understanding of meaning-making in general and of writing in particular, there nonetheless have been critics who have lamented the passing or potential passing of writing as an important means of communication—the lo-fi lament that video killed the radio star, so to speak. For example, Vilém Flusser has argued that writing is a physical gesture, a meaningful action that makes its mark upon the world:

To write is to in-scribe, to penetrate a surface, and a written text is an inscription, although as a matter of fact it is in the majority of cases an onscription [... I]t is a penetrating gesture which informs a surface. (Gesture 1)

Despite Flusser’s observation of the inscriptive and penetrative act of writing, he argues against the potential of writing in digital or electronic environments. He suggests (and later recants somewhat, in an afterword to the second edition of Does Writing Have a Future?) that the act of writing has been given over to—i.e., replaced by—numerical (digital) computation, an activity he views as inherently distinct from writing and its physical enactment (Does Writing 163-4). Flusser’s complaint stems from a complicated definition of writing, upon which he relies even as he recognizes and considers the wide variety of ways in which individuals inscribe meaning onto and into the world without requiring handwritten or typed letters formed into strings of words and sentences (Does Writing 3). For Flusser and similar-minded writing scholars, writing requires one of these historical holdovers: either (1) strings of alphabetic characters in linear sequences, usually recorded in ink on paper and meant to signify phonetic sounds or (2) physical scratching, engraving, stamping, digging-out, or otherwise creating meaning into or onto some surface.

This is not to suggest that all scholars of writing share Flusser’s position. In fact, the field of rhetoric and composition has been pushing against the boundaries of the definition of composing as writing (meaning linear strings of alphabetic characters combined together) for decades, with the number of scholars interested in this exploration continuing to grow. Gunther Kress and other members of the New London Group have argued for design and layout as forms of producing meaning as well as of arranging meaning delivered through other modes of communication (Kress 132; New London Group 73-75). Richard Lanham draws attention to the visually meaningful qualities of a text as more and other than a vehicle for transmitting phonetic language to a reader, stressing the need to recognize a bi-stable oscillation between looking at and through a given text:

Look THROUGH a text and you are in the familiar world of the Newtonian interlude, where facts were facts, the world was really out there, folks had central selves, and the best writing style dropped from the writer as simply and directly as a stone falls to the ground, precisely as Thoreau counseled. Look AT a text, however, and we have deconstructed the Newtonian world into Pirandello’s and yearn to act naturally. (Lanham 5)

It should come as no surprise that Lanham describes looking at a text as a way to act naturally, given the similar outlooks on writing- (and reading-) as-acting provided above from Burke and Derrida alike. The philosophical shift that occurs when focusing not on the tradition of literate writing but instead on the potential for meaningful invention and engagement with a kinect text (a text with symbolic energy and agency) is a significant movement away from conventional notions of what writing is (can be) or is not (cannot be).

Composition scholars have also extended the definition of writing to include technological interfaces and environments, considering these as forms as well as means of writing/acting. Kathleen Blake Yancey has argued for a reconsideration of composition from a sole focus on print-based writing to a hybrid of print and digitally oriented forms of creative expression and argument (Made 307). Cynthia and Richard Selfe have pointed to the affordances and constraints of software interfaces as agents for political effect that can, and do, exert influence on their users, most often to support forms of social and cultural oppression over particular populations (484). This call for recognition of interfaces’ capability for meaningful impact has been explored further by critics like Teena Carnegie, who has considered the interface as a kind of rhetorical exordium meant to prepare an audience for an even more complex and dynamic argument (the use of a given software-hardware assemblage). Similarly, David M. Rieder suggests that the critical stance needed for addressing interfacial influence may itself be changing, due in part to the emergence of natural user interfaces (or NUIs) that make use of an individual’s physical body as a central component for computational activity.

It is this concern that fuels our current argument: how can natural user interfaces help us understand and explore the potential for writing with as well as through the meaningful, inscriptive activities of the physical body? How can virtual spaces serve to draw attention to the significance(s) of those activities? In this article, we examine NUIs as a means of writing, with a focus on the Microsoft Kinect, in order to explore some possibilities for writing in hybrid digital-physical spaces with the body. We tested some of this potential by constructing a software program that used the Kinect, along with a number of active participants, in order to compose new texts via the digital recording and interpreting of dynamic physical movement.

Rise of the Natural User Interface

In late 2010, Microsoft released a new user input device, which it called the Kinect. The Kinect differs from earlier input devices in that it requires no physical contact with a user. Instead, users’ bodies themselves operate as controllers through several cameras in the Kinect and software designed to recognize particular types of motion- and depth-based input (see Figure 3), along with voice commands received via a built-in microphone. Microsoft initially developed the Kinect to serve as a gaming interface for its Xbox 360 video game console: an interface that would allow individuals to play games without relying on physical devices (usually assemblages of joysticks and buttons) that would normally constrain interaction with the system.

A diagram demonstrating how the Kinect's cameras capture visual data, as color values and as depth (distance) from the appropriate sensor.

Figure 3. The Microsoft Kinect’s cameras gathering and processing depth and color data, courtesy of socialphy.com.

The Kinect builds upon efforts to develop an accessible and easily usable NUI, a form of human-computer interface that breaks from conventional paradigms of user activity (e.g., mouse and keyboard or video game console controller). Competitors’ NUIs have been more restrained in execution than the Kinect: the Nintendo Wii Remote appears to be a cross between a television remote and a traditional Nintendo controller, while the Sony PlayStation Move controller looks like nothing so much as a large plastic lollipop. As a result of the controllers’ designs, gameplay with these devices is very much constrained to specific and limited movements centered on the positioning and movement of the controllers themselves. In contrast, the Kinect’s cameras allow it to construct (through the aforementioned motion- and depth-based data gathered from scanning a given space) a rough shape and calculated skeleton of a user, which it then reads as the user moves in order to interpret the user’s actions accurately according to the system’s recognized gestures and physical behaviors.
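A toy sketch can suggest how a rough shape might be pulled out of depth data. It is a deliberate simplification, not the Kinect's actual skeletal tracking, which is far more sophisticated; it assumes a depth frame given as rows of distances in millimeters, and the function names and the near/far thresholds are hypothetical.

```python
def segment_user(depth_mm, near=500, far=2000):
    """Return a mask marking pixels whose depth lies within [near, far] mm.

    The thresholds are illustrative assumptions about where a user stands.
    """
    return [[near <= d <= far for d in row] for row in depth_mm]


def centroid(mask):
    """Average (row, column) of all marked pixels, or None if none are marked."""
    points = [
        (r, c)
        for r, row in enumerate(mask)
        for c, marked in enumerate(row)
        if marked
    ]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


# Hypothetical 4x4 depth frame: a back wall at 4 m and a "body" at 1.5 m.
frame = [
    [4000, 4000, 4000, 4000],
    [4000, 1500, 1500, 4000],
    [4000, 1500, 1500, 4000],
    [4000, 4000, 4000, 4000],
]
print(centroid(segment_user(frame)))
```

Thresholding by distance isolates a blob in front of the background, and the blob's centroid gives a single point that can be followed from frame to frame.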

However, perhaps the most exciting and promising aspect of the Kinect is that, because it was developed by Microsoft and connects to the Xbox 360 via a USB connection, it is also able to connect to personal computers without requiring the Xbox as an intermediary device. In fact, Microsoft released to the public a software development kit, or SDK, for Kinect-related development on its Windows platform, in the hopes of attracting software developers to construct programs specifically for the Kinect device. In addition, various groups of hackers have released other toolkits and software libraries that extend the Kinect’s use to Mac OS X and Linux environments without requiring the official Microsoft SDK. This means that the Kinect has effectively become a hacked (and thus a hackable) tool whose potential for, and variety of, use has expanded significantly from its initially anticipated function(s).

Because of the possibilities opened up by hacking the device—meaning the potential made virtual from individuals’ ability to experiment with the Kinect in ways unanticipated or undesired by Microsoft—we can begin approaching a consideration of how physical activity recorded and computed by the Kinect could offer us new, or at least rediscovered, ways to think about and engage in writing with our bodies. This is not to suggest that the Kinect or similar technologies are required in order to explore the potential of meaningful gesture in a physical or virtual space; however, its affordances and constraints provide scholars of composition and rhetoric with a new lens through which to reflect upon how and why we communicate as we do with, and through, gestural activity.

Gesture as Meaningful Communication

Three gestures identified by Bulwer in his Chirologia. From top to bottom: silentium postulo--a demand for silence--a hand raised with palm outward; invito--an invitation--a hand with palm turned toward the speaker; and dimitto--a dismissal--a hand, palm outward, arm stretched away from the speaker.

Figure 4. Examples of gesture from Bulwer’s Chirologia. Top: Gestus XVI, Silentium postulo (I demand silence); middle: Gestus XXI, Invito (I invite); bottom: Gestus XXII, Dimitto (I dismiss).

Gesture and movement have served as powerful forms of communicating meaning for as long as humans have attempted to share information with one another, and rhetorical scholars have explored that idea since antiquity, considering gesture a critical component of the canon of delivery. In the seventeenth century, John Bulwer took up both chirologia (what Bulwer calls the natural language of the hand) and chironomia (for Bulwer, the art of manual rhetoric) in a pair of treatises on those subjects. As Bulwer explains near the beginning of his Chirologia, the hand (and with it, the arm),

in a full, and majestic way of expression, presents the signifying faculties of the soul, and the inward discourse of reason; and as another tongue, which we may justly call the spokesman of the body, it speaks for all the members thereof, denoting their suffrages, and including their votes. So that whatsoever thought can be delivered, or made significantly manifest by the united motions and connotative endeavors of all the other members, the same may be as evidently exhibited by sole devoyre and discoursing gestures of the hand. (15-16)

For Bulwer, the gestures of the hand (and, by association, the human body as a whole) are not simply stylistic ornamentations upon verbal speech; instead, they impart significant meaning through their own activities. For an audience familiar with Christian customs, a pair of hands pressed together at the palms, fingers pointed up, suggests a prayer-like supplication regardless of whether any such prayer is spoken aloud. Bulwer provides several meanings that could be intended by such a gesture: Thus we acknowledge our offenses, ask mercy, beg relief, pay our vows, imprecate, complain, submit, invoke, and are suppliant (23). How an individual gestures before and after such an act provides context and depth that a descriptive statement could not fully complete or replace.

Figure 5. A woman performs a series of hand-based gestures recognizable by the Kinect, including (starting at the top left) push to select an object on the screen, grab to manipulate or navigate objects on the screen, and wave to identify oneself—or one’s hand—to the Kinect as a valid user/entity (courtesy of Microsoft Gallery). Compare these with the gestures depicted in Bulwer’s texts, such as those shown in Figure 4.

Just as it serves as a powerful means of communication with regard to spoken and written (textual) arguments, so too does the hand play a central role in the gestural use of the Microsoft Kinect. While a human skeleton can be interpreted from a combination of color and depth data (calculated by the Kinect as a blob-like entity whose shape is roughly that of a bipedal human), it is the hand that communicates directly with the Kinect sensor. For example, when upraised, the hand can be recognized by the Kinect as a kind of cursor, and a hand’s movement can be tracked as a semi-distinct object capable of providing a specific set of input gestures to the device. In Figure 5, a woman is seen not just holding her hand in a meaningful static position but moving that hand in order to communicate meaningfully with the Kinect over a period of time.
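The idea of the hand as a cursor can be sketched in a few lines. This is a minimal illustration, not the Kinect's own mapping: it assumes a tracked hand position reported in meters relative to the center of the sensor's view (x to the right, y upward), and the function name, screen size, and reach value are hypothetical.

```python
def hand_to_cursor(hand_x, hand_y, screen_w=1920, screen_h=1080, reach=0.5):
    """Map a hand position in meters to pixel coordinates on a screen.

    `reach` is an assumed half-width (in meters) of the zone in which the
    hand moves; positions outside that zone are clamped to the screen edge.
    """
    # Normalize from [-reach, +reach] to [0, 1], clamping at both ends.
    nx = min(max((hand_x + reach) / (2 * reach), 0.0), 1.0)
    # Screen y grows downward, so a raised hand maps toward the top.
    ny = min(max((reach - hand_y) / (2 * reach), 0.0), 1.0)
    return (int(nx * (screen_w - 1)), int(ny * (screen_h - 1)))


print(hand_to_cursor(0.0, 0.0))    # hand at the center of the zone
print(hand_to_cursor(-1.0, 1.0))   # far up and to the left: clamped
print(hand_to_cursor(1.0, -1.0))   # far down and to the right: clamped
```

The clamping step matters for the argument here: once the hand leaves the zone the sensor attends to, its movement stops registering as input.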

Bulwer’s descriptions of gestures similar to those made by the woman in Figure 5, in relation to classical oration and conventional use when accompanying spoken language, frame the individual’s ability to write to and with the Kinect in an intriguing light. For example, the push gesture meant to indicate the presence of a valid user has a categorical ancestor in Bulwer’s Chirologia as a demand [for] silence (45). This silencing gesture would traditionally serve a rhetor in quieting an audience so that he or she may speak more easily; for the Kinect, this gesture, when interpreted as a kind of silencing, might indicate that the act of establishing oneself as valid in turn requires that the Kinect metaphorically silence itself and await commands of input from a human being. Similarly, the gestures for navigation, a waving of one’s hand to the left or to the right, are described by Bulwer as movements of invitation and dismissal (51-52). Such a description is apt (and reflects also the act of flipping through pages in a book), as the Kinect user calls up and moves through sections of content, inviting new information to the screen and dismissing previously viewed content.
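A navigational wave to the left or right can be approximated by watching how far the hand travels horizontally over a short run of frames. The sketch below assumes a list of hand x positions in meters, one per frame; the function name and the travel threshold are hypothetical, and real gesture recognizers also weigh speed, timing, and vertical drift.

```python
def detect_swipe(xs, min_travel=0.3):
    """Classify a series of hand x positions (meters) as a swipe.

    Returns "right", "left", or None when the net horizontal travel is
    smaller than `min_travel` (an assumed threshold).
    """
    if len(xs) < 2:
        return None
    travel = xs[-1] - xs[0]
    if travel >= min_travel:
        return "right"
    if travel <= -min_travel:
        return "left"
    return None


print(detect_swipe([0.0, 0.1, 0.25, 0.4]))   # net travel of 0.4 m rightward
print(detect_swipe([0.2, 0.0, -0.2]))        # net travel of 0.4 m leftward
print(detect_swipe([0.0, 0.05, 0.1]))        # too small to count
```

Which direction counts as inviting and which as dismissing is left to the program's author; the sketch only reports the direction of travel.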

This recognition of the hand’s primacy is significant for an understanding of both the natural user interface and writing: if we write through the inscription of meaning into or onto the world, then our hands offer us the most familiar and immediate way to do so, since we use those hands as tools to build, make, scratch into, and otherwise compose things (ideas, objects, experiences). In the case of the Kinect, this capability of the hand becomes the criterion by which a particular object’s power is evaluated: if there is no hand to be recognized (so that it may write, through its gestures, the appropriate meaningful signs and actions), there must be no rhetor/user attempting to communicate with the NUI.

emBody(dekaaz) {

Gestural Composition

The conversation surrounding movement and gesture has been largely absent from composition scholarship, likely due to the disciplinary focus on looking through (to use Lanham’s phrase) the production of a given text rather than at the practices of and leading toward that ultimate production. That is, composition as a field has long been more interested in the textual arguments produced through writing than in the physical act of writing itself, although in recent years this has changed due, in part, to a growing interest in multimodality and in electronic forms of writing. We do not suggest that examining composed texts and their processes abstractly is less important than looking at the physical activities of those processes; instead, we emphasize the influence of various inscriptive actions upon the generation and communication of a text. As John Trimbur and Karen Press note, conventionally even the page on which text is written is generally taken for granted as [...] a semiotic zone where writers and readers exchange meanings and identities without the ostensibly pre-literate support of pictorial, verbal, and gestural cues, on their own, as it were, relying solely on alphabetic inscription (94). Sean D. Williams has described, with some humor, this reluctance to question writing practices that are accepted as a given or a norm, saying that composition scholars cling to the idea of writing about [a variety of] representation systems in verbal text because that’s what we do in composition (23-24). Despite the sarcasm, Williams’ main point stands: writing linearly in text is an extremely limited way of exploring subjects that do not involve linear text, and composition’s tradition of adhering to this limitation does not justify its continuation.

That said, several critics have examined the act of writing as a cultural or technological, as well as a productive, mechanism for communication. For example, Kathleen Yancey has examined the technological nature of her parents’ and grandparents’ handwriting educations and their various effects: her grandmother’s writing was calligraphic, while her left-handed father’s handwriting was nearly illegible, and her son wrote exclusively in print rather than cursive (Handwriting 83). As Christina Haas adroitly observes, writing practices, technologies, and their material qualities are often overlooked but become profoundly obvious when changes are introduced, such as when writers move from the heft of the manuscript and the feel of a new Blackfeet pencil, to the bright, wired-up, whirring box and clicking keyboard on the desk (24). As the computer keyboard and mouse have become more commonplace, however, it is the natural user interface and the haptic touchscreen that have more recently prompted recognition of technological materiality, moving the writer away from the desktop metaphor of the GUI and toward the gestural act of writing into and onto a space.

While the textural qualities of physical handwriting may not have been a major focal point for critical inquiry in the field, composition scholars have paid close attention to the physical and analog components (activities as well as interfaces) of writing in and for digital spaces with electronic technologies. Jay David Bolter provides an early awareness of the movement involved in electronic writing and reading in his book Writing Space: The continuous flow of words and pages in the [print] book is supplanted in electronic space by abrupt changes of direction and tempo, as the user interacts with a web page or other interface (12). More recently, Jody Shipka has argued for a need to consider the networked and technological processes involved in creating electronic texts:

when one examines e-mails, online transcripts, screen captures of a Website, or even when one views a video online, it becomes easy to overlook the various resources and complex cycles of activity informing the production, distribution, exchange, consumption, and valuation of that focal text or collection of texts [...] Tracing the processes by which texts are produced, circulated, received, responded to, used, misused, and transformed, we are able to examine the complex interplay of the digital and analog, of the human and nonhuman, and of technologies, both new and not so new. (29-30)

Shipka’s emphasis on the ecological nature of writing, i.e. as complex cycles of activity, is key: by recognizing that writing is movement (and not just one motion-act but a series of networked, often cyclical, movements by multiple agents), we can understand writing as temporal and dynamic not just in terms of the compositional (inventional) process but in the communicative (delivery) process as well. This idea of writing as active moment runs counter to the historical quality of writing as static and permanent, and Walter Ong provides a description of this perspective, arguing that written text and literacy profoundly and irreversibly alter orality and how individuals think about language: Though words are grounded in oral speech, writing tyrannically locks them into a visual field forever where it becomes immutable and non-discursive (12). While Ong is primarily interested in discussing how writing is a kind of technology, he nonetheless continues the Platonic argument against writing (specifically, that writing lacks the back-and-forth quality of oral discourse). It is this line of thought that maintains a connection between gesture and speech rather than between gesture and writing.

Lines and circular and triangular swirls of light are captured in focus, while a man, whose appearance is out of focus, sits in the background.

Figure 6. Man Ray, Space Writing. 1937. Collection SFMOMA. © Man Ray Trust / Artists Rights Society (ARS), New York / ADAGP, Paris.

However, even though many may be inclined to think of gesture as a means of representing and communicating speech—with the most recognizable examples in the form of the existing sign languages used around the world (which have varied relationships with or connections to the structures and conventions of spoken language)—so too can and should we recognize gesture as a means and form of writing, a way of inscribing meaning into and onto the world in which one moves and acts. The light paintings first explored by Man Ray, Barbara Morgan, and Pablo Picasso in the 1930s and 1940s (and since by many others) demonstrate the temporally significant impact that movement can have on a given space. In an acknowledgment of the inscriptive potential of his movements with the electric light brush used in his pieces, Man Ray even titled his series of light paintings as space writing (of which Figure 6 is one example). While each light painting, like any other movement, is momentary and recorded only through some other medium (e.g., photography), its reflection of the inventional processes involved in gestural actualization is no different from that of the written word as a medium of articulating an argument only possible through the act of exploring how best to write it.[3]

Similarly, writing through gesture or movement can be distinguished from writing through literate or linguistic construction, just as both can be distinguished from writing through combinations of physical action, literary text, and other modes of communication. It is precisely these distinctions that suggest exciting possibilities to be realized through novel forms of compositional experimentation. As Dennis Baron has observed, the evolution of writing as a predominant form of communication has been successful specifically because of its differences from the spoken word:

[W]hile writing cannot replace many speech functions, it allows us to communicate in ways that speech does not. Writing lacks such tonal cues of the human voice as pitch and stress, not to mention the physical cues that accompany face to face communication, but it also permits new ways of bridging time and space. Conversations become letters. Sagas become novels. Customs become legal codes. The written language takes on a life of its own, and it even begins to influence how the spoken language is used. (122)

Baron’s point is that writing is not speech and should not be considered simply a representation thereof but rather it is an entirely separate means of making and communicating meaning to an audience. We propose that gesture (and, with it, physical movement) is yet another distinct, not subordinate, way of creating meaningful communication that, much like writing, functions through inscribing messages into the world (via specific rhetorical situations) through its own embodied activity. Because meaningful gesture is almost always happening, whether consciously intentional or otherwise, its inscriptive nature demonstrates the exploratory possibilities of inventional practice.

One predecessor to emBody(dekaaz) { is Text Rain, a project that fuses linear text, textual motion, and human physical activity into a dynamic installation of poetic composition and interpretation. Conceived by Camille Utterback and Romy Achituv, Text Rain captures the movements of passersby through a digital camera and then transposes a series of falling lines of text whose characters move at slightly different speeds. As characters come into virtual contact with the outline of an audience member’s body, those characters sit in place for a few moments before fading out of sight. According to Francisco J. Ricardo, Text Rain

presents several discursive spaces or moments of being, revealing themselves in gradual fashion. Patterns emerge from these discursive spaces [...] aspects of the work that interact with the user’s evolving response through a dialogic circle between user and work [...] Text Rain is motivated by a reach for connection, not merely by what emanates in the optic flow of its movements, but also by the text it fragmentarily presents. (Ricardo 58)

A photograph of the Text Rain visualization screen, which displays two individuals with arms stretched out to their sides, digitally holding up some characters from the poem while other characters fall through open space. A third individual walks by behind the other two.

Figure 7. Three individuals interacting with Text Rain.

To an extent, Text Rain is a new version of the flat plane of traditional writing, albeit with the addition of a temporal dimension: the software on which it runs looks only for enough change in object color from its default background space in order to imitate the effect of the raining alphabetic characters contacting some physical object (e.g., the audience—see Figure 7). However, it also reflects an experimentation with the choric space discussed by Arroyo, an attempt to expand the possibilities of meaning-making through the construction of potentiality. The unique movements of each individual’s engagement with Text Rain launch new actualizations in and from the chora, far beyond the scope of a traditional static document. As Ricardo notes, the on-screen evaporation of alphabetic characters upon contact with an individual disrupts and deforms conventional reading: the constituents of multiple lines [of text] appear and vanish together, so that at any moment, one’s reading encounters only a selective sample of various lines (64-65). This selection of lines—lines that emerge only through individualized, dynamic activity in the space constructed by the installation, rather than through a two-dimensional linear representation of spoken language—leads to a novel way of understanding what it means to write through physical moving and being.

So what new practices of invention and expression through writing (via alphabet, gesture, spatial inscription) are possible when extending the chora into a four-dimensional volumetric space?

emBody(dekaaz) { at CCCC 2012

A man faces away from the camera, seeing the shape of his body captured by the Kinect and projected in green text on the wall in front of him. A woman sits nearby on a bench next to the Kinect.

Figure 8a. An attendee of CCCC interacts with emBody(dekaaz) {.

The shapes of multiple passersby are displayed through brightly-colored text, comprised of conference attendees' tweets, on a wall.

Figure 8b. Close-up of multi-colored display of tweets as attendees walk by. (Photograph courtesy of Kerri Bright Flinchbaugh.)

At the sixty-third annual Conference on College Composition and Communication in the spring of 2012, a Microsoft Kinect sensor and a projector were placed on opposite sides of a wide, well-trafficked hallway on the second floor of the St. Louis Convention Center.[4] The Kinect was placed near the ground against a long, high wall between two benches (see Figure 8a). A USB cable, secured and hidden by black gaffer's tape, ran along the floor from the Kinect to the opposite side of the hall, disappearing under black fabric that was draped down and hiding the laptop on the middle shelf of a 6' tall platform. The cable was connected to the laptop, which was connected to the projector that stood atop the platform.

A photograph of a figure whose head and torso, along with parts of the room behind it, are dotted by hundreds of small white spots of light.

Figure 9. Audrey Penven, Steen, Made of Dots. © Audrey Penven, 2010. (CC-BY-NC-SA 2.0.) A photograph of the placement of infrared dots from Kinect sensor contacting a figure.

As conference attendees walked by the Kinect, they passed unaware through a 16'-deep, 8'-wide zone of hundreds of thousands of invisible, infrared dots that bespeckled them from head to toe. The dots covered everything, from hair to hands, bags to books, coats, pants, and shoes (see Figure 9). The speed and angle at which those hundreds of thousands of dots bounced back to the Kinect determined the distance of some part of some body or object from the sensor. And the 307,200 numerical values comprising all of the distances or depth values for a 640x480 image frame were sent across the hall along the USB cable thirty times each second (such as the image photographed in Figure 8b). As each second passed, over nine million depth values passed across the hallway to the laptop hidden behind black muslin. As an example of the Kinect’s powers as an inscription machine, during the thirty minutes between sessions in which approximately 8000 prepared words were spoken by three or more presenters, over sixteen billion values streamed under foot and gaffer's tape to the software program processing it all.
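
The arithmetic behind these figures can be worked through directly. The following is a minimal sketch in Python, not part of the project's software; it assumes only the frame size, frame rate, and thirty-minute interval named above, and its variable names are our own.

```python
# Depth-stream arithmetic for a 640 x 480 frame sent thirty times each second.
WIDTH, HEIGHT = 640, 480
FRAMES_PER_SECOND = 30
BREAK_MINUTES = 30  # the interval between sessions named in the text

values_per_frame = WIDTH * HEIGHT
values_per_second = values_per_frame * FRAMES_PER_SECOND
values_per_break = values_per_second * BREAK_MINUTES * 60

print(f"{values_per_frame:,} values per frame")
print(f"{values_per_second:,} values per second")
print(f"{values_per_break:,} values per {BREAK_MINUTES}-minute break")
```

Each frame carries 307,200 values, each second 9,216,000, and a thirty-minute break 16,588,800,000.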

On the 20' high wall behind the Kinect, on which the output from the software program was projected, tweets from #dekaaz were used to generate a real-time, textual embodiment of anything passing through the infrared zone. If an attendee walked past the Kinect, a tweet-based, textual facsimile of his or her profile followed like a shadow along the wall. If the attendee stopped to engage directly with the project, his or her posture and gestures would be depicted textually in real-time (see Figure 8a).

A Multi-Layered, Hypertextual Mash-up

In the videos and images of the output from emBody(dekaaz) {, the projected text is multi-colored. The reason for this relates to the virtual, choragraphical surface of writing described in the introduction. Each color corresponds to one of ten slices of three-dimensional depth from the infrared zone in front of the Kinect, and along each slice, one of the ten most recent tweets from #dekaaz is written.

Each slice is .35 meters (1.1') in thickness. If one or more participants stand across more than one “slice,” the output on the wall will be a multi-colored mash-up of two or more tweets mentioning the hashtag #dekaaz. If a participant runs toward and away from the sensor, he or she will have traveled across all of the slices of depth. The output on the wall will, in turn, display a cascade of color changes and different tweets.

About #dekaaz

#dekaaz was set up by Rachel Bagby to accompany her featured presentation at the conference. Bagby is a singer, writer, and speaker who attended CCCC to offer a workshop on a three-stanza poetic form that she calls dekaaz. In an excerpt from her website, Bagby introduces the form as follows:

Put simply, Dekaaz is a new form of poetic expression – like a modern-day incarnation of the ancient haiku put to work in everyday life.

Each Dekaaz has ten syllables in three lines:
2 syllables in the first line
3 syllables in the second
5 syllables in the third

And your creation of Dekaaz isn’t complete until you speak it out loud. Beyond that, there are many playful arrangements + powerful a-has! It’s like wisdom jazz, for the mind.

At the workshop, Bagby explained that her poetic form is meant to focus the mind and body on the core feelings and meanings in our lives. A few of the tweets with the hashtag #dekaaz composed during the conference include the following:

I see ~ Light in you ~ Even when you don’t
I am ~ in awe of ~ what is possible
circles ~ spiraling ~ heaven dances earth
seagulls ~ red sky high ~ waves pushing higher

In addition to the above-cited introduction to the form, Bagby also explains that Dekaaz is a powerful tool for synthesizing complex ideas into succinct, crystalline poetry. It’s designed to allow non-poets, non-writers + non-speakers to a-r-t-i-c-u-l-a-t-e the absolute core of their meaning, in the moment. With speed. And with pleasure (Dekaaz). Echoing these points, in a blog post published after attending the workshop at CCCC, Melissa Miles McCarter offers the following description of the form’s value for composition studies:

What I like about dekaaz is its function in invention.  Part of composition studies’ focus is on how we come up with ideas, what in this field is called invention. Writers often refer to this invention process as the Eureka experience. However, as anyone who struggles to come up with an idea knows, this out of nowhere process isn’t very reliable. So, I like methods that allow for coming up with ideas in a more predictable fashion, such as in brainstorming, free-writing, or in starting with a topic.  Dekaaz is another method that provides sparks for new ideas...

In addition to all of these reasons for valuing dekaaz are two more, which dovetail serendipitously with the computational dimensions of our emBody(dekaaz) { project. First, the 2-3-5 syllabic form is part of the Fibonacci sequence; so, a dekaaz’s Eastern and African wisdom is also tied to the Western wisdom of φ (phi) and the golden spiral. In the Fibonacci sequence, each integer is the sum of the previous two; so, 3 is the sum of 2 + 1, and 5 is the sum of 3 + 2. Second, the Fibonacci sequence resembles a well-known constraint known as a boule de neige or snowball, developed by the Oulipo, who were known for their experimental mash-ups of mathematics and language. The traditional Oulipian snowball does not add the last two numbers; rather, it more simply adds one more element to the last. In the following example, each line consists of a word one letter longer than that of the preceding line:

I
am
the
text
which
begins
sparely,
assuming
magnitude
constantly,
perceptibly
proportional,
incorporating
unquestionable
incrementations
(Mathews 226)

Diverging from this baseline, snowballs come in a variety of forms (e.g., increasing or decreasing between lines based on syllable or word count). A snowball based on the Fibonacci sequence would be a novel example, and Bagby’s dekaaz seems to be a form that bridges the two.
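
The two constraints can be stated compactly enough to check by machine. The following Python sketch is our own illustration rather than anything drawn from Bagby or the Oulipo: the function names is_snowball and is_fibonacci_run are hypothetical, and the syllable counts are entered by hand instead of being computed from text.

```python
import string


def is_snowball(lines):
    """True if each line is a single word one letter longer than the line before."""
    lengths = [len(line.strip(string.punctuation + " ")) for line in lines]
    return all(later - earlier == 1 for earlier, later in zip(lengths, lengths[1:]))


def is_fibonacci_run(counts):
    """True if every count from the third onward is the sum of the previous two."""
    return all(c == a + b for a, b, c in zip(counts, counts[1:], counts[2:]))


# The snowball quoted above (Mathews 226), one word per line.
snowball = [
    "I", "am", "the", "text", "which", "begins", "sparely,", "assuming",
    "magnitude", "constantly,", "perceptibly", "proportional,",
    "incorporating", "unquestionable", "incrementations",
]

dekaaz_syllables = [2, 3, 5]  # syllables per line in a dekaaz
haiku_syllables = [5, 7, 5]   # included only for contrast

print(is_snowball(snowball))
print(is_fibonacci_run(dekaaz_syllables))
print(is_fibonacci_run(haiku_syllables))
```

A Fibonacci-based snowball of the kind imagined here would satisfy the second test rather than the first, continuing past 2-3-5 to lines of 8 and 13 units.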

Writing, Gesturing, Connecting, Kinect-ing

As we mentioned in the introduction, one of the challenges for digital writers and writing theorists is to focus on the long-lost surface or substrate of writing. Within the paradigm of print-based, alphabetic writing, the conventional surface or substrate of lettered writing was reduced to a meaningless backdrop, a neutral medium. Unconventionally, the surface has been explored as an important source of invention in the past century. Beginning with Stéphane Mallarmé’s Un Coup de Dés Jamais N'Abolira Le Hasard (a roll of the dice will never abolish chance), and leading up to and beyond the multimodal experiments with which some compositionists have been involved, the surface has been a source of inventional value.

A two-page spread excerpt from Stephane Mallarmé's Un Coup de Dés. Short lines of text are placed randomly across the two pages.

Figure 10. Two consecutive pages from Mallarmé’s Un Coup de Dés.

But so long as it is meant to support readable writing, i.e., literacy, transformations of that surface can only go so far before the text returns to its broader origins, such as drawing and other line-making practices, as does the text in emBody(dekaaz) {.

Scoring the Surface

In order to both explain how we developed the surface of writing for our project and offer one example of the ways in which digital writing will be challenged to step far outside of convention in order to engage with NUI surfaces, what follows is a more technical explanation of how we scored up the three-dimensional surface of the zone in front of the Kinect.

  1. To begin with, thirty times each second, the Kinect sensor sends out 307,200 numerical values that describe the depths at which each of the infrared dots bounced back from the zone in front of the sensor. The 1 x 307,200 values are a flattened 640 x 480-pixel image of the scene. In essence, it is a series of rows, running from one to the next in a long line, whose components have been reconstructed into a two-dimensional image.
  2. Using two software loops, we generated a lower-resolution version of the original image by reading every eighth value across the 640 columns of every eighth row (of 480). Our new depth image was 80 x 60, or 4800 values. We did this for two reasons: 1) the lower-resolution image meant that the software program would not be overburdened by too much processing; 2) each letter took up the equivalent of 8 pixels of horizontal and vertical space.
  3. Each of the depth values assigned to the 80 x 60 image could have a value between .5 meters (1.6') and 4 meters (13.1') from the sensor. In order to create the experience of walking through a series of virtual layers of text, we sliced the 3.5-meter (11.5') continuum of depth into ten .35-meter-thick (1.1') planes.
  4. Each plane was then assigned a unique color and one of the ten most recent tweets using the #dekaaz hashtag.
  5. Since a #dekaaz tweet might be anywhere from 30 or 50 to 140 characters in length, and the surface of one of the planes is 4800 points/characters in size, each tweet was repeated as many times as was needed to cover its entire plane. For example, the above-cited tweet, circles ~ spiraling ~ heaven dances earth, which is 41 characters long (with spaces), would be repeated just over 117 times in order to cover the entire 80 x 60 plane.
  6. When a depth value at one of the 80 x 60 points in our lower-resolution grid changed because someone or something had shifted position, the new depth value would be reconciled with the letter from the tweet assigned to its repeated position in the appropriate data plane.
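
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software that ran at CCCC: it assumes depth values already expressed in meters, it assumes that each tweet is repeated end to end, row by row, across its plane (the steps do not specify the layout), and every function and variable name is our own invention.

```python
FULL_W, FULL_H = 640, 480            # full depth frame, sent flattened (step 1)
STEP = 8                             # keep every eighth value (step 2)
GRID_W, GRID_H = FULL_W // STEP, FULL_H // STEP   # 80 x 60 = 4800 points
NEAR_M, FAR_M = 0.5, 4.0             # depth range in meters (step 3)
PLANES = 10
PLANE_DEPTH_M = (FAR_M - NEAR_M) / PLANES         # 0.35 m per plane


def downsample(flat_depths):
    """Rebuild rows from the flattened frame, keeping every 8th value of every 8th row."""
    return [
        [flat_depths[row * FULL_W + col] for col in range(0, FULL_W, STEP)]
        for row in range(0, FULL_H, STEP)
    ]


def plane_index(depth_m):
    """Return the plane (0 = nearest, 9 = farthest) for a depth, or None if out of range."""
    if not NEAR_M <= depth_m < FAR_M:
        return None
    return min(int((depth_m - NEAR_M) / PLANE_DEPTH_M), PLANES - 1)


def letter_at(tweet, row, col):
    """Letter at a grid point when the tweet is tiled row by row (an assumed layout)."""
    return tweet[(row * GRID_W + col) % len(tweet)]


def render(flat_depths, tweets):
    """Return a 60 x 80 grid of (plane, letter) pairs, with None where nothing is in range."""
    grid = []
    for r, row in enumerate(downsample(flat_depths)):
        cells = []
        for c, depth_m in enumerate(row):
            plane = plane_index(depth_m)
            cells.append(None if plane is None else (plane, letter_at(tweets[plane], r, c)))
        grid.append(cells)
    return grid


# Usage with placeholder data: a flat "wall" two meters from the sensor.
tweet = "circles ~ spiraling ~ heaven dances earth"
tweets = [tweet] * PLANES            # placeholder; the project used ten different tweets
frame = [2.0] * (FULL_W * FULL_H)
grid = render(frame, tweets)
print(len(grid), len(grid[0]))            # grid dimensions
print(grid[0][0])                         # (plane, letter) at the top-left point
print((GRID_W * GRID_H) // len(tweet))    # full repetitions of the tweet on one plane
```

In this sketch the plane number stands in for the color assigned in step 4, so a body that spans several planes would yield letters drawn from several tweets at once.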

For digital writers and writing theorists, the thought and preparation that went into the surface of this project, a combination of ten dynamically-constructed and overlaid planes in a virtual three-dimensional space, is a dramatic change from the ways in which the single planar surface of a Microsoft Word document is engaged. But this is part of what we’ve recognized as one of the key changes on the not-so-distant horizon of digital writing.

Conclusions / Provocations

For digital writers interested in the future(s) of post-desktop computing, NUIs and other new paradigms of computational engagement challenge us to rethink what writing is and, more importantly, what it does. Boldly stated, the representational ends of the alphabet comprise part of only one trajectory for digital writing’s futures. One reason is that the more radically we engage with the surface of writing, the farther away we move from the alphabetic mandate. Alphabetic writing in its print-based form derives its representational value from the neutrality of the surface on which it is presented. This idea of representation is separate from the act of writing (i.e., inscription) itself, with an emphasis on the physical activity rather than the produced text. In NUI media such as that of the Kinect, the surface (of writing) is inextricably linked to the body, to gestures and embodied movement. Alphabetic writing is not designed for this kind of dynamic substrate, but other types of writing are.

Why does it matter that we attempt here to define writing so broadly or that we stress the nature of gesture (or other physical movement) as a form of writing? In part, it matters because, as the New London Group and others since have suggested, we as writing scholars should attend to the wide variety of ways in which all sorts of writers construct and communicate meaning with one another; using alphabetic characters is only one of many means at our disposal. Since gesture, like speech, is to a large extent temporal (i.e., it disappears or erases itself once it is enacted or uttered), it has generally been overlooked by scholars of writing and composition as a form of meaningful inscription worthy of critical inquiry. However, thanks to digital technologies like the Kinect, we can record, interpret, and remix physical activity once it is translated into points of digital data. Our argument, moreover, is less that we are broadening the definition of writing than that writing has traditionally been defined too restrictively. We are not saying that everything is writing; rather, there is more to writing than the formation of strings of alphabetic characters.

What we suggest, then, is an exploration of the possible ways that we can compose new types, forms, and genres of writing through digital inscription: inscription interpreted meaningfully by human and machine audiences alike, albeit in ways that are, and have been, radically different. However, it is in that space of difference where the most interesting work could occur: How are culturally significant gestures reimagined and reconstituted when algorithmically calculated, modified, and thus expressed? How might new metaphors emerge from physically-demanding natural user interfaces? How might our understanding of writing further change alongside avenues for communication that call attention to gestural behavior as something which, like alphabetic writing, is more and other than a way to represent and transmit spoken language?

Notes

  1. In his essay titled Spinoza and Us, Deleuze argues that when we let go of our overly-organized selves—when we forgo our tendency to define a thing by its form . . . its organs and its functions, [or] as a substance or a subject (127), we may find ourselves engaged in an ethological exploration of time, space, and affect, which would install us along the same modal plane to which Spinoza was devoted; therefore, we will be in the middle or midst of Spinoza. [Return to text.]

  2. In emBody(text) {, the Heraclitean space-time with which we worked was based on ten two-dimensional planes (see the section A Multi-Layered, Hypertextual Mash-up). Anticipating the counter-argument that we’re merely working with more than one two-dimensional surface, i.e., we haven’t moved beyond two-dimensional writing, our rebuttal is that the space on which our project is based is that of the bodies in front of the Kinect. The ten planes were our way of adding some resolution (bod/ies10) to the embodied movement with which we were ultimately engaged. [Return to text.]

  3. If Sarah Arroyo’s argument in Choric Swipe were extended to Man Ray’s space writing, she would probably argue that Man Ray’s light paintings should be retitled topos writing because the flat surface on which he writes does not bring out the choric dimension. We agree, and would argue that our project is an example of how choric writing can be explored in digital media. [Return to text.]

  4. We would like to offer our thanks to Chris Anson and Rachel Bagby for the opportunity to display our project at CCCC 2012. Their support and participation were integral to the success of emBody(dekaaz) {. [Return to text.]

Works Cited

  • Arroyo, Sarah J. The Choric Swipe. YouTube. 18 May 2012. Web. 1 Aug. 2013. <https://www.youtube.com/watch?v=gCjTHU_IzSE>.
  • Bagby, Rachel. Dekaaz. rachelbagby.com. Web. 6 Sept. 2013. <http://rachelbagby.com/dekaaz/>.
  • Baron, Dennis. From Pencils to Pixels: The Stages of Literacy Technologies. Computers in the Composition Classroom: A Critical Sourcebook. Eds. Michelle Sidler, Richard Morris, and Elizabeth Overman Smith. Boston: Bedford/St. Martin’s, 2008. 116-134. Print.
  • Bolter, Jay David. Writing Space: Computers, Hypertext, and the Remediation of Print. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates, 2001. Print.
  • Bulwer, John. Chirologia: or the Natural Language of the Hand, and Chironomia: or the Art of Manual Rhetoric. Ed. James W. Cleary. Carbondale: Southern Illinois UP, 1974. Print.
  • Burke, Kenneth. Language as Symbolic Action: Essays on Life, Literature, and Method. Berkeley: U of California Press, 1968. Print.
  • Cage, John. Manuscript Excerpt: 4'33". 1952. Web. 5 Aug. 2013. <http://exhibitions.nypl.org/johncage/node/218>.
  • Carnegie, Teena A.M. Interface as Exordium: The Rhetoric of Interactivity. Computers and Composition 25.4 (2008): 396-415. Print.
  • De Landa, Manuel. Intensive Science and Virtual Philosophy. New York: Continuum International, 2006. Print.
  • Deleuze, Gilles. Spinoza: Practical Philosophy. Trans. Robert Hurley. San Francisco: City Lights, 2001. Print.
  • Derrida, Jacques. Of Grammatology. Trans. Gayatri Chakravorty Spivak. Baltimore: Johns Hopkins UP, 1997. Print.
  • Flusser, Vilém. Does Writing Have a Future? Trans. Nancy Ann Roth. Minneapolis: U of Minnesota Press, 2011. Print.
  • Flusser, Vilém. The Gesture of Writing. Flusser Studies 8 (2009): 1-14. Web. 12 Sept. 2013.
  • Haas, Christina. Writing Technology: Studies on the Materiality of Literacy. Mahwah, NJ: Lawrence Erlbaum Associates, 1996. Print.
  • Harris, Roy. The Origin of Writing. La Salle, IL: Open Court, 1986. Print.
  • Kotz, Liz. Words to Be Looked At: Language in 1960s Art. Cambridge, MA: MIT Press, 2010. Print.
  • Kress, Gunther. Multimodality: A Social Semiotic Approach to Contemporary Communication. New York: Routledge, 2010. Print.
  • Lanham, Richard. The Electronic Word: Democracy, Technology, and the Arts. Chicago: U of Chicago Press, 1993. Print.
  • Leroi-Gourhan, André. Gesture and Speech. Trans. Anna Bostock Berger. Cambridge: MIT Press, 1993. Print.
  • Mallarmé, Stéphane. Un Coup de Dés Jamais N'Abolira Le Hasard. Bruges: Imprimerie Sainte Catherine. 1914. Web. 4 Sept. 2013. <http://writing.upenn.edu/library/Mallarme-Stephane_Coup_1914_spread.pdf>.
  • Man Ray. Space Writing. 1937. Collection SFMOMA. © Man Ray Trust / Artists Rights Society (ARS), New York / ADAGP, Paris. Web. 21 Aug. 2013. <http://www.sfmoma.org/explore/collection/artwork/12757>.
  • Mathews, Harry. Snowball. The Oulipo Compendium. Eds. Harry Mathews and Alastair Brotchie. London: Atlas, 1998. 226. Print.
  • McCarter, Melissa Miles. Learning About Dekaaz. Open Salon. 24 Mar. 2012. Web. 3 Aug. 2013. <http://open.salon.com/blog/lissahoop/2012/03/23/learning_about_dekaaz>.
  • Microsoft. Gallery | Microsoft Kinect for Windows. microsoft.com. 18 Mar. 2013. Web. 22 Aug. 2013. <https://www.microsoft.com/en-us/kinectforwindows/discover/gallery.aspx>.
  • New London Group, The. A Pedagogy of Multiliteracies: Designing Social Futures. Harvard Educational Review 66.1 (1996): 60-92. Print.
  • Ong, Walter. Orality and Literacy. London: Routledge, 2002. Print.
  • Penven, Audrey. Steen, Made of Dots. Flickr.com. Photograph. 10 Nov. 2010. Web. 4 Sept. 2013. <https://secure.flickr.com/photos/audreypenven/5183650899>.
  • Ricardo, Francisco J. Reading the Discursive Spaces of Text Rain, Transmodally. Literary Art in Digital Performance: Case Studies in New Media Art and Criticism. Ed. Francisco J. Ricardo. New York: Continuum, 2009. 52-68. Print.
  • Rieder, David M. From GUI to NUI: Microsoft’s Kinect and the Politics of the (Body as) Interface. Present Tense: A Journal of Rhetoric in Society 3.1 (2013). Web. 16 Aug. 2013.
  • Selfe, Cynthia and Richard Selfe. The Politics of the Interface: Power and Its Exercise in Electronic Contact Zones. College Composition and Communication 45.4 (1994): 480-504. Print.
  • Shipka, Jody. Toward a Composition Made Whole. Pittsburgh: U of Pittsburgh Press, 2011. Print.
  • Sirc, Geoffrey. English Composition as a Happening. Logan: Utah State UP, 2002. Print.
  • Studdert-Kennedy, Michael. The Phoneme as a Perceptuomotor Structure. Language Perception and Production: Relationships Between Listening, Speaking, Reading, and Writing. Ed. Alan Allport. London: Academic Press, 1987. [67-84]. Print.
  • Trimbur, John and Karen Press. The Page as a Unit of Discourse: Notes Toward a Counterhistory for Writing Studies. Beyond Postprocess. Eds. Sidney I. Dobrin, J.A. Rice, and Michael Vastola. Logan, UT: Utah State UP, 2011. 94-113. Print.
  • Williams, Sean D. Part 1: Thinking Out of the Pro-Verbal Box. Computers and Composition 18.1 (2001): 21-32. Print.
  • Yancey, Kathleen Blake. Handwriting, Literacy, and Technology. On the Blunt Edge: Technology in Composition’s History and Pedagogy. Ed. Shane Borrowman. Anderson, SC: Parlor, 2012. 72-84. Print.
  • Yancey, Kathleen Blake. Made Not Only in Words: Composition in a New Key. College Composition and Communication 56.2 (2004): 297-328. Print.
  • Young, La Monte, ed. An Anthology of Chance Operations. New York: La Monte Young & Jackson Mac Low, 1963. Print.