April 24, 2002: Features


Perry Cook is pushing forward the frontiers of digital audio

By Caroline Moseley

Perry Cook will put sensors and speakers on almost anything, from a didgeridoo (an Australian aboriginal wind instrument carved from a tree branch), left, to a tumbler, maraca, accordion, or coffee mug.

(Photographs: Frank Wojciechowski; type design: Steven Veach; spectrogram: A. K. Peters)

Photo below: In Cook’s lab, Cook, left, plays what he calls the “digitaldoo” while senior Ajay Kapur strokes the electronic Indian tabla he created for his thesis.

Spectrogram, below: Cook’s research shows that striking a prayer bowl at the same location on the rim, but at different angles, produces different sound spectra.

Perry Cook wants us to listen. Just listen, to pay as much attention to what we hear as to what we see, to appreciate creative use of sound as fully as we respond to creative visual effects. In his own explorations of the nature of sound, the associate professor of computer science — who has a joint appointment in the music department — has synthesized some of the more astonishing acoustic phenomena heard on the Princeton campus. If, for instance, you have ever wondered what a trumpet would sound like if it were 30 feet long, he’s your man.

Cook’s research focuses, he says, on “creating algorithms – step-by-step procedures – for sound synthesis. Voice synthesis, instrument synthesis, and sound effects synthesis are all part of one thing – modeling sound production in the computer.” So for the 30-foot trumpet, the first step is to enter into the computer the equations that describe the physics of sound waves inside a regular trumpet. Next is to alter the parameters of the model to make the trumpet a 30-footer. Finally, the question: How to “play” the new virtual trumpet?
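A back-of-envelope sketch gives the flavor of what “altering the parameters” does, though it is not Cook’s actual model: treating the trumpet’s air column as an idealized cylindrical tube closed at one end, the resonant frequencies fall in direct proportion as the tube gets longer. The lengths below are illustrative guesses, not measurements from the article.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature

def tube_resonances(length_m, n_modes=4):
    """Resonant frequencies of an idealized cylindrical tube closed at
    one end (a crude stand-in for a brass instrument's air column):
    f_k = (2k - 1) * c / (4 * L), the odd harmonics only."""
    return [(2 * k - 1) * SPEED_OF_SOUND / (4 * length_m)
            for k in range(1, n_modes + 1)]

normal = tube_resonances(1.4)           # roughly trumpet-length tubing
giant = tube_resonances(30 * 0.3048)    # the hypothetical 30-foot trumpet
```

Stretching the tube from about 1.4 meters to 30 feet drops every resonance by a factor of six or so, which is why the virtual instrument sounds so unearthly.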

For Perry Cook, this is where the fun starts. Forget about simply using a mouse or keyboard. “Synthesis algorithms enhance our palette of sounds,” he says, “because we can model any sound. Then we need more and better controllers to enhance the human performer’s ability to manipulate those sounds.”

In Cook’s lab in the computer science building, there is the usual infrastructure of a recording studio: mikes, recording machines, speakers, a mixer, lots of equipment with lots of dials. Then there are some items you don’t usually see. Wired to his computer are unexpected controlling mechanisms. Look at the pressure-sensitive floor, for example, which has panels of gravel, carpet, tile, and ersatz grass. A footstep elicits the actual sound of a foot on gravel, carpet, tile, or grass – but with Cook in charge those steps can also produce the sound of a maraca, a tambourine, or someone walking on sticks. “It’s the same algorithm,” he says. “It’s a parametric synthesis algorithm, which means it has a few knobs I can turn that will change the sound from being a maraca to being wind chimes. Nothing is changed except the numbers that control what it does.”

So, he says, warming to the demo, and producing the sounds he describes, “Here’s a maraca. I can change it from being a maraca with one bean inside, to a maraca with half a bean, to a maraca with a whole lot of beans. In addition, when beans hit each other in real life, they lose energy; in this system, they don’t have to.”
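The idea behind such a parametric shaker can be sketched in a few lines of Python. This is a toy illustration, not Cook’s code: bean collisions are modeled as random events that excite a ringing filter, and the “knobs” (bean count, resonance frequency, ring decay) are the parameters that morph one sound into another without changing the algorithm.

```python
import math
import random

def shaker(num_beans=64, resonance_hz=3200.0, ring_decay=0.95,
           sr=22050, seconds=0.5, seed=42):
    """Toy parametric shaker: each sample, a collision may occur with
    probability tied to the bean count and the remaining shake energy;
    collisions inject a noise impulse into a two-pole resonator tuned
    to resonance_hz. Different knob settings yield maraca-like or
    chime-like output from the identical algorithm."""
    rng = random.Random(seed)
    n = int(sr * seconds)
    r = ring_decay                       # pole radius: how long hits ring
    theta = 2 * math.pi * resonance_hz / sr
    a1 = -2 * r * math.cos(theta)        # resonator coefficients
    a2 = r * r
    y1 = y2 = 0.0
    energy = 1.0
    p_collision = min(1.0, num_beans / 1024.0)
    out = []
    for _ in range(n):
        energy *= 0.9995                 # the shake loses energy over time
        excitation = 0.0
        if rng.random() < p_collision * energy:
            excitation = rng.uniform(-1.0, 1.0)   # a bean collision
        y = excitation - a1 * y1 - a2 * y2
        y2, y1 = y1, y
        out.append(y)
    return out

maraca = shaker(num_beans=64, resonance_hz=3200.0, ring_decay=0.95)
chimes = shaker(num_beans=4, resonance_hz=1800.0, ring_decay=0.9995)
</imports>```

Turning the bean count down and the ring time way up is exactly the kind of knob-twist Cook describes, including the physically impossible setting where colliding beans never lose energy.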

In stretching the boundaries of his algorithms, Cook says he is creating “augmented reality. Virtual reality means I put helmet and goggles and earphones on you and try to convince you you’re in a world that doesn’t exist. Augmented reality means I take a real-world object – like the floor – put sensors into it, and give you an enhanced version of that real interface.”

He holds up a coffee mug with sensors attached to its exterior; it is sensitive to pressure and to tilt. “This is a very real object,” he says, “You could drink coffee out of it if you had to. But it does more than a coffee mug could ever do” — a claim he demonstrates by “playing” an entire Latin percussion band by manipulating the mug. “I like to look at everyday objects as potential computer interfaces,” he says. “Instead of a mouse, or a keyboard, why couldn’t we use a spoon, a fry pan, a violin, or” — he gives the mug an approving pat — “a coffee mug?”

No surprise that the Princeton course closest to Cook’s heart is one he created, Human-Computer Interface Technology (HCIT), cross-listed with Computer Science and Electrical Engineering, which he has taught every fall since joining the faculty in 1996. “It’s about how people and computers hook up,” he says. “The most common human-computer interface is still the keyboard or the mouse, but we look at the possibility of other devices that will allow people to operate computers, inputting information through voice, handwriting, or gestures, for example.” The course also “examines technologies that assist handicapped persons in using computerized devices, and that allow artists to input and output illustrations, graphics, and music.”

An exciting area of use for sound synthesis, Cook believes, is “enhancing and maybe reinventing traditional computer-human interfaces. Sonic signals could be used to indicate where the cursor is located on the screen,” he suggests, “and auditory feedback might aid sight-impaired users in finding the scrollbar or other tools.”
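A minimal sketch of what such auditory feedback might look like, with entirely hypothetical parameter names and screen dimensions: horizontal position could map to stereo pan and vertical position to pitch, so the cursor’s location would be audible without being seen.

```python
def cursor_to_audio(x, y, width=1280, height=800,
                    f_lo=220.0, f_hi=880.0):
    """Hypothetical sonification of cursor position: x controls stereo
    pan (0.0 = hard left, 1.0 = hard right); y picks a pitch on a
    logarithmic scale, with the top of the screen highest."""
    pan = x / width
    pitch_hz = f_lo * (f_hi / f_lo) ** (1 - y / height)
    return pan, pitch_hz
```

A sight-impaired user sweeping toward the scrollbar would hear the pan drift rightward and the pitch settle, a rough but continuous auditory analogue of watching the pointer move.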

HCIT student Jacob Weiss ’03 found that Cook “encourages students to pursue their own creative ideas, in both music and engineering.” Weiss, a member of the Princeton Juggling Club, worked with Cook to design and build “JuggleMIDI” – an electronic (MIDI) instrument that plays music controlled by Weiss’s juggling. “The device uses a pressure sensor on my palm that detects when a ball is thrown,” explains Weiss, “and it plays a note on every throw. There are also two sensors on my legs that determine which note will be played, based on the position of my legs when I throw the ball. The sensors connect to a microprocessor in my pocket that outputs signals to the synthesizer.”

Translated, that means Weiss can play “Hey, Jude” not just while juggling, but actually by juggling — making him a hit at the Juggling Club’s February show.

Among other courses Cook has taught are Transforming Reality by Computer, which instructs students in “capturing and transforming sound by computer for artistic purposes,” and a freshman seminar called Techno Music I: 100,000 BC—1999, “an attempt to survey the influence of technology on music throughout history.” Cook points out that the use of a stone tool to chip a hole into a bone, creating an early flute, could have been just as dramatic an invention as today’s turntable, guitar amp, or synthesizer.

Cook holds what the computer science department and music department both believe to be the only such joint appointment in the country. Says David P. Dobkin, chair of computer science, “Perry is right at the edge of a lot of disciplines. His research is giving us all a better understanding of acoustics. He is reshaping the way computers imitate musical instruments, by looking at how the instruments make the sounds.” Professor of Music Paul Lansky, a composer of electronic music who has incorporated Cook’s synthesized flute in his compositions, calls Cook “a pioneer in physical modeling, that is, teaching the computer how sound works, actually modeling the physical characteristics of sound.

“There are very few people,” he says, “who combine his level of engineering expertise and musical expertise.”

Cook regularly advises on theses and dissertations in the music department as well as in computer science. Daniel Trueman *99 worked with Cook and Lansky on his dissertation, “Reinventing the Violin,” which Trueman describes as “an exploration into the relationship between the design of the violin and the music it produces.” The last chapter, he says, “focused on an instrument I designed and built with Perry’s assistance, the Bowed Sensor Speaker Array (BoSSA).” Trueman, now assistant professor of music, composer in residence, and director of the Digital Music Studio at Colgate University, credits Cook with being “one of the least didactic and most entertaining teachers I’ve ever had. He has a way of making difficult concepts clear.

“He once helped me understand how vibrato helps a solo violinist stand out from the orchestra: Imagine a bunny in the forest; if the bunny stays still, you probably won’t notice him (non-vibrato), but if he moves (vibrato), he grabs your attention. Variation creates distinction. I had never thought of it that way before.”

Cook, Trueman, and Curtis Bahn *98, now assistant professor of music at Rensselaer Polytechnic Institute, occasionally perform in an ensemble called Interface. All three play sensor-augmented musical instruments. Time Out New York once (enthusiastically) reviewed an Interface concert as “sounding like flames igniting a fuzzy nylon carpet or someone munching a mouthful of needles.”

An undergraduate Cook advisee, computer science major Ajay Kapur ’02, is creating electronic tabla as his senior thesis. “Tabla are hand drums traditionally used to accompany North Indian vocal and instrumental music,” Kapur explains. “They are unique because the drumheads have weights at the center, and make many different sounds, depending on the way they are stroked.” The stroking technique, he says, “follows a tradition passed from generation to generation, from guru (teacher) to shikshak (student) in India. The combination of the weighting of the drumhead, and variety of strokes, gives the drum a complexity that makes it a challenging controller to create, as well as a challenging sound to simulate.”

Kapur has found Cook “really dedicated to his students. I was having trouble hooking up a controller; I walked into the lab and found Professor Cook taking off the panels on the floor, crawling into all the wiring, and recircuiting the computers so the input would work for my experiment. That was well above the call.”

Cook did not set out to become a professor of computer science, and, in fact, has never taken a computer science course. His early ambition was to become “a professional musician – jazz trombonist was one early dream.” A baritone who once was soloist with the California Bach Society and today sings with the Trinity Church Men’s and Boys’ Choir in Princeton, Cook has been singing “as long as I can remember.” In his hometown of Blue Springs, Missouri, “I was a little Baptist boy singing in the choir. I did music and drama all through school.” Thanks to a scholarship, he attended the University of Missouri, Kansas City (UMKC) Conservatory of Music, majoring in voice and trombone.

“The conservatory also had an electronic music studio,” he explains. “At that time there were no computers, but there was a big Moog synthesizer.” Soon, Cook found his attention diverted from vocal training “to playing with the synthesizer and recording. I’d always been a hobby recorder; I’d take my tape machine around, collect sounds, try to make new sounds out of them.

“From the minute I walked into the studio I knew I wanted to do that kind of work forever. I was going to work with sound.” During and after college he worked as an audio consultant, designing and installing sound systems for local enterprises such as the Worlds of Fun/Oceans of Fun theme parks in Kansas City. “I was a roadie,” he says, “except that I wasn’t on the road.”

Soon, Cook realized, “Something was missing. I didn’t have the engineering expertise I needed.” So he returned to UMKC and earned a second bachelor’s degree, this time in electrical engineering. He continued on to Stanford University to earn his doctorate in electrical engineering, working primarily in Stanford’s Center for Computer Research in Music and Acoustics on voice and instrument synthesis. He was acting director of CCRMA before coming to Princeton.

“All I ever wanted to do with engineering was to know more about signals and the physics of sound, so I could work with music,” he says. “Never did I think, ‘Gee, medical imaging is interesting, I want to do brain scans.’ It was always music.”

Since he has been at Princeton, Cook has extended his research from voice and instrument synthesis – “Although there is still a lot to be done there” – to sound effects synthesis. His Real Sound Synthesis for Interactive Applications, to be published by A. K. Peters in June, “looks at the different kinds of sounds we might want to make, and how we might build models of them.”

Cook is “distressed at the disparity between how much computers are used in graphics, for producing movies and games, for example, and how much computers are used for audio – I mean, using the computer not just to store and play back sound, but to create sound. Sound isn’t produced with anything like the sophistication applied to graphic effects.”

Most sound effects, even in animated films, he says, are still created with Foley techniques (named for Jack D. Foley, an early film sound-effects artist, who claimed to have walked 5,000 miles in the studio doing footstep sounds). “A Foley artist might put shoes on her hands or feet, and ‘walk’ in a box of cornstarch to simulate walking in snow or crawling in the sand,” says Cook. “Many other sound effects are added to soundtracks using similar techniques, actually acting out the sounds.”

Cook, however, has developed “an entire system for the analysis and synthesis of walking sounds.” His new book, which comes with a CD, includes “Bill’s GaitLab” walking synthesis controller, which can produce the sound of “walking on gravel, leaves, snow, mud, whatever, by moving sliders on a graphical user interface.”

He is particularly eager to secure a place for computational sound in animation production and computer gaming. Cook hopes that “animation companies like Disney, Dreamworks, and Pixar will eventually sustain a group whose job it is to think about sound. My goal is to subvert their paradigm, and divert them from graphics to audio, or at least convince them they should be doing more audio.

“It’s of scientific and artistic interest to have a model inside the computer that is flexible and allows you to create sounds you could never control in the real world. And I happen to think it would be cheaper.”

In addition to the CD of sound samples, Cook’s new book includes the computer code for generating all the sounds. “It’s all the tools anyone needs to create almost any kind of sound effects, right there,” Cook says. “It’s my big evangelical push.”

In line with this mission, he spent spring break at the Game Developers Convention in San Jose, California, “just to see how they tick. And to try to get them interested in sound.” He is also a regular speaker at the annual meetings of the Association for Computing Machinery’s Special Interest Group on Graphics, where “I’m kind of the token sound person.”

Electrical engineer, concert baritone, computer programmer, audio consultant, winner of the Princeton Engineering Council’s 2001 Distinguished Teaching Award, brass player, and confessed “big-time tinkerer,” Cook is always up for further sonic adventures. Those present at the September 29 dedication of the Friend Center for Engineering Education were treated to a screening of Breakthroughs in Computer Science: How to Build the Friend Center in Four Minutes, directed by David P. Dobkin, with music by Perry R. Cook. A camera in a fourth-floor window overlooking the Friend Center had been set to take a frame every few seconds for 20 months. A marvel of time-lapse photography, the finished film shows the Friend Center, from pre-hole in the ground to splendid new edifice, in four well-edited minutes. The film score by composer Cook incorporates computer drum machines, appropriate musical quotes from Pink Floyd’s “The Wall” – and vocalist Perry Cook singing “That’s What Friends Are For.”


Princeton-based writer Caroline Moseley sings alto in the University Chapel Choir.

Listen to and learn more about Perry Cook’s work at a Web page he designed to accompany this article: http://www.cs.princeton.edu/~prc/PAW02.html
