THE UNIX ORAL HISTORY PROJECT
RELEASE.0, THE BEGINNING
Edited and Transcribed by
Michael S. Mahoney
Mahoney: The origins of Unix® are part of computing lore, and the basic story has been told many times. The Unix Oral History Project is designed to capture it in the words and voices of the people who created the system and who oversaw its development and dissemination within Bell Laboratories. While the participants speak of technical matters, they also reflect on the environment that both fostered Unix and in turn was shaped by it, as it evolved from an operating system to a way of thinking about computing. I'm Michael S. Mahoney, an historian of science at Princeton University, and what follows are excerpts from my conversations with some of the main figures in the early history of Unix. It began in the late spring of 1969 as a file system. Ken Thompson, who designed it with Dennis Ritchie and Rudd Canaday, had been using the Labs's GE 635 to explore ideas about a disk-based file system which would allow several users to work simultaneously without interfering with one another. Thompson tells how this line of inquiry led to an actual system:
Thompson: It was never down to a design to the point of where you put the addresses and how you expand files and --you know-- things like that; it was never down to that level. I think it was just, like, one or two meetings. Dennis and Canaday and myself were just discussing these ideas of the general nature of keeping the files out of each other's hair and the nitty-gritty of expanding, of the real implementation: where you put the block addresses, where you put .... And I remember we did it in Canaday's office, and at the end of this discussion, Canaday picked up the phone, and there was a new service at Bell Laboratories that took dictation. You call up essentially a tape recorder and you give notes, and then the next morning the notes are typed and sent to you. The next day these notes came back, and all the acronyms were butchered, like "inode" was "eyen...". So we got back these ... descriptions and they were copied, and we each had copies of them and they became the working document for the file system -- which was just built in a day or two on the PDP-7.
Mahoney: The PDP-7, the famous graphics machine; when you found that, you had it in mind to just put the file system on there?
Thompson: At first, yes. We'd used it for other things, you know, the famous Space Travel game, and it was the natural candidate as the place to put the file system. When we hacked out this design, this rough design of the file system on the dictation [machine] that day in Canaday's office, I went off and implemented it on the PDP-7.
It didn't exist by itself very long. What we did was -- to run the file system you had to create files and delete files and read and write files and see how well it performed. To do that, you needed a script of what kind of traffic you wanted on the file system, and the script we had was paper tapes that said -- you know-- "read a file", "create a file", "write a file", .... And you'd run the script through the paper tape and it would rattle the disk a little bit, and you wouldn't know what happened. You just couldn't look at it, you couldn't see it, you couldn't do anything. So we built a couple of tools on the file system to ---we used the paper tape to load the file system with these tools, and then we would run the tools out of the file system; that's called exec, by the way, --- and type at these tools, and that was called the shell, by the way-- to drive the file system into the contortions that we wanted to measure; how it worked and reacted. It only lasted by itself for maybe a day or two before we started developing the things we needed to load it.
Mahoney: At what point did you feel that you had something here?
Thompson: Well, the first one was not at all multiprogrammed; it was almost like subroutines on the file system. The read call, the system read call, was in fact the call read of the file system, and it was very synchronous, just subroutine call to the file system for these applications. There was a very quick rewrite that admitted that it was an operating system and that it had a kernel user interface that you trapped across. Uh, I really can't remember what the realization was. I mean, the whole time span from initially starting with --- walking down there with the idea we were going to build a file system ...
Mahoney: When was this? Do you remember?
Thompson: Yeh, it was the summer of '69. In fact, my wife went on vacation to my family's place in California to visit my parents; we just had a new son, born in August '68, and they hadn't seen the kid, and so Bonnie took the kid to visit my family, and she was gone a month in California. I allocated a week each to the operating system, the shell, the editor, and the assembler, to reproduce itself, and during the month she was gone, which was in the summer of '69, it was totally rewritten in a form that looked like an operating system, with tools that were sort of known, you know, assembler, editor, and shell -- if not maintaining itself, right on the verge of maintaining itself, to totally sever the GECOS connection.
Mahoney: So that you could work directly on it.
Thompson: Yeh, and from then on it kept coming up on us.
Mahoney: So you're talking about months.
Thompson: Yeh, essentially one person for a month.
Mahoney: The first versions of Unix were written in assembler, but it had been Thompson's intention from the start that, like Multics, it would eventually be written in a high-level language. Dennis Ritchie, who collaborated on the file system, contributing the concept of treating I/O devices as files, designed that language. Here he talks about its evolution.
Mahoney: You designed C.
Ritchie: Yeh, although not from scratch. It was an adaptation of B; that was pretty much Ken's. B actually started out as system Fortran. Ken one day said the PDP-7 Unix system needed a Fortran compiler in order to be a serious system, and so he actually sat down and started to do the Fortran grammar. This was before yacc; he actually started in TMG. ... Anyway, it took him about a day to realize that he didn't want to do a Fortran compiler at all. So he did this very simple language called B and got it going on the PDP-7. B was actually moved to the PDP-11. A few system programs were written in it, not the operating system itself, but the utilities. It was fairly slow, because it was an interpreter. And there were sort of two realizations about the problems of B. One was that, because the implementation was interpreted, it was always going to be slow. And the second was that, unlike all the machines we had used before, which were word-oriented, we now had a machine that was byte-oriented and that the basic notions that were built into B, which was in turn based on BCPL, were just not really right for this byte-oriented machine. In particular, B and BCPL had notions of pointers, which were names of storage cells, but on a byte-oriented machine in particular and also one in which the -- had 16-bit words and -- I don't think it did originally, but they were clearly intending to have 32-bit and 64-bit floating point numbers. So that there were all these different sizes of objects, and B and BCPL were really only oriented toward a single size of object. From a linguistic point of view that was the biggest limitation of B; not only the fact that all objects were the same size but also that just the whole notion of pointer to object didn't fit well with .... So, more or less simultaneously, I started trying to add types to the language B, and fairly soon afterwards tried to write a compiler for it. Language changes came first.
For a while it was called NB for "New B"; it was also an interpreter, and I actually started with the B compiler. ... because C was written in a language very much like itself, at every stage of the game, so, yes, it must have started with the B compiler and sort of merged it into the C compiler and added the various, the type structure. And then tried to convert that into a compiler.
The basic construction of the compiler -- of the code generator for the compiler-- was based on an idea that I'd heard about; someone at the Labs at Indian Hill. I never actually did find and read the thesis, but I had the ideas in it explained to me, and some of the code generator for NB, which became C, was based on this Ph.D. thesis. It was also the technique used in the language called EPL, which was used for switching systems and ESS machines; it stood for ESS Programming Language. So that the first phase of C was really these two phases in short succession of, first, some language changes from B, really, adding the type structure without too much change in the syntax and doing the compiler.
[The] second phase was slower. It all took place within a very few years, but it was a bit slower, or so it seemed. It stemmed from the first attempt to rewrite Unix in C. Ken started trying it in the summer of probably 1972 and gave up. And it may be because he got tired of it, or what not. But there were sort of two things that went wrong. And one was his problem, in that he couldn't figure out how to run the basic coroutine, multiprogramming primitives --how to switch control from one process to another, the relationship inside the kernel of different processes. The second thing that he couldn't easily handle was, from my point of view, the more important, and that was the difficulty of getting the proper data structure. The original version of C did not have structures. So to make tables of objects --process tables and file tables, and that tables and this tables-- was really fairly painful. One of the techniques that we borrowed from ... and BCPL was to define names who were actually small constants and then use these essentially as subscripts: use pointers -- basically you would use a pointer offset by a small constant that was named; the name was really the equivalent of the name of a field of a structure. It was clumsy; I guess people still do the same sort of thing in Fortran.
The combination of the things caused Ken to give up that summer. Over the year, I added structures and probably made the compiler somewhat better -- better code-- and so over the next summer, that was when we made the concerted effort and actually did redo the whole operating system in C.
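[Editor's sketch, in present-day C, of the contrast Ritchie describes: a table whose "fields" are reached through small named constants used as offsets from a pointer, next to the same table built from a structure. Every name here (P_PID, P_STATE, proc_entry, struct proc, and so on) is hypothetical and chosen only for illustration; none of it is taken from the early Unix sources, and the pre-structure dialect of C differed from what is shown.]

```c
#include <assert.h>

enum { NPROC = 4 };                 /* hypothetical table size */

/* Style 1: no structures. A record is a run of words, and each
   "field" is a small named constant used as an offset. */
enum { P_PID = 0, P_STATE = 1, P_PRIO = 2, P_WORDS = 3 };

static int proc_words[NPROC * P_WORDS];

/* Pointer to the first word of record i. */
int *proc_entry(int i)
{
    assert(i >= 0 && i < NPROC);
    return &proc_words[i * P_WORDS];
}

/* The constant plays the role of a field name. */
void old_set_state(int i, int state) { proc_entry(i)[P_STATE] = state; }
int  old_get_state(int i)            { return proc_entry(i)[P_STATE]; }

/* Style 2: with structures, the compiler tracks the layout. */
struct proc {
    int pid;
    int state;
    int prio;
};

static struct proc proc_table[NPROC];

void new_set_state(int i, int state)
{
    assert(i >= 0 && i < NPROC);
    proc_table[i].state = state;
}

int new_get_state(int i)
{
    assert(i >= 0 && i < NPROC);
    return proc_table[i].state;
}
```

[In the first style nothing stops a caller from using an offset that belongs to a different table, which is one way to read Ritchie's "clumsy"; in the second the field names are checked against the type.]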
Mahoney: By the end of the summer of '73, Unix was an operating system, written in its own high-level language, C. But it had not yet become a philosophy of computing. That came with the program called pipe and the idea of software tools that it solidified. Doug McIlroy talks about the tortuous path to pipes:
Mahoney: But I do want to talk about pipes, because Ritchie says in his retrospective not only that it was at your suggestion but indeed, he suggests, at your insistence ...
McIlroy: It was one of the only places where I very nearly exerted managerial control over Unix, was pushing for those things, yes.
Mahoney: Why pipes? Where did the idea come from?
McIlroy: Why pipes? It goes way back. In the early '60s Conway wrote an article about coroutines --'63 perhaps, in the CACM. I had been doing macros, starting back in '59, '60. And if you think about macros, they mainly involve switching data streams. I mean, you're taking input and you suddenly come to a macro call, and that says, "Stop taking input from here. Go take it from the definition", and in the middle of the definition you'll find another macro call. So macros even as early as '64 --somewhere I talked of a macro processor as a "switchyard for data streams". Also in '64, there's a paper that's hanging on Brian's wall still, [which] he dredged out somewhere, where I talked about screwing together streams like garden hoses. So this idea had been banging around in my head for a long time.
At the same time that Thompson and Ritchie were on their blackboard, sketching out a file system, I was sketching out how to do data processing on this blackboard by connecting together cascades of processes and looking for a kind of prefix notation language for connecting processes together, and failing because it's very easy to say "cat into grep into ...", or "who into cat into grep", and so on; it's very easy to say that, and it was clear from the start that that was something you'd like to say. But there are all these side parameters that these commands have; they don't just have input and output arguments, but they have the options, and syntactically it was not clear how to stick the options into this chain of things written in prefix notation, cat of grep of who [i.e. cat(grep(who ...))]. Syntactic blinders: didn't see how to do it. So I had these very pretty programs written on the blackboard in a language that wasn't strong enough to cope with reality. So we didn't actually do it.
And over a period from 1970 to 1972, I'd from time to time say, "How about making something like this?", and I'd put up another proposal, another proposal, another proposal. And one day I came up with a syntax for the shell that went along with the piping, and Ken said, "I'm going to do it!" He was tired of hearing all this stuff, and that was ---you've read about it several times, I'm sure-- that was absolutely a fabulous day the next day. He said, "I'm going to do it." He didn't do exactly what I had proposed for the pipe system call; he invented a slightly better one that finally got changed once more to what we have today. He did use my clumsy syntax.
He put pipes into Unix, he put this notation [Here McIlroy pointed to the board, where he had written f >g> c] into shell, all in one night. The next morning, we had this -- people came in, and we had -- oh, and he also changed a lot of -- most of the programs up to that time couldn't take standard input, because there wasn't the real need. So they all had file arguments; grep had a file argument, and cat had a file argument, and Thompson saw that that wasn't going to fit with this scheme of things and he went in and changed all those programs in the same night. I don't know how ... And the next morning we had this orgy of one-liners.
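[Editor's sketch of the idea behind "who into wc", written against the present-day POSIX calls pipe, fork, read, write and waitpid. It is not Thompson's original implementation, which McIlroy notes was changed again later. The function name lines_through_pipe is hypothetical: a child process stands in for the producer and writes text into the pipe, and the parent stands in for the consumer and counts the lines that come out.]

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count the lines a child process writes into a pipe.
   Returns the line count, or -1 on error. */
long lines_through_pipe(const char *text)
{
    assert(text != NULL);

    int fd[2];                       /* fd[0]: read end, fd[1]: write end */
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                  /* child: the producer */
        close(fd[0]);
        size_t left = strlen(text);
        while (left > 0) {
            ssize_t n = write(fd[1], text, left);
            if (n <= 0)
                _exit(1);
            text += n;
            left -= (size_t)n;
        }
        close(fd[1]);
        _exit(0);
    }

    /* parent: the consumer */
    close(fd[1]);                    /* lets read() see end-of-file */
    long lines = 0;
    char buf[512];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : lines;
}
```

[Neither side names a file: the producer only writes to a descriptor and the consumer only reads from one, which is why the programs had to learn to take standard input before they would fit the scheme.]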
Mahoney: To me one of the lovely features of pipes is the way it reinforces the notion of toolbox ...
McIlroy: Not only reinforces, but almost created it.
Mahoney: Well, that was my question. Was the notion of toolbox there before pipes ... or did pipes create it?
McIlroy: Pipes created it.
Mahoney: Unix looked different after pipes?
McIlroy: Yes, the philosophy that everybody started putting forth, "This is the Unix philosophy. Write programs that do one thing and do it well. Write programs to work together. Write programs that handle text streams, because that is a universal interface." All of those ideas, which add up to the tool approach, might have been there in some unformed way prior to pipes, but they really came in afterwards.
Mahoney: The idea of tools took written form in Brian W. Kernighan's and P.J. Plauger's Software Tools. But, as Kernighan recalls, it was pipes that first made the idea gel.
Kernighan: I think that beyond that the notion of tools, or languages, or anything like that, did not show up in my consciousness until noticeably further on, probably when Unix was actually running on the PDP-11, which would be '71, '72 -- that kind of time. And even there not really.
Mahoney: Before or after Doug did pipes? Was pipes a trigger for this notion?
Kernighan: I think it -- it probably was the capstone or whatever --I'm not sure what the right image is, but it's the thing that makes it all work, in some sense. It's not that you couldn't do those kinds of things, because I/O redirection predates pipes by a noticeable amount -- not a tremendous amount, but it definitely predates it; I mean, that's an oldish idea. And that's enough to do most of the things that you currently do with pipes; it's just not notationally anywhere near so convenient. I mean, it's sort of loosely analogous to working with Roman numerals instead of Arabic numerals. I mean, it's not that you can't do arithmetic; it's just a bitch. Much more difficult, perhaps, and therefore mentally not -- more constraining. But all of that stuff is now squashed into such a narrow interval that I don't even know when it happened. I remember that the preposterous syntax, the "> >", or whatever syntax that somebody came up with, and then all of a sudden there was the vertical bar and just everything clicked at that point. And that was the time then I could start to make up these really neat examples that would show things like doing -- you know, running who and collecting the output on a file and then word-counting the file to say how many users there were and then saying, "Look how much easier it is with the who into the wordcount", and then running who into grep, and starting to show combinations that were things that were never thought of and yet that were so easy you could just compose them at the keyboard and get 'em right every time. And that's I think when we started to think consciously about tools, because then you could compose the things together, if you had made them so that they actually worked together. And that's when people went back and consciously put into programs the idea that they read from a list of files, but if there were no files they read from the standard input so that they could be used in pipelines.
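[Editor's sketch of the convention Kernighan describes at the end: a program reads from the files it is given, and from the standard input when it is given none, so that it can sit in a pipeline. It is written in present-day C with the standard I/O library; count_lines and count_lines_in are hypothetical names, and line counting is only a stand-in for whatever the tool actually does.]

```c
#include <assert.h>
#include <stdio.h>

/* Count newline characters on an already-open stream. */
long count_lines(FILE *in)
{
    assert(in != NULL);
    long lines = 0;
    int c;
    while ((c = getc(in)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* names[0..n-1] are the file arguments (argv + 1 in a real program).
   With no file arguments, fall back to the standard input.
   Returns the total, or -1 if a named file cannot be opened. */
long count_lines_in(int n, char *const names[])
{
    if (n == 0)
        return count_lines(stdin);

    long total = 0;
    for (int i = 0; i < n; i++) {
        FILE *fp = fopen(names[i], "r");
        if (fp == NULL)
            return -1;
        total += count_lines(fp);
        fclose(fp);
    }
    return total;
}
```

[A main function would pass argc - 1 and argv + 1 to count_lines_in and print the result; the same binary then works both on named files and on the receiving end of a pipe.]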
Mahoney: The tools approach, the Unix philosophy, soon began to reflect generally shared patterns of thought. Joe Condon, whose group originally owned the PDP-7 on which the first Unix system was implemented, later moved to Computing Research and started to use the system itself. He recalls how Robert Morris initiated him to the Unix way of thinking.
Condon: Anyway, Bob Morris, who was in the group -- I would come around and say, "How do you understand what these commands do?", because the manuals are, the manual pages aren't all that clear. And he says, "What do you think is the reasonable thing to do? Try some experiments with it and find out, Joe." And that was very -- that, I think, was a very interesting clue to at least his philosophy and some of the other people's philosophy (I think Dennis' also)-- of how a system command or how a thing should work --is that it should work in a way which is easy to understand. That it shouldn't be a complex function, which is all hidden in a bunch of rules and verbiage and what not, that there's a field of cognitive engineering.
I think that what Bob Morris was telling me --I know that that's somehow the way he felt and I think that's the way Dennis feels too-- is that the black box itself should be simple enough such that when you form the model of what's going on in the black box, that's in fact what is going on in the black box. You shouldn't write a program to try and outwit the person and to try to double guess what they're going to want to do. You should make it such that it's clear what it does.
Mahoney: Brian Kernighan speaks of how that way of thinking shaped the design of tools. Eqn is a package for typesetting mathematics.
Mahoney: What was your first major project on Unix, and when did it start to become part of your research program?
Kernighan: The first substantial thing that I can remember was eqn, which Lorinda [Cherry] and I did. And that, I would guess, was '73 or early '74; the initial development was very, very short. But you could probably date it almost exactly. It was written in C, so there had to be a working C compiler, which was presumably put in in '72 or '73. There had to be a working yacc, because you used the yacc grammar. It was done --in fact, I could find out from this-- it was done -- there was a graduate student [name deleted] who had worked on a system for doing mathematics, but had a very different notion of what it should be. It basically looked like function calls. And so, although it might have worked, he (a) didn't finish it, I think, and (b) the model probably wasn't right. He and Lorinda had worked on it, or she had been guiding him, or something like that, and I looked at it and I thought, "Gee, that seems wrong. There's got to be a better way to say it." And then suddenly I drifted into this notion of "do it the way you say it". And I don't know where that came from, although I can speculate. I had spent a fair length of time --maybe a couple of years-- when I was a graduate student at Recording for the Blind in Princeton. I read stuff like Computing Reviews and scattered textbooks of one sort or another, and so I was used to at least speaking mathematics out loud. Conceivably, that triggered some kind of neurons, I don't know.
Mahoney: The common philosophy integrated the separate efforts of many people, each creating tools to meet his or her own needs and then making the tools available to the group. As Unix grew, the manual became a way of maintaining its integrity and coherence. Dennis Ritchie set the style initially, and then Doug McIlroy took charge of the effort; A.G. "Sandy" Fraser comments on its importance.
Fraser: He [McIlroy] had a lot to do specifically with the manual. Now you may think of that as a clerical job, but don't think of it that way. The fact that there was a manual, that he insisted on a high standard in the manual, meant that he insisted on a high standard of every one of the programs that was documented. Then they say, as they have just done, to produce the next edition of the manual that the work that went into producing the manual involved rewriting all sorts of programs in order that they should meet the same high standard. And then add to all of that, it's probably the first manual that ever had a section with bugs in it. That's a level of honesty you don't find. It wasn't that they simply documented the bugs, they were too lazy to fix them; they fixed a lot of bugs. But some of them weren't so easy to fix, or it was uncertain what one ought to do. So they documented that. I think a level of intellectual honesty that was present in that activity is rare.
Mahoney: McIlroy himself speaks of the manual in esthetic terms, noting that it, like Unix itself, was a luxury perhaps only a research environment could afford.
McIlroy: Cleaning something up so you can talk about it is really quite typical of Unix. Every time another edition of the manual would be made, there would be a flurry of activity and, when you wrote down the uglies, you'd say, 'We can't put this in print', and you'd take features out and put features in in order to make them easier to talk about. It's the virtue of being in a research center. You don't have to keep any old software running.