The Origin of Pipes

     The first edition of Thompson and Ritchie's The Unix Programmer's Manual was dated November 3, 1971; however, the idea of pipes is not mentioned until the Version 3 Unix manual, published in February 1973. Although Unix was functional without pipes, it was this concept and notation for linking several programs together that transformed Unix from a basic file-sharing system to an entirely new way of computing. The ideas that led to pipes had existed in various forms long before the concept was formally implemented. In fact, McIlroy explains that pipes sprang from the earlier use of macros:

[I]n the early sixties, Conway wrote an article about co-routines. Sixty-three, perhaps, in the C[ommunications of the] ACM. I had been doing macros, starting back in ‘59 or ‘60. And if you think about macros, they mainly involve switching data streams. You're taking in your input, you suddenly come to a macro call, and that says, 'Stop taking input from here, go take it from the definition.' In the middle of the definition, you'll find another macro call. So, macros... even as early as ‘64... Somewhere I talked of a macro processor as a switchyard for data streams.

Aho recalls that McIlroy had developed the concept of pipes much further:

Doug McIlroy, though, I think is probably the author of translation...of pipes. That he had written, I think, this unpublished paper when he [was] at Oxford back in the ‘60s....You should read this paper because it's UNIX pipes. One of the interesting things about Doug is that he has had these great, seminal ideas which not everyone knows about. And whether his standards are so high that he doesn't publish them...or what? But it's remarkable....

According to Thompson, the concept of pipes developed as a result of a combination of ideas from the 940 system, CTSS, and Multics.

There were a lot of things that were talked about but weren't really done. Like treating files and devices the same, you know, having the same read calls. Typically during those days there were special calls for the terminal and then the file system itself. Those calls weren't the same. Confusing them and redirecting I/O was just not done in those days. So, that was... I think everyone sort of viewed that as a clean concept and the right thing to do, but for some reason it just wasn't done.

Ritchie is even more willing to acknowledge the contributions of earlier systems to pipes. To him, "the pipeline is merely a specific form of co-routine. Even the implementation was not unprecedented, although we didn't know it at the time; the ‘communication files' of the Dartmouth Time-Sharing System did very nearly what Unix pipes do, though they seem not to have been exploited so fully."1

McIlroy, Thompson, and Pipes

     Although the concept of pipes existed in some form long before 1972, it was McIlroy who advocated the implementation of a pipeline structure in Unix. From the beginning stages of the project, he had been seeking an improved method of dealing with input/output structures. "It was clearly a beautiful mental model, this idea that the output from one process would just feed in as input to another." McIlroy further explains:

So, this idea had been ironed on in my head for a long time....at the same time that Thompson and Ritchie were on their blackboard, sketching out their file system, I was sketching out on how to do data processing on this blackboard, by connecting together cascades of processes and looking for a kind of prefix notation language for connecting processes together...

It was largely as a result of his insistence that pipes were finally implemented.

     According to Ritchie, McIlroy later explained the pipeline idea to the Unix team on a blackboard. However, this did not spark immediate enthusiasm. There were objections to the notations and the one-input, one-output command execution structure.2 Nevertheless, McIlroy succeeded in convincing Thompson to add pipes to Unix. Thompson explains the difficulty in implementing McIlroy's ideas:

Doug had...talked to us continually about it, a notion of interconnecting computers in grids, and arrays, you know very complex, you know, and there were always problems in his proposals....I mean there's just no way to implement his ideas and we kept trying to pare him down and weed him down and get him down, you know, and get something useful and distill it. What was going on, what was needed, what was real ideas, what was the fantasy of his...and we ...there were constant discussions all through this period, and it hit just one night, it just hit, and they went in instantly, I mean they are utterly trivial.

McIlroy recalls the events a bit differently:

Over a period from 1970 till '72, I would, from time to time, say 'How about making something like this?', and I would put up another proposal, another proposal, another proposal. Then one day I came up with a syntax for the shell, that went along with the piping and Ken said, 'I'm gonna do it.' He was tired of hearing all this stuff...and that was certainly what makes it....That was absolutely a fabulous day, the next day too. 'I'm gonna do it.' He didn't do exactly what I had proposed for the pipe system call. He invented a slightly better one, that finally got changed once more to what we have today. He did use my clumsy syntax...

     Originally, pipes used the same syntax as redirection (< and >). However, this proved to be cumbersome, as several different combinations could represent the same command. Just before a presentation in London, Thompson decided to replace McIlroy's syntax with the vertical bar, eliminating the ambiguities of the old syntax. As Kernighan recalls, "I remember the preposterous syntax, that ">>" or whatever syntax, that somebody came up with, and then all of a sudden there was the vertical bar, and just [snaps fingers] everything clicked at that point." The beauty of the structure that McIlroy once described as "garden hoses" was recognized; data would simply flow from one program to another.
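The gain in clarity is easy to see in a sketch using modern shell notation (the early redirection-based pipe syntax itself is not reproduced here, and the intermediate file name is invented for illustration):

    who > tmp        # collect the output of who in an intermediate file
    wc -l < tmp      # then count the lines in that file

    who | wc -l      # the vertical bar: data simply flows left to right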

     In retrospect, the notation and syntax of pipes were just as important as the concept itself; pipes might not have been so successful without this further distinction from redirection. As Aho recalls, the full implications of pipes gradually developed after this:

I really didn't appreciate the significance of what you could do with it, at the time, in the ‘60s. And I don't think anyone did, because...what made a lot of this philosophy...a lot of these tools go was the framework that Unix provided. That you could have pipes on which you could take the output of one program, and transmit it as input to another program.

Kernighan explains why pipes were a superior method of input/output:

It's not that you couldn't do those kind of things, because I had already written redirection; it predates pipes by a noticeable amount. Not a tremendous amount, but it definitely predates it. That's an oldish idea. That's enough to do most of the things that you currently do with pipes; it's just not notationally anywhere near so convenient. I mean, it's sort of loosely analogous to working with Roman numerals instead of Arabic numerals. It's not that you can't do arithmetic, it's just a bitch. Much more difficult, perhaps, and therefore mentally...more constraining.

Pipes went far beyond McIlroy's original goal of creating a new I/O mechanism; by sending the output of one program to the input of another, programmers began composing combinations of tools that no one had anticipated. As Kernighan explains:

That was the time, then, I could start to make up these really neat examples [of pipe commands] that would show things like doing, you know, running who, and collecting the output in a file, and then word counting the file to say how many users there were, and then saying, 'Look how much easier it is with...[piping] the who into the word count, and running who into grep,' and starting to show combinations that were things that were never thought of, and yet they were so easy that you could just compose them at the keyboard and get them right every time. That's, I think, when we started to think, probably consciously, about tools, because then you could compose the things together if you had made them so that they actually worked together.
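Kernighan's examples translate directly into shell one-liners (the login name here is invented for illustration):

    who > users         # run who, collecting the output in a file
    wc -l < users       # word-count the file to say how many users there are

    who | wc -l         # piping the who into the word count
    who | grep dmr      # running who into grep: is one particular user on?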

It was the pipes concept that allowed the notion of the software toolbox to develop. When interviewed by Mahoney, McIlroy insisted that pipes "not only reinforced, [but] almost created" the toolbox.

Software Tools

     The first step in developing what would come to be known as the software toolbox was to ensure that all programs could read from the standard input. McIlroy explains the problem, and its solution:

Most of the programs up until that time couldn't take standard input, because there wasn't the real need. They had file arguments. grep had a file argument, cat had a file argument. Thompson saw that that wasn't going to fit into this scheme of things, and he went in and changed all those programs in the same night. I don't know how. In the next morning we had this orgy of 'one-liners.' Everybody had [a] one-liner. 'Look at this, look at that.'

Kernighan elaborates with an example:

And that's when people went back and consciously put into programs the idea that they read from a list of files, but if there were no files they read from the standard input, so that they could be used in pipelines. People went back and did that consciously in programs, like sort. Sort--an example of a program that cannot work in a pipeline, because all the input has to be read before any output comes out--it doesn't matter, because you're going to use it in a pipeline, right? And you don't care whether it piles up there briefly; it's going to come out the other end. It's that kind of thing, where we say, 'Hey, make them work together. Then they become tools.' Somewhere in there, with the pipes, and maybe somewhere the development of grep--which Ken did, sort of overnight--the quintessential tool, as I guess Doug refers to it. A thing which, in a different environment probably you don't see it that way. But, in the Unix environment you see it as the basic tool, in some sense.
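The convention Kernighan describes survives unchanged in modern Unix systems; a minimal sketch (the file name is invented):

    sort names       # a file argument, as the early programs required
    who | sort       # no file argument, so sort falls back to standard
                     # input and composes in a pipeline, even though it
                     # must read everything before writing anything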

grep was, in fact, one of the first programs that could be classified as a software tool. Thompson designed it at the request of McIlroy, as McIlroy explains:

One afternoon I asked Ken Thompson if he could lift the regular expression recognizer out of the editor and make a one-pass program to do it. He said yes. The next morning I found a note in my mail announcing a program named grep. It worked like a charm. When asked what that funny name meant, Ken said it was obvious. It stood for the editor command that it simulated, g/re/p (global regular expression print)....From that special-purpose beginning, grep soon became a household word. (Something I had to stop myself from writing in the first paragraph above shows how firmly naturalized the idea now is: 'I used ed to grep out words from the dictionary.') More than any other single program, grep focused the viewpoint that Kernighan and Plauger christened and formalized in Software Tools: make programs that do one thing and do it well, with as few preconceptions about input syntax as possible.3
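The etymology can be sketched directly (the file name is invented; g/re/p is the ed command McIlroy quotes):

    ed dictionary              # within the editor:
    g/regular expression/p     #   global/regular expression/print
    q                          #   quit

    # the same operation lifted out as a one-pass program
    grep 'regular expression' dictionary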

     The idea of specialized programs was carried even further with the development of eqn, a mathematical text formatter written by Kernighan and Cherry. Kernighan explains how eqn developed:

[T]here was a graduate student named [name deleted] who had worked on a system for doing mathematics, but had a very different notion of what it should be. It basically looked like function calls. And so, although it might have worked, he a) didn't finish it, I think, and b) the model probably wasn't right. I remember, he and Lorinda had worked on it, or she had been guiding him, or something like that. I looked at [it] and I thought, 'Gee, that seems wrong, there's got to be a better way to say it.' I mean, then suddenly I drifted into this notion of, do it the way you say it. I don't know where that came from, although I can speculate. I had spent a fair length of time, maybe a couple of years, when I was a graduate student, at Recording for the Blind, at Princeton. I read stuff like Computing Reviews and scattered textbooks of one sort or another, so I was used to at least speaking mathematics out loud. Conceivably, that triggered some kind of neurons. I don't know.

eqn was an important software tool because, according to Kernighan, it 'was the first--something that sat on top of, or in front of, a formatter to genuinely broaden what you could do with them.' eqn went a step beyond grep; not only was it a small program that served one function, but it served little purpose without being tied to another program through a pipeline.
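In practice, eqn was run as exactly such a pipeline stage; a sketch of the classic invocation (the manuscript name and the -ms macro package are illustrative):

    # eqn translates the equation descriptions embedded in the manuscript
    # into low-level typesetting commands and passes everything else
    # through untouched; troff does the actual formatting
    eqn paper.ms | troff -ms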

     eqn and grep are illustrative of the Unix toolbox philosophy that McIlroy phrases as, "Write programs that do one thing and do it well. Write programs to work together. Write programs that handle text streams, because that is a universal interface." This philosophy was enshrined in Kernighan and Plauger's 1976 book, Software Tools, and reiterated in the "Foreword" to the issue of The Bell System Technical Journal that also introduced pipes.4 By the time these were published in the late 1970s, software tools were such an integral part of Unix that one could hardly imagine the operating system without them. As Kernighan explains:

People would come in and they'd say, 'Yeah, this is nice, but does the system do X?' for some X, and the standard answer for all of this was, 'No, but it's easy to make it do it.' Unix has, I think for many years, had a reputation as being difficult to learn and incomplete. Difficult to learn means that the set of shared conventions, and things that are assumed about the way it works, and the basic mechanisms, are just different from what they are in other systems. Incomplete means, because it was meant as a program development environment, it doesn't have all the finished products necessarily. But, as a program development environment, it's very easy to build a lot of these things. It's sort of like a kit. And if you want a new thing, you can take the pieces out of the kit and assemble them to make your new thing, rather more rapidly than you would be able to do the same thing in some other kind of environment. So, we used to say that. 'Does it do X?' 'No, but it's real easy. Do you want one by tomorrow? I'll give you one by tomorrow.'
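The pieces of Kernighan's "kit" were assembled with pipes. An illustrative word-frequency pipeline in the spirit of the toolbox (not an example taken from the interviews; the file name is invented) shows each stage doing one small job on a text stream:

    tr -cs 'A-Za-z' '\n' < chapter |   # break the text into one word per line
        tr 'A-Z' 'a-z' |               # fold everything to lower case
        sort |                         # bring identical words together
        uniq -c |                      # count each distinct word
        sort -rn                       # print the most frequent words first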

As the software tools concept solidified, there was an increased interest among Unix programmers in developing a wider variety of specialized tools, and in developing them more quickly.

Little Languages

     The idea of tools was extended to include the idea that people would actually use tools to develop other tools. Thus, the Unix programmers developed highly specialized scripting tools that became known as "little languages." Kernighan explains how the concept of little languages developed for him:

Somewhere, somebody asked me to give a talk. I looked back and realized that there was, in some way, a unifying theme to a lot of the ways that I had been fooling around over the years, which is that I had been building languages to make it easy to attack this, that, or the other problem. In some way, make it easy for somebody to talk to the machine. I started to count them up, and, gee, there were a lot of things there that were languages. Some of them were absolutely conventional things, some of them were pre-processors that sat on other things, some were not much more than collections of subroutines; but, you know, you could sort of call them languages. And they were all characterized by being relatively small, as they were things that were done by one or two people, typically. And they were all not mainstream; I never built a C compiler. They were attacking sort of off-the-wall targets. So, I said, gee, well, they're little languages.

According to the broad definition given by Kernighan, software tools such as eqn, tbl, and make can be considered little languages. Scripting languages are also little languages because they simplify tasks that would otherwise become complex under a full-scale language such as C.

     One such scripting language, awk, was developed by Aho, Peter Weinberger, and Kernighan to be used for "simple one or two-line programs to do some filtering as part of a larger pipeline."5 The modern language perl, a prominent scripting language used on the World Wide Web, is a descendant of awk. Kernighan explains the origins of awk:

We had this thing called qed....It was a programmable editor, but it was programmable in some formal sense. It was just awful, and yet it was the only thing around that let you manipulate text in a program without writing a hell of a lot of awkward code. So I was interested in programmable editors, things that would let you manipulate text with somewhat the same ease that you can manipulate numbers. I think that that was part of my interest in awk. The other thing is--that I remember as a trigger for me--was a very, very specialized tool that a guy named Marc Rochkind developed....He had a program that would let you specify basically a sequence of regular expression and message...and then it would create a program such that, if you pass data through this program, when it's an instance of the regular expression, it would print the message. And we'd use it for data validation. And I thought, what a neat idea! It is a neat idea. It's a really elegant idea. It's a program that creates a program that then goes off and validates data, and you don't have to put all the baggage in. Some program creates the baggage for you. The only problem with it was that it was specialized, this one tiny application. And so my contribution to awk, if you like, is the notion that you can generalize this.
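The generalization Kernighan describes is visible in even the smallest awk programs: each pattern-action pair plays the part of one of Rochkind's regular-expression/message rules, with no generated program in between. A sketch (the validation rules and the file name are invented):

    awk '
        /^$/                        { print NR ": empty record" }
        NF >= 2 && $2 !~ /^[0-9]+$/ { print NR ": second field is not a number" }
    ' datafile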

Weinberger explains that one of the main purposes of awk was to improve the database capabilities of Unix:

So we sat around and talked about this stuff and there's roughly speaking two pieces to databases. One is the question of how you get stuff out of the database. And the other is the question of how you sort of put stuff into the database. And putting stuff into a database gets involved in these ‘are we going to allow for concurrent transactions?' and ‘do we have to do locking?' because Unix was not particularly good...was incapable of in those days. And it was just all too weird. Eventually we settled on the idea of what we wanted was some...tool that would let you get stuff out of ordinary Unix files in a way that was...more general, more useful, more database-like, more report-generator-like.

awk used regular expression matching similar to grep's, but greatly expanded on it: each pattern could carry an action, so a matching line could be transformed on its way through the pipeline rather than simply passed along.
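The difference is one line of shell (the log file and its layout are invented for illustration):

    grep error log                       # grep prints matching lines verbatim
    awk '/error/ { print $1, $NF }' log  # awk matches, then reworks each
                                         # matching line, keeping only its
                                         # first and last fields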

The Attic

     It might have been more difficult for the Unix programmers to develop software tools as quickly as they did had they been working in a different environment. The center of Unix activity was a sixth-floor room at Murray Hill which contained the PDP-11 that ran Unix. "Don't think of a fancy laboratory, but it was a room up in the attic," as Morris describes it. In addition to the programmers, four secretaries from the patent department worked in the attic, performing the text-processing tasks for which Unix was ostensibly developed. Morris describes the environment:

We all worked in the same room. We worked all up in an attic room on the sixth floor, in Murray Hill. In space that maybe was one and a half times the size of this hotel room. We were sitting at adjacent terminals, and adjacent, and we knew each other and we always in fact ate lunch together. Shared a coffeepot. So, it was a very close relationship and most of us were both users and contributors and there was a significant initiative for research contribution at all points.

The unique working conditions of the programmers led to a free exchange of ideas and complete access to information. Moreover, the close-knit environment led to certain standards of etiquette among the programmers. Cherry gives an example:

[T]here was this attitude that he who touched it last owned it. So if you needed pr to do something pr didn't do, and you went and added it, you now owned pr. And so if some other part of it broke, you owned it.

Despite the additional responsibilities that resulted from changing a tool, the programmers did not hesitate to make any improvements they deemed necessary. For example, Morris modified pr, as he explains:

I remember, for example, one piece of software that I made a noticeable change in. I was listening one day in about 1974 to Ken and Dennis arguing about when something happened, and even at that point they couldn't agree to the nearest year it happened. They had a printout in front of them which had the date on it--month and day of the month. And I looked at them, looked at the piece of paper, their argument, and in my best Southern drawl I said, 'Ah shit,' and turn[ed] around [to the] console and actually changed the print program called pr so it would now print out the year.

Morris greatly appreciated how easy the tools were to use and to fix; at times he fixed problems before they were even voiced, as he recalls:

One day early on--let me pick about 1973--I was watching Dennis Ritchie do some arithmetic computations. I don't mean anything fancy. He was just adding up a list of numbers, using dc to do it, and as he was typing them in he made a[n] error of typing, and dc for no particular reason except just the way it was designed--it could have just printed him an error comment, but that's not what it did--it printed him an error comment and wiped out the current sum. So, he had to start from scratch, and [I] again went back to my favorite Southern drawl, went in, made the change to the [source] program of dc, recompiled it and installed it, and it went [in] about ten minutes later. Dennis, who probably hadn't seen [that] I was watching him, said, 'There's a problem with your program and [I] think I ought to fix it.' 'Hey, it's already installed.' And that kind of thing could happen with any person, any software, any time and was the rule rather than the exception.

This example illustrates how the open, cooperative environment in the attic improved the responsiveness and flexibility of the system.

     More importantly, each programmer used all of the tools himself, and thus could correct and enhance them in ways that would not have been possible in a different environment. As Morris explains, no person involved in the project could be a user without being a programmer.

I was a user and I was creating [a] system that in part, in large parts I wanted to use. So, the parts I was creating were in many cases the part I needed for my own work. So, I was both a user and a contributor. But, that was generally true. It was true of everyone.

He specifically discusses his calculator program, dc:

Though I didn't write [dc] for the public, I wrote it for myself and that's true of a lot of software that people are by and large writing software according to their own standards. The way they wanted to. For their own use, and the use of their friends and associates.

Similarly, Kernighan believes that no one could be a programmer in Unix without being a user of the system.

We use our own stuff, and I think that's a critical observation about this group here. We do not build tools for other people. We do not build anything for other people. I think it's not possible to build things for other people, roughly speaking.

He continues:

If I build something for you, even if you spend a lot of time describing to me what you want, and why it's the way it is, it's not going to be as successful as something where I personally face the problems. Now, I may live with you long enough that I start to understand what your problems are, and then I'll probably do a better job, but I think that we have historically done the best on building things that address problems that we face ourselves. That we understand them so well because we face them, either directly--you know, I face that problem myself--or it's the person in the next office.

This environment fostered not only the toolbox idea, but an entire philosophy of programming.