Software as Science - Science as Software

Michael S. Mahoney
Princeton University

Preprint version of article published in Ulf Hashagen, Reinhard Keil-Slawik, and Arthur Norberg (eds.) History of Computing: Software Issues (Berlin: Springer Verlag, 2002)

©2000 Michael S. Mahoney


  I don't really understand the title, Computer Science. I guess I don't understand science very well; I'm an engineer. ... Computers are worth thinking about and talking about and doing about only because they are useful devices, which do something for somebody. If you are just interested in contemplating the abstract, I would strongly recommend the belly button, which would survive any war that man survives.
John R. Pierce[1]

1. Keynote Address, Conference on Academic and Related Research Programs in Computing Science, 5-8 June 1967; publ. in University Education in Computing Science, ed. Aaron Finerman (New York: Academic Press, 1968), 7. Renowned for his work in information theory, Pierce at the time was Executive Director of Research, Communications Sciences Division, Bell Telephone Laboratories.

Defining the subject historically

Software should be of great interest to historians of science. That may seem strange, given that it is of such recent origin. Software is no older than the modern electronic computer and the activity of writing programs for it. It is still experiencing growing pains. Yet, over the past fifty years, it has become the subject of its own thriving science and a ubiquitous medium for pursuing other sciences. In both instances software represents a new kind of science. It is what Herbert Simon calls a "science of the artificial".[2]
2. Herbert Simon, The Sciences of the Artificial (Cambridge, MA: MIT Press, 1969; 2nd ed. 1981, 3rd ed. 1996).
There is nothing natural about software or any science of software. Programs exist only because we write them, we write them only because we have built computers on which to run them, and the programs we write ultimately reflect the structures of those computers. Computers are artifacts, programs are artifacts, and models of the world created by programs are artifacts. Hence, any science about any of these must be a science of a world of our own making rather than of a world presented to us by nature.[3] What makes it both challenging and intriguing is that those two worlds meet in the physical computer, which enacts a program in the world. Their encounter has posed new and difficult epistemological questions concerning what we can know both about the workings of the models and about the relation of the models to the phenomena they purport to represent or simulate. Answers to those questions would seem to depend, at least in part, on understanding programs as dynamic systems.

3. Let me leave aside for the moment questions about how much "nature" ever presents itself to us directly.

Because software as science is both new and artificial, it brings to the fore questions of when and how and who. It took some time before programs and programming became subjects of inquiry in themselves. Once they did, it was not clear what one wanted to know about the programs or the activity of writing them, or indeed could know about these subjects. Debate seems to have been particularly lively in the late 1960s. Most practitioners viewed the subject as inherently mathematical. Yet, Marvin Minsky decried excessive formalism, pointing to the "defeatism" of theorems about the limits of computability and to the inadequacy of formal systems to provide an explanatory account of what computers could actually do.[4] Allen Newell and Herbert Simon insisted even more strongly on treating computer science as an empirical discipline. It is the study of the computer as a dynamic physical device and what programmers are capable, intentionally or not, of making it do.[5] Donald Knuth insisted on the craft nature of programming, characterizing it as an "art". In what began as a "light-hearted attempt to stir up some controversy regarding the nature of computer science", Peter Wegner tried in "Three Computer Cultures" to differentiate the concerns of the computer scientist from those of both the mathematician and the engineer.[6] While George Forsythe offered counsel on "What to Do until the Computer Scientist Comes", John Pierce ended the keynote address cited above with the hope that "computer scientists, whatever they are, get organized effectively, and I wish them good luck."[7]

4. Marvin Minsky, "Form and Content in Computer Science" (1969 Turing Award), ACM Turing Award Lectures: The First Twenty Years, 1966-1985 (New York: ACM Press, 1987), 219-242.

5. Allen Newell and Herbert Simon, "Computer Science as Empirical Inquiry: Symbols and Search" (1975 Turing Award), ACM Turing Award Lectures, 287-313. Newell and Simon had earlier joined with Alan Perlis in taking a similar position in "What is Computer Science?", a Letter to the Editor of Science 157(22 Sept. 67), 1373-4.

6. Peter Wegner, "Three Computer Cultures: Computer Technology, Computer Mathematics, and Computer Science", Advances in Computers 10(1970), 7-78.

7. Forsythe, in American Mathematical Monthly 75,5(1968), 454-62; Pierce, in Finerman, 24.

So the history with which we are concerned begins with the question of who created the science(s) of software, when, where, why and how? That is, who thought it necessary or desirable to place programs and programming on some sort of scientific foundation? What was such a foundation meant to accomplish? To what questions would it provide answers? What theoretical and practical benefits did it promise? What sort of science did its creators envision? That is, what established scientific disciplines did they take as models and resources for their new enterprise? How did these aspirations shape the science(s) that emerged, and how did the development of the science(s) reshape the aspirations? As the science(s) developed, how did it (or they) interact with other sciences, especially those that looked to the computer as a tool and then as a medium of investigation? What did those sciences contribute to the science of software and what in turn did they take from it? What about other disciplines not generally considered scientific?

These questions clearly intersect with those of the other areas on our program. The mathematical verification of programs as a warrant of reliability lies at the root of formal semantics. Moreover, if one views engineering as applied science, then one faces the question of what science it is that software engineering applies. Conversely, one may ask what role the theory that has emerged has played in the practice of programming, especially programming in the large, and how that role fits with the status accorded to theory (and to those who pursue it) in the profession at large. The answers to those questions impinge in evident ways on the nature and organization of software as a labor process.

The science mainly in question here is mathematics, the relation of which to computing has evolved dynamically over the past half-century. As a physical device, the modern computer was built for mathematicians to carry out numerical calculations, especially for problems which could not be solved analytically. As a theoretical concept, the modern computer was designed by mathematical logicians to understand the nature of computability and the limits of what can be known by it. As a dynamic computational system, the modern computer has posed new mathematical problems and opened new fields of mathematical research. At the same time, the computer has proved elusive, as central concerns of programming remain beyond the effective reach of mathematics and thus again raise the question of what software as science has to do with software as engineering or as reliable artifact.

No differently from any other science, software as science involves more than a body of knowledge and practice. It means communities of practitioners recognized as possessing that knowledge and charged with extending and disseminating it. The science in question is what they know and do in common. Taking this approach allows for the science(s) of programs and programming to take different forms among different groups of practitioners. That is, it allows for different answers at different times in different places to the question "what is software and what may be said scientifically about it?", or indeed, "what needs to be said scientifically at all about software?" To the extent that a consensus has emerged, it requires an explanation, and historians of science have found that the explanation is likely to be as much social as intellectual.

Over the past fifty years, computer scientists have grown from a handful of people to an extensive network of practitioners in industry, academia, and private practice. They occupy positions of prominence in colleges and universities; indeed, together with molecular biology (with which they have intellectual ties), they constitute the fastest growing sector of academia. Generously funded by industry and government, they have professional associations (ACM, IEEE Computer Society, BCS, etc.), journals, monographs, textbooks, and an elaborate reward structure.[8] Much of this growth rests on a claim to be pursuing a scientific enterprise, even as practitioners have debated among themselves just how scientific it is or should be. How practitioners achieved recognition of that claim is an integral part of the history of software as science.

Agendas

8. The Turing Award is considered the ACM's highest honor. "It is given to an individual selected for contributions of a technical nature made to the computing community. The contributions should be of lasting and major technical importance to the computer field." (http://www.acm.org/awards/taward.html). A look at the list shows that "technical" has usually (but not always) been construed as "theoretical", indeed "mathematical".

Elsewhere I have suggested that the practice of a discipline can be fruitfully approached through the notion of "agenda".[9] At the risk of repeating myself, it might be worth recapitulating what I mean by that. The agenda of a field consists of what its practitioners agree ought to be done, a consensus concerning the problems of the field, their order of importance or priority, the means of solving them, and perhaps most importantly, what constitutes a solution. Becoming a recognized practitioner means learning the agenda and then helping to carry it out. Knowing what questions to ask is the mark of a full-fledged practitioner, as is the capacity to distinguish between trivial and profound problems; "profound" means moving the agenda forward. One acquires standing in the field by solving the problems with high priority, and especially by doing so in a way that extends or reshapes the agenda, or by posing profound problems. The standing of the field may be measured by its capacity to set its own agenda. New disciplines emerge by acquiring that autonomy. Conflicts within a discipline often come down to disagreements over the agenda: what are the really important problems?

9. Michael S. Mahoney, "Computer Science: The Search for a Mathematical Theory", in John Krige and Dominique Pestre (eds.), Science in the 20th Century (Amsterdam: Harwood Academic Publishers, 1997), Chap. 31.

As the shared Latin root indicates, agendas are about action: what is to be done? By emphasizing action, the notion of agendas refocuses attention from a body of knowledge to a complex of practices. Since what practitioners do is all but indistinguishable from the way they go about doing it, it follows that the tools and techniques of a field embody its agenda. When those tools are employed outside the field, either by a practitioner or by an outsider borrowing them, they bring the agenda of the field with them. Using those tools to address another agenda means reshaping the latter to fit the tools, even if it may also lead to a redesign of the tools, with resulting feedback when the tool is brought home. What gets reshaped and to what extent depends on the relative strengths of the agendas of borrower and borrowed.

Theoretical Computer Science

That tools embody agendas has particular importance for new sciences. For, a new science means a new agenda, and tracing the emergence of a new science means showing how a group of practitioners coalesced around a common agenda different from other agendas in which they had been engaged. What questions or problems drew them to the computer? What tools did they bring with them and how did they apply those tools? How did their involvement shape the emerging agenda of the new field?

That brings me to what is generally considered the scientific basis of software, namely, theoretical computer science.[10] It took shape between 1955 and 1975 as practitioners from a variety of fields converged on a small set of related agendas that came to constitute the core of the field: automata and formal languages, computational complexity, and formal semantics. None of those agendas had existed before 1955. By the early 1970s their status as constituents of an autonomous discipline was marked by a main heading in Mathematical Reviews, by a growing number of dedicated textbooks, and by the establishment of curricula at both the undergraduate and graduate levels. Perhaps even more strikingly, by the mid-70s theoretical computer science had begun to gain recognition as a field of mathematics in its own right and to serve as a resource for other sciences, most notably theoretical biology.

10. It is curious that to this day the community distinguishes between computer science and theoretical computer science, as if the former involves some kind of science other than theoretical science. It is not clear what that other kind of science might be nor what is scientific about it.

My charge is not to provide a history of that development but to suggest what such a history might look like and how it might be most productively pursued, in short, to offer an agenda for history of software viewed as science. So let me restrict my account to the following diagrams (Figs. 1 and 2), which encapsulate my own efforts to trace the emergence of the agendas of theoretical computer science from the intersection and interaction of a variety of agendas in fields ranging from electrical engineering to linguistics.[11] The schemes suggest a number of lines of fruitful inquiry.

11. Michael S. Mahoney, "Computer Science"; see also "The Structures of Computation", in Raul Rojas and Ulf Hashagen (eds.), The First Computers – Histories and Architectures (Cambridge, MA: MIT Press, 2000).

To begin with, it seems clear that theoretical computer science can be viewed from a number of disciplinary perspectives. Indeed, its formation can be understood only from those perspectives. Computing had no science of its own at the start. Mathematical logic had established what computers could not do, even with endless resources of time and space. Switching theory showed how to analyze and synthesize circuits for basic operations. But no science accounted for what finite machines with finite, random access memories could do or how they did it. That science had to be created, and its creation depended heavily on what was going on in other fields at the time, most notably linguistics. Before the science of computing began to accumulate a history of its own, it was heir to several different histories. Understanding its subsequent development may well involve keeping those histories in mind and looking for their continuing influence.

At each point of convergence the nascent field acquired a set of tools from an antecedent discipline. One may ask what those tools were originally developed to accomplish, how their application to computing contributed to shaping the new subject, to what extent the application in turn reshaped the tools or redefined their status in the parent discipline. An example is the new mathematical interest acquired by finite Boolean algebras as a result of their application to questions of the minimization and optimization of sequential circuits.[12]

12. For this and other examples of feedback from computer science to mathematics, see Garrett Birkhoff, "The Role of Modern Algebra in Computing", Computers in Algebra and Number Theory (American Mathematical Society, 1971), 1-47. For a discussion of the changes in the mathematics curriculum prompted by computer science, see Anthony Ralston, "Computer Science, Mathematics, and the Undergraduate Curriculum in Both", American Mathematical Monthly 88,7(1981), 472-85.

Indeed, one of the uncanny aspects of the development of theoretical computer science has been the way it has given practical meaning to the most abstract mathematical structures: semigroups, lattices, categories. None of these was created with computers in mind, and in each case it is not hard to find statements by mathematicians of the time insisting on their uselessness even to mathematics. Each is fundamental to modern computer science, which has arguably created the notion of "applied algebra", even to the point that one recent book offers "category theory for the working computer scientist."[13] Though not originally a mathematical construct, the lambda calculus has similarly moved from theoretical structure to practical tool (especially once Scott provided a mathematical model in continuous lattices) and indeed recently has begun to move out from computing per se into the area of theoretical biology.[14]

13. For example, Garrett Birkhoff and Thomas C. Bartee, Modern Applied Algebra (New York: McGraw-Hill Book Company, 1970); Rudolf Lidl and Günter Pilz, Applied Abstract Algebra (New York: Springer Verlag, 1984); Andrea Asperti and Giuseppe Longo, Categories, Types, and Structures: An Introduction to Category Theory for the Working Computer Scientist (Cambridge, MA: MIT Press, 1991).

14. See W. Fontana and Leo W. Buss, "The barrier of objects: From dynamical systems to bounded organizations", in J. Casti and A. Karlqvist (eds.), Boundaries and Barriers (Reading, MA: Addison-Wesley, 1996), 56-116.

The diagrams indicate, however sketchily, that various parts of the agenda took shape initially in different places. For example, the identification of formal power series, pushdown automata, and context-free languages brought together at MIT agendas ranging from the algebraic coding theory of Marcel Schützenberger in Paris to the work on sequential formula translation of Fritz Bauer and Klaus Samelson in Munich, which in turn drew on Heinz Rutishauser's early efforts at automatic programming. To take another example, the notion of using the lambda calculus as the basis for formal semantics seems clearly to have originated with John McCarthy, who needed it as a means of abstracting functions for his work on mechanical theorem-proving and on commonsense reasoning by computers. Yet, it seems equally clear from activities surrounding the Algol meetings that others besides McCarthy were familiar with the lambda calculus and were exploring its use as a vehicle for defining the semantics of the new language. Indeed, the lambda calculus and formal semantics quite quickly crossed the Atlantic in the early '60s, settling in primarily with Peter Landin and Christopher Strachey at Cambridge but then spreading to Vienna and Amsterdam, where Dana Scott's seminal collaboration with Jaco de Bakker took place. While McCarthy's work spoke to an agenda already underway elsewhere, transcripts of a Working Conference on Mechanical Language Structures held in Princeton in 1963 suggest that it received a cooler reception closer to home, where more pragmatic concerns dominated.[15]

These are just two examples, I suspect, of how agendas at first reflect local interests and ways of doing things. Different groups of people constitute different mixes of scientific training, taste, and aspirations, reflecting in many cases their differing cultural and institutional backgrounds.[16] In addition to questions of how a local group coalesces around a common project, there is the larger issue of how that project then moves onto the agenda of the profession as a whole.[17] Of great interest among historians of science over the past decade has been a question of how practices travel. Studies have revealed the particular importance of individuals moving from one place to another, learning and conveying by collaboration and example results and techniques that have not yet reached print.

15. The proceedings of the Conference were published in CACM 7,2(1964), 51-136; see in particular the "Summary Remarks" by Saul Gorn and the "General Discussion" that followed, pp. 133-6.

16. Such differences shine through the protocols of the Software Engineering Conferences at Garmisch and Rome. People differed about what it would mean to make the subject scientific, about the extent to which one can do so, about the importance of trying to make it so, about the means for achieving that goal. They had different agendas.

17. Not all agendas have converged on the current configuration. For example, Ershov and other Russian computer scientists took their own approach to a science of software but did so in relative isolation from research in the West. To the historian, this independent line of development offers an opportunity for comparisons and contrasts, and holds out the possibility of linking agendas to the political, social, and economic context within which they take shape. See, for example, Andrei P. Ershov, Origins of Programming: Discourses on Methodology (New York: Springer Verlag, 1990); Ershov and M.R. Shura-Bura, "The early development of programming in the USSR", in N. Metropolis et al. (eds.) A History of Computing in the Twentieth Century (New York: Academic Press, 1978), 137-96; R.A. Di Paola, "A Survey of Soviet Work in the Theory of Computer Programming" (Rand Memorandum RM-5424-PR; Santa Monica: Rand Corporation, 1967).

Research and Training

One measure of the importance of an agenda is the resources allocated to it by the community of practitioners, usually acting as agents for the government or industry. Norberg's and O'Neill's study of DARPA's IPTO and the as yet unpublished study of the National Science Foundation's Office of Computing Activity by Aspray, Williams, and Goldstein offer glimpses into the interactive process by which government agencies and the research community shape the agenda of the discipline.[18] As in so many other instances, Dick Hamming offers historians a valuable perspective on what was at stake. In his Turing Lecture of 1968, at a time when the nature of the field seemed uncertain, he warned his audience:
In the face of this difficulty [of defining "computer science"] many people, including myself at times, feel that we should ignore the discussion and get on with doing it. But as George Forsythe points out so well[19] in a recent article, it does matter what people in Washington D.C. think computer science is. According to him, they tend to feel that it is a part of applied mathematics and therefore turn to the mathematicians for advice in the granting of funds. And it is not greatly different elsewhere; in both industry and the universities you can often still see traces of where computing first started, whether in electrical engineering, physics, mathematics, or even business. Evidently the picture which people have of a subject can significantly affect its subsequent development. Therefore, although we cannot hope to settle the question definitively, we need frequently to examine and to air our views on what our subject is and should become.[20]
Research funding is a matter of more than just money. Until a field gains autonomy over its own agenda, its development depends on what other disciplines think its practitioners should be doing.
18. Arthur L. Norberg and Judy E. O'Neill, Transforming Computer Technology: Information Processing for the Pentagon, 1962-1986 (Baltimore: Johns Hopkins University Press, 1996). William Aspray, Bernard O. Williams, and Andrew H. Goldstein, Computing as Servant and Science: The Impact of the National Science Foundation, unpub. draft, 1992; cf. Aspray and Williams, "Arming American scientists: NSF and the provision of scientific computing facilities for universities, 1950-1973", Annals of the History of Computing 16,4(1994), 60-74.

19. [Hamming's note] Forsythe, G.E. What to do until the computer scientist comes. Am. Math. Monthly 75,5(May 1968), 454-461.

20. Richard W. Hamming, "One Man’s View of Computer Science", ACM Turing Award Lectures, 207-18; at 208.

Proposals and requests for proposals (RFPs) provide valuable insight into the articulation of agendas. They aim at enlisting support and hence must tie the proposers' aims to those of their reviewers and their reviewers' institutions. Moreover, they capture the proposers' thinking before the work has been carried out and thus offer a chance of comparing shifts in direction as questions are answered, sometimes in unanticipated ways. It is revealing, for example, to see how Minsky, McCarthy, Shannon, and Rochester viewed the agenda they called "artificial intelligence" in proposing their famous summer workshop in 1956.[21] McCarthy has subsequently insisted on the value of such documents in establishing the aims and methods of scientific research.

21. Available at http://www-formal.stanford.edu/jmc/history/dartmouth.html.

Viewing a science in terms of its evolving agenda means, among other things, seeing how the agenda is communicated to the next generation of practitioners. In explaining what I mean by "agenda", I said that one becomes a practitioner by learning what is to be done, i.e. by learning what the questions or problems of the field are, how they are tackled and resolved, and what constitutes a solution. Kuhn's notion of "paradigm" fits well here, especially as he subsequently clarified it through the concept of a "disciplinary matrix".[22] Science is conveyed by exemplars, by models of problem-solving. Students start with what is best established and most familiar to practitioners, and then move from there onto rougher terrain until they come to the edges of known territory. In the sciences in particular, that does not mean that students must recapitulate the entire history of the discipline. On the contrary, what makes certain solutions paradigmatic is precisely the way in which they encompass and give structure to a range of problems, transforming their hard-won solutions into corollaries.

22. Thomas S. Kuhn, The Structure of Scientific Revolutions (Chicago: University of Chicago Press, 1962; 2nd ed. 1970).

That is what makes textbooks and curricula an important resource for tracing the emergence and development of a discipline. They reflect its agenda not at the frontiers of research but at the starting point for reaching those frontiers. They are statements about what current practitioners at a particular time think students must know to become the next generation of practitioners. Hammering out a curriculum can be a harrowing experience for participants precisely because it means reaching agreement on what the subject is about: what is central and what peripheral, what must everyone know and what can be an elective, in what order are these things to be learned? To the historian, the process of hammering out is as important as, or perhaps even more important than, the end result. We have the published versions of a succession of ACM curricula in computer science and responses to them.[23] I hope we also have the minutes of the meetings of the committees that wrote them, not to say copies of the exchanges that went on between meetings.

23. For example, ACM Curriculum Committee on Computer Science, "An Undergraduate Program in Computer Science – Preliminary Recommendations", CACM 8(1965), 543-8; "Curriculum 68 – Recommendations for Academic Programs in Computer Science", CACM 11(1968), 151-97; "Curriculum 78 – Recommendations for the Undergraduate Program in Computer Science", CACM 22(1979), 147-66; A. Ralston and M. Shaw, "Curriculum 78 – Is Computer Science Really That Unmathematical?", CACM 23(1980), 67-70.

What the profession as a whole was trying to accomplish was happening in colleges and universities, as computer scientists sought to define a place for themselves in their institutions and to justify their recognition as distinct academic units on a par with those already established. The volume on University Education in Computing Science edited by Aaron Finerman in 1968 provides a good survey of the range of thinking on the matter at that crucial time. The local strategies of practitioners, in particular the alliances they forged with other disciplines, should also prove revealing. Anniversaries and retirements of founders have provided largely celebratory accounts for departments at Purdue, Cornell, MIT, and elsewhere, but no one has yet undertaken a critical, documented analysis of how the new science of computing established itself at a university.[24]

24. For Purdue, see John Rice and Richard A. DeMillo (eds.), Studies in Computer Science in Honor of Samuel D. Conte (New York and London: Plenum Press, 1994); for Cornell, David Gries, "Twenty Years of Computer Science at Cornell", Engineering: Cornell Quarterly 20,2(1985), 2-11.

Mathematics and Software Engineering

In addition to being new and artificial, software as science should hold interest for historians of science in the perspective it affords on the question of the relation of theory to practice. In Science, the Endless Frontier, the document that determined American post-war science policy, Vannevar Bush took as axiomatic the proposition that technology emerges from basic science.[25] It was a widely held view, which we see reflected in John McCarthy's vision of a mathematical theory of computation, expressed at IFIP 1962:

In a mathematical science, it is possible to deduce from the basic assumptions, the important properties of the entities treated by the science. Thus, from Newton's law of gravitation and his laws of motion, one can deduce that the planetary orbits obey Kepler's laws.[26]
As McCarthy and his audience well knew, one can also deduce the laws of the motion of terrestrial bodies and all the mechanics that derives from them. He extended the analogy at the conclusion of his 1963 article, "A Basis for a Mathematical Theory of Computation", by reference to later successes in mathematical physics:
It is reasonable to hope that the relationship between computation and mathematical logic will be as fruitful in the next century as that between analysis and physics in the last. The development of this relationship demands a concern for both applications and mathematical elegance.[27]
The applications of mathematics to physics had produced more than new theories. The mathematical theories of thermodynamics and electricity and magnetism had informed the development of heat engines, of dynamos and motors, of telegraphy and radio. Those theories formed the scientific basis of engineering in those fields. McCarthy expected that the theory of computation would do the same for programming, to the point that "no one would pay money for a computer program until it had been proved to meet its specifications."[28]

In light of that original program, it is all the more striking to hear the lament of Christopher Strachey shortly before he and Dana Scott realized McCarthy's theoretical goal. In a discussion on the last day of the second NATO Conference on Software Engineering held in Rome in October 1969, Strachey observed that "one of the difficulties about computing science at the moment is that it can't demonstrate any of the things that it has in mind; it can't demonstrate to the software engineering people on a sufficiently large scale that what it is doing is of interest or importance to them."[29] Ten years later, the Computer Science and Engineering Research Study (COSERS) took stock of the field and its current directions of research and published the results under the title What Can Be Automated?. The committee on theoretical computer science argued forcefully that a process of abstraction was necessary to understand the complex systems constructed on computers and that the abstraction "must rest on a mathematical basis".[30] Defining theoretical computer science as "the field concerned with fundamental mathematical questions about computers, programs, algorithms, and information processing systems in general", the committee acknowledged that those questions tended to follow developments in technology and its application, and hence to aim at a moving target -- strange behavior for mathematical objects.

25. Vannevar Bush, Science, The Endless Frontier: A Report to the President on a Program for Postwar Scientific Research (Washington D.C., 1945; repr. National Science Foundation, 1960, 1990). For a thoughtful critique of Bush’s basic premiss, see Donald E. Stokes, Pasteur’s Quadrant: Basic Science and Technological Innovation (Washington, DC: Brookings Institution Press, 1997).

26. "Towards a mathematical science of computation", Proc. IFIP Congress 62 (Amsterdam: North-Holland, 1963), 21-28; at 21.

27. "A basis for a mathematical theory of computation", in Computer Programming and Formal Systems, ed. P. Braffort and D. Hirschberg (Amsterdam: North-Holland Publishing Company), 33-69; at 69.

28. Interview with M.S. Mahoney, 3 December 1990.

29. Peter Naur, Brian Randell, and J.N. Buxton (eds.), Software Engineering: Concepts and Techniques. Proceedings of the NATO Conferences (NY: Petrocelli, 1976), 147.

30. They offered three main reasons: "(1) Computers and programs are inherently mathematical objects. They manipulate formal symbols, and their input-output behavior can be described by mathematical functions. The notations we use to represent them strongly resemble the formal notations which are used throughout mathematics and systematically studied in mathematical logic. (2) Programs often accept arbitrarily large amounts of input data; hence, they have a potentially unbounded number of possible inputs. Thus a program embraces, in finite terms, an infinite number of possible computations; and mathematics provides powerful tools for reasoning about infinite numbers of cases. (3) Solving complex information-processing problems requires mathematical analysis. While some of this analysis is highly problem-dependent and belongs to specific application areas, some constructions and proof methods are broadly applicable, and thus become the subject of theoretical computer science." What Can Be Automated?, ed. Bruce W. Arden (Cambridge, MA: MIT Press, 1980), 139. The committee consisted of Richard M. Karp (Chair; Berkeley), Zohar Manna (Stanford), Albert R. Meyer (MIT), John C. Reynolds (Syracuse), Robert W. Ritchie (Washington), Jeffrey D. Ullman (Stanford), and Shmuel Winograd (IBM Research).

Nonetheless, the committee could identify several broad issues of continuing concern, which were being addressed in the areas of computational complexity, data structures and search algorithms, language and automata theory, the logic of computer programming, and mathematical semantics. In each of these areas, it could point to substantial achievements in bringing some form of mathematics to bear on the central questions of computing. Yet, in the summaries at the end of each section, it repeatedly echoed Christopher Strachey's lament. For all the depth of results in computational complexity, "the complexity of most computational tasks we are familiar with -- such as sorting, multiplying integers or matrices, or finding shortest paths -- is still unknown."(215) Despite the close ties between mathematics and language theory, "by and large, the more mathematical aspects of language theory have not been applied in practice. Their greatest potential service is probably pedagogic, in codifying and giving clear economical form to key ideas for handling formal languages."(234) Efforts to bring mathematical rigor to programming quickly reach a level of complexity that makes the techniques of verification subject to the very concerns that prompted their development. Mathematical semantics could show "precisely why [a] nasty surprise can arise from a seemingly well-designed programming language", but not how to eliminate the problems from the outset. As a design tool, mathematical semantics was still far from the goal of correcting the anomalies that gave rise to errors in real programming languages. If computers and programs were "inherently mathematical objects", the mathematics of the computers and programs of real practical concern had so far proved elusive.

Five years later, C.A.R. Hoare echoed the committee's expression of belief and admission of fact. In a postponed Inaugural Lecture as Professor of Computation at Oxford in 1985 (he had been appointed in 1976), Hoare declared,

Our principles may be summarized under four headings.
(1) Computers are mathematical machines. Every aspect of their behavior can be defined with mathematical precision, and every detail can be deduced from this definition with mathematical certainty by the laws of pure logic.
(2) Computer programs are mathematical expressions. They describe with unprecedented precision and in every minutest detail the behaviour, intended or unintended, of the computer on which they are executed.
(3) A programming language is a mathematical theory. It includes concepts, notations, definitions, axioms and theorems, which help a programmer to develop a program which meets its specification, and to prove that it does so.
(4) Programming is a mathematical activity. Like other branches of applied mathematics and engineering, its successful practice requires determined and meticulous application of traditional methods of mathematical understanding, calculation and proof.
These are general philosophical and moral principles, and I hold them to be self-evident -- which is just as well, because all the actual evidence is against them. Nothing is really as I have described it, neither computers nor programs nor programming languages nor even programmers.
In the first three cases, sheer size and complexity stood in the way of mathematical understanding. In the case of programmers, "ignorance or even fear of mathematics" blocked many, while those trained in mathematics did not apply it.[31]
31. C.A.R. Hoare, "The Mathematics of Programming", in his Essays in Computing Science (Hemel Hempstead: Prentice Hall International, 1989), 352.

What should interest the historian of science here is a continuing dissonance between the premisses of theoretical computer science and the experience of programming. It constitutes a prime example of how modern technoscience confounds traditional categories of theory and practice.[32] In principle, the computer should be accessible to mathematics. In practice, it is not.

32. For a discussion of some of the issues, together with case studies, see Andrew Pickering, The Mangle of Practice: Time, Agency, & Science (Chicago: University of Chicago Press, 1995).

There are two features of the situation which have implications for how we might approach the history of software. First, as COSERS itself observed, "[E]ven though all the levels of the hierarchy which computer systems can be interpreted as algorithms, the study of algorithms and the phenomena related to computers are not coextensive, since there are important organizational, policy, and nondeterministic aspects of computing that do not fit the algorithmic mold."[33] The observation raises the question of what mold those aspects do fit, that is to say, what science, if any, encompasses the phenomena not covered by algorithms. The second feature is the complexity of computer systems that seems to place even their algorithmic aspects beyond the reach of mathematics. The first feature has implications for software as engineering, the second for science as software.
 
Viewing Hoare's principles as ideals to be pursued by ever more rigorous methods risks misleading both the historian and the software engineer. To see why, consider the following variation on the traditional "waterfall" model of software development. Relatively few errors now occur in the stages in the bottom half of the scheme. That is not surprising, given that they are the aspects of computing best understood mathematically, and that understanding has been translated into such practical tools as diagnostic compilers for high-level programming languages.

However, as Fred Brooks pointed out in "No Silver Bullet", the problems thus addressed were accidental, rather than essential, to the task of designing large, complex computer systems.[34] At the top of the scheme, the situation is different. That is where the bulk of the crucial errors have been made, and that is where software engineering has focused its attention since the 1970s. But that is also where the science of software moves away from the computer into the wider world and interacts with the sciences (if they exist) pertinent to the systems to be modeled computationally. There it becomes a question of how to express those sciences computationally and of how to evaluate the fit between the target system and the computational model.[35] But that is a question that software engineers share with scientists who have turned to the computer to take them into realms that are accessible neither to experiment nor to analytical mathematics. Intellectually, professionally, and historically, it links software as science to science as software.

33. COSERS, 9.

34. Frederick P. Brooks, "No Silver Bullet – Essence and Accidents of Software Engineering", Information Processing 86, ed. H.J. Kugler (Amsterdam: Elsevier Science, 1986), 1069-76; repr. in Computer 20,4(1987), 10-19; and in the Anniversary Edition of The Mythical Man-Month: Essays on Software Engineering (Reading, MA: Addison-Wesley, 1995), Chap. 16. Chap. 17, "‘No Silver Bullet’ Refired" is a response to critics of the original article and a review of the silver bullets that have missed the mark over the intervening decade.

35. Since the late 1980s, this interaction has been moved to a meta-level, as researchers have sought to model the process by which the computational model is designed. See, for example, Leon Osterweil's "Software Processes are Software Too", Proceedings: 9th International Conference on Software Engineering (Los Angeles, CA: IEEE Computer Society Press, 1987), 2-13.

Science as software

Recent (and not so recent) trends in computer modeling reverse the perspective of the question of software as science. By the mid-1960s, the Journal of Theoretical Biology was carrying articles on the application of automata to development, and in the early 1970s Aristide Lindenmayer drew on the theory of formal languages to construct models of the growth of plants.[36] His L-systems soon became the topic of a considerable literature. Over the past several years, theoretical biologists Leo Buss and Walter Fontana have used the lambda calculus to model the development through interaction of proteins in an effort to understand how evolution began. In general, the model of a "tape" parsed according to a specified (and specifiable) syntax is fundamental to current thinking in biology, which in a very real sense considers life in terms of software.[37]
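To make concrete what it means to model growth with a formal language, here is a minimal sketch of a parallel rewriting system of the kind Lindenmayer introduced. The two-rule "algae" grammar (A becomes AB, B becomes A) is the standard textbook illustration and is not taken from the articles cited here; the function name `lsystem` and the constant `ALGAE_RULES` are hypothetical names chosen for this example.

```python
def lsystem(axiom, rules, generations):
    """Return the words produced by rewriting every symbol in parallel.

    Unlike a sequential grammar, each generation replaces all symbols
    of the current word at once; symbols without a rule are copied.
    """
    word = axiom
    history = [word]
    for _ in range(generations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
        history.append(word)
    return history


# Textbook "algae" rules (an assumption of this sketch, not a quotation).
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for generation, word in enumerate(lsystem("A", ALGAE_RULES, 5)):
        print(generation, len(word), word)
```

The lengths of successive words follow the Fibonacci sequence, a small instance of how a purely syntactic rule set can stand in for a pattern of growth.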

Computer modeling in general poses substantive issues for a science of software. Traditionally, models have served the purpose of capturing the workings of phenomena in terms of mechanisms or mathematical relations that are better understood or at least more immediately accessible to manipulation. The empirical fit of the model with the observed behavior of the phenomenon attests to the model's goodness. Understanding how the model has produced that behavior supposedly gives insight into how that behavior was produced in the physical system. Since the 17th century (and even earlier in astronomy), scientists have sought to reduce nature to physical models and the physical models to mathematical relations. They have proceeded on the premiss that the structures of those relations mirrored the structures of the physical models, which in turn mirrored the structures of nature.

36. Aristide Lindenmayer, "Mathematical models for cellular interactions in development", Journal of Theoretical Biology 18(1968), 280-99, 300-15.

37. For a richly detailed and critical account of this development, see Lily Kay, Who Wrote the Book of Life? A History of the Genetic Code (Stanford: Stanford University Press, 2000). Evelyn Fox Keller also explores the metaphor critically in "The Body of a New Machine: Situating the Organism Between the Telegraph and the Computer", in Refiguring Life: Metaphors of Twentieth-Century Biology (New York: Columbia University Press, 1995).

John von Neumann changed that traditional view by arguing against the need for a physical model to mediate between nature and mathematics. Mathematical structures themselves sufficed to give insight into the world, both physical and social. The job of the scientist was to build models that matched the phenomena, without concern for whether the model was "true" in any other sense.

To begin with, we must emphasize a statement which I am sure you have heard before, but which must be repeated again and again. It is that the sciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work -- that is, correctly to describe phenomena from a reasonably wide area. Furthermore, it must satisfy certain esthetic criteria -- that is, in relation to how much it describes, it must be rather simple. I think it is worth while insisting on these vague terms -- for instance, on the use of the word rather. One cannot tell exactly how "simple" simple is. Some of the theories that we have adopted, some of the models with which we are very happy and of which we are very proud would probably not impress someone exposed to them for the first time as being particularly simple.[38]
But even then von Neumann assumed that the mathematical structure of the model would be accessible to analysis and the researcher would understand how the model worked.
38. John von Neumann, "Method in the Physical Sciences", in The Unity of Knowledge, ed. L. Leary (Doubleday, 1955); repr. in JvN, Works, VI, 492.

But here the state of mathematics placed him in a quandary. It offered little insight into the problems of interest at the time, a class of problems exemplified by hydrodynamics, which, he noted in 1945, was "the prototype for anything involving non-linear partial differential equations, particularly those of the hyperbolic or the mixed type, hydrodynamics being a major physical guide in this important field, which is clearly too difficult at present from the purely mathematical point of view."[39] A year later he and Herman Goldstine echoed that theme in their draft paper "On the Principles of Large Scale Computing Machines":

Our present analytical methods seem unsuitable for the solution of the important problems arising in connection with non-linear partial differential equations and, in fact, with virtually all types of non-linear problems in pure mathematics. The truth of this statement is particularly striking in the field of fluid dynamics. Only the most elementary problems have been solved analytically in this field. Furthermore, it seems that in almost all cases where limited successes were obtained with analytical methods, these were purely fortuitous, and not due to any intrinsic suitability of the method to the milieu.... A brief survey of almost any of the really elegant or widely applicable work, and indeed of most of the successful work in both pure and applied mathematics suffices to show that it deals in the main with linear problems. In pure mathematics we need only look at the theories of partial differential and integral equations, while in applied mathematics we may refer to acoustics, electro-dynamics, and quantum mechanics. The advance of analysis is, at this moment, stagnant along the entire front of non-linear problems.[40]
That is what made the computer so attractive. In the absence of analytic solutions, it could at least provide numerical results and, more importantly, produce them quickly enough to make the mathematics useful as a model.
39. J. von Neumann to Oswald Veblen, 3/26/45; in JvN, Works, VI, 357.

40. H. Goldstine and JvN, "On the Principles of Large-Scale Computing Machines", ca. 1946, JvN, Works, V, 2.

However, the model would only bring insight if one understood how the mathematics worked, if not analytically, at least computationally. But here again, the state of knowledge posed a barrier to understanding, as that approach encountered difficulties. Numerical solutions did not offer the structural insights of the analytical models, making it difficult to determine where the problem lay when the numerical results did not meet expectations. Moreover, it gradually became clear that the numerical techniques developed for the purpose were generating their own problematic behavior, as the truncation and rounding required by finite representation in the machine took calculations in unanticipated directions. To understand that behavior would require a theory of computation that did not yet exist.
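A small illustration of how finite representation can take a calculation in unanticipated directions: the sketch below runs the same nonlinear recurrence twice, differing only in the number of digits kept after each arithmetic operation. The recurrence (the logistic map x -> 4x(1-x)), the starting value 0.2, and the precisions of 8 and 28 digits are all assumptions chosen for illustration; the essay names no particular computation, and `logistic_orbit` is a hypothetical helper.

```python
from decimal import Context, Decimal


def logistic_orbit(x0, steps, digits):
    """Iterate x -> 4x(1-x), rounding every operation to `digits` digits."""
    ctx = Context(prec=digits)
    four, one = Decimal(4), Decimal(1)
    x = Decimal(x0)
    orbit = [x]
    for _ in range(steps):
        # Each multiply/subtract is rounded to the context's precision,
        # mimicking a machine with a fixed word length.
        x = ctx.multiply(ctx.multiply(four, x), ctx.subtract(one, x))
        orbit.append(x)
    return orbit


def gaps(x0="0.2", steps=60, coarse_digits=8, fine_digits=28):
    """Absolute difference between coarse and fine orbits at each step."""
    coarse = logistic_orbit(x0, steps, coarse_digits)
    fine = logistic_orbit(x0, steps, fine_digits)
    return [float(abs(a - b)) for a, b in zip(coarse, fine)]


if __name__ == "__main__":
    for step, gap in enumerate(gaps()):
        if step % 5 == 0:
            print(step, gap)
```

The two runs agree exactly for the first few steps, differ slightly once the shorter word length forces a rounding, and thereafter drift apart until they bear no relation to one another. Nothing in the numerical output itself signals which run, if either, reflects the behavior of the underlying equation.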

As von Neumann pointed out in his 1948 Hixon lecture on the theory of automata,

There exists today a very elaborate system of formal logic, and, specifically, of logic as applied to mathematics. This is a discipline with many good sides, but also with certain serious weaknesses. This is not the occasion to enlarge upon the good sides, which I certainly have no intention to belittle. About the inadequacies, however, this may be said: Everybody who has worked in formal logic will confirm that it is one of the technically most refractory parts of mathematics. The reason for this is that it deals with rigid, all-or-none concepts, and has very little contact with the continuous concept of the real or of the complex number, that is, with mathematical analysis. Yet analysis is the technically most successful and best-elaborated part of mathematics. Thus formal logic is, by the nature of its approach, cut off from the best cultivated portions of mathematics, and forced onto the most difficult part of the mathematical terrain, into combinatorics.
The theory of automata, of the digital, all-or-none type, as discussed up to now, is certainly a chapter in formal logic. It will have to be, from the mathematical point of view, combinatory rather than analytical.[41]
It is important to recall that von Neumann's call for a theory of automata arose not out of concern for programming computers but for using them to model physical systems. The theory was meant to compensate for the failures of analytical mathematics. Although he would not have phrased it so at the time, the science of software was not only about computers; it was about the world.
41. John von Neumann, "On a logical and general theory of automata" in Cerebral Mechanisms in Behavior: The Hixon Symposium, ed. L.A. Jeffress (New York: Wiley, 1951), 1-31; repr. in Papers of John von Neumann on Computing and Computer Theory, ed. William Aspray and Arthur Burks (Cambridge, MA/London: MIT Press; Los Angeles/San Francisco: Tomash Publishers, 1987), 391-431; at 406.

The agenda that von Neumann laid out in his "General Theory" has not received much attention from historians of computing or historians of science. This is not the occasion to try to trace the story in any detail. It leads from von Neumann to Arthur Burks' Logic of Computing Group at the University of Michigan and from there to the Santa Fe Institute. It involves research in cellular automata, complex adaptive systems, genetic algorithms, and similar expressions of what von Neumann characterized as "growing automata". Given new life in the 1980s by the development of computer graphics, the emergence of chaos theory, and the leadership of Stephen Wolfram, it has attracted practitioners from a broad range of sciences.

Yet, even here the relation between what can be done with the computer and what can be accounted for mathematically remains problematic. Almost fifty years after von Neumann wrote of the need for a theory of computation, John Holland, an early member of the Burks group and the creator of genetic algorithms, expressed a similar concern. In the concluding chapter of Hidden Order: How Adaptation Builds Complexity, Holland looks "Toward Theory" and "the general principles that will deepen our understanding of all complex adaptive systems [cas]". As a point of departure he insists that:

Mathematics is our sine qua non on this part of the journey. Fortunately, we need not delve into the details to describe the form of the mathematics and what it can contribute; the details will probably change anyhow, as we close in on our destination. Mathematics has a critical role because it alone enables us to formulate rigorous generalizations, or principles. Neither physical experiments nor computer-based experiments, on their own, can provide such generalizations. Physical experiments usually are limited to supplying input and constraints for rigorous models, because the experiments themselves are rarely described in a language that permits deductive exploration. Computer-based experiments have rigorous descriptions, but they deal only in specifics. A well-designed mathematical model, on the other hand, generalizes the particulars revealed by physical experiments, computer-based models, and interdisciplinary comparisons. Furthermore, the tools of mathematics provide rigorous derivations and predictions applicable to all cas. Only mathematics can take us the full distance.[42]
Details aside, Holland's goal, with which he associates his colleagues at the Santa Fe Institute, reflects a vision of mathematics that he and they share with mathematicians from Descartes to von Neumann.

As von Neumann insisted in 1948, the mathematics will be different. To meet Holland's needs it "[will have to] depart from traditional approaches to emphasize persistent features of the far-from-equilibrium evolutionary trajectories generated by recombination."[43] Moreover, that mathematics will be about software. Wolfram's now seminal papers on cellular automata and their application to the "engineering of complexity" rest on the theory of automata and formal languages created during the 1960s through a convergence of agendas in electrical engineering, neurophysiology, mathematical logic, linguistics, computer programming, and abstract algebra. Other contributing branches to the agenda of ALife, in particular from mathematical biology, draw on the same resources in theoretical computer science.

42. John H. Holland, Hidden Order: How Adaptation Builds Complexity (Reading, MA: Addison-Wesley, 1995), 161-2.

43. Ibid., 171-2.

As I said at the outset, software should be of great interest to historians of science. But not only because it is new, or even because it represents a new, artifactual form of science. In that case, writing its history would be simply a matter of extending the coverage of the field to include developments over the past fifty years, to be achieved by tacking another chapter onto the story. Much more importantly, as the computer has changed its role in science from tool to instrument to medium (and indeed to surrogate for reality), understanding of the world has come to depend on understanding of computation as itself a complex dynamic process. Scientists and software engineers face many of the same problems of building computational models that reliably simulate portions of the world of interest to them and that do so in ways that allow analysis and understanding of the process. As we make the turn into a new century, software verges on becoming emblematic of science itself. How that happened over the second half of the twentieth century should soon be a big question in the history of science. Having a historically sensitive history of software will help to answer it.

Addendum

Aimed at defining an agenda for historical research, my paper instead provoked considerable discussion about the nature of software and about whether it is or could be a science. The discussion included various assertions, both historical and philosophical, concerning the nature of science and of its relation to mathematics. As lively and provocative as the debate was, I think it led us away from our goal of defining an agenda for the history of software, or at least one aspect of it, and even revealed some misunderstanding of how historians work. The detour may have derived in part from people's not having had a chance to read the paper beforehand and from my having assumed agreement on the meaning of "software" for the purposes of the workshop. Before responding to some of the issues raised in the discussion, let me clarify how I was using "software" and hence what I understood by the phrase "software as science".

Software as Science

In common English usage, "software" is a mass term for programs, for what computers process, as opposed to the machines themselves. When the term first arose in the late 1950s, it was used as the antonym of "hardware".[44] As the Oxford English Dictionary defines it, noting its formation on the model of "hardware", software is "[t]he programs and procedures required to enable a computer to perform a specific task, as opposed to the physical components of the system." By the mid-'60s, the term had taken on a more specific sense of systems software, what people use to construct and run programs. That is what John Tukey had in mind when he introduced the term in 1958.[45] But the term retained its broader meaning; the "software houses" that sprang up in the mid-'60s were producing applications rather than systems.

In my essay "software" means simply programs and the activity of writing them, programming. In speaking then of "software as science" I do not mean to assert that programs in and of themselves constitute a science, or that the writing of them is an inherently scientific activity. Clearly, neither is the case. As I pointed out at the start, programs are artifacts -- literally, things crafted -- and they are no more inherently scientific than any other made object in the world. However, just as other sciences have arisen from the investigation of artifacts, e.g. thermodynamics from the steam engine, so one may ask about a science of programs and programming. Or rather, one may look for efforts by practitioners of computing to make programs and programming the subjects of a scientific inquiry, to place them on a scientific foundation.[46] That is how I construe "software as science" for the purposes of historical inquiry.

44. In introducing the ACM to readers of the first issue of its Journal in January, 1954, S.B. Williams anticipated the coinage: "Until the engineering societies, AIEE and IRE, became sufficiently interested to struggle with 'hardware', the Association provided a forum for all phases of the field. Now the Association can direct its efforts to the other phases of computing systems, such as numerical analysis, logical design, application and use, and, last but not least, to programming." (JACM 1,1[1954], 3).

45. John Tukey, "The Teaching of Concrete Mathematics", American Mathematical Monthly 65,1(1958), 1-9; at 2: "Today the 'software' comprising the carefully planned interpretive routines, compilers, and other aspects of automative[sic] programming are at least as important to the modern electronic calculator as its 'hardware' of tubes, transistors, wires, tapes and the like."

46. Peter J. Landin's abstract of his seminal article, "The mechanical evaluation of expressions" (Computer Journal 6(1964), 308-20), captures the intent and tone of these early efforts: "This paper is a contribution to the 'theory' of the activity of using computers. It shows how some forms of expression used in current programming languages can be modelled in Church's λ-notation, and then describes a way of 'interpreting' such expressions. This suggests a method of analyzing the things computer users write, that applies to many different problem orientations and to different phases of the activity of using a computer. Also a technique is introduced by which the various composite information structures involved can be formally characterized in their essentials, without commitment to specific written or other representations."(308)

To judge from the discussion, people in computing evidently disagree about whether such a science is possible, desirable, or relevant. That is not a matter for historians to decide, nor do historians require consensus on the matter among computer people; indeed, the continuing disagreement is of greater interest than consensus. For historians it is enough that from the mid-1950s people of reputation in computing have believed that a science of software is desirable and feasible and have set forth what they take that science to be. Most of them have looked to mathematics as the foundation. Tony Hoare represents perhaps the extreme in his insistence on the inherently mathematical nature of programs and programming, but he hardly stands alone out there. Others have taken a more empirical approach, believing that mathematics is not adequate to explain what computers can do, especially when we have not told them (or do not believe we have told them) to do it. But, as envisioned by Herbert Simon, the empirical science of computational processes would be no less scientific for being empirical and, indeed, a science of the artificial.

To justify a history of the science of software, it would seem enough to point to the two-volume Handbook of Theoretical Computer Science, of which neither the contents nor even the constituent subjects existed in 1950.[47] The extensive bibliographies accompanying each of the 37 chapters testify to the immense intellectual effort and hence social investment that the Handbook is meant to codify. My sketch of the development of just two of those constituents, formal languages and formal semantics, aimed at suggesting how historians might go about tracing the origins and growth of the field and determining how it attracted the investment. I am fairly confident that the notion of agendas and their convergence on the computer will go a long way in explaining the emergence of such subjects as computational complexity, databases, computational geometry, and parallel computing. Whether or not it does, the Handbook evidently purports to constitute a science of software and to base that science for the most part on mathematics. The job of the historian is not to question whether the Handbook should exist but to explain how it came about.

47. Jan van Leeuwen (ed.), Handbook of Theoretical Computer Science (Amsterdam: Elsevier; Cambridge, MA: MIT Press, 1990), Vol. A: Algorithms and Complexity, Vol. B: Formal Models and Semantics.

The Handbook speaks to another point raised in the discussion. Several in the audience suggested that in speaking of "software as science" I was laboring under a misapprehension rooted in the English use of "computer science" to denote a subject other languages refer to as "informatics". Perhaps the name was leading me to look for science where there was none. Considering that slightly more than half the authors of the Handbook are Europeans engaged in informatics, the objection is puzzling. It is all the more so, since I need only turn to my bookshelf to find a volume, Theoretische Informatik - kurzgefaßt by Uwe Schöning, the contents of which cover roughly the same ground as, say, Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science by Martin Davis, Ron Sigal, and Elaine J. Weyuker. Turning to Schöning's institutional homepage, the Fakultät für Informatik at the University of Ulm, I see Theoretische Informatik I/II as part of the Grundstudium, side-by-side with Technische Informatik I/II, laying the foundation for further study. The rest of the curriculum does not look substantially different from what is taught in any American department of computer science. So I must wonder, as Shakespeare's Juliet once did, "What's in a name?" Here too the rose smells the same.[48]

48. Only after writing these words did I discover that Wolfgang Coy had asked the same question, "»Informatique«: What's in a Name?" and had reached the same conclusion in his essay "Defining Discipline", in Ch. Freksa, M. Jantzen, R. Valk (eds.), Foundations of Computer Science: Potential - Theory - Cognition (Berlin, Heidelberg, New York et al.: Springer, 1997): "But the German »Informatik« made a strange twist: While it uses the French word, it sticks firmly to the American usage of computer science (with elements from computer engineering)." Together with many colleagues, Coy would like Informatik to be something quite different; but it is not yet, and it has not been so historically. See, inter alia, his "Für eine Theorie der Informatik!" in Wolfgang Coy et al. (eds.), Sichtweisen der Informatik (Braunschweig; Wiesbaden: Vieweg, 1992), 17-32.

Science as Software

A major theme of my essay is the inversion suggested in the title. In some instances, the science of computation has laid the groundwork for computational science. What began as an effort to establish the scientific foundations of programs and programming has led to the application of the resulting theory back onto the sciences; the field of cellular automata is a good example. Some members of the workshop felt that in so moving from theoretical computer science to computational science, I was skipping over the sciences on which developers draw in designing software. Critics pointed for example to the various fields of psychology applied to computer-human interfaces, but one might equally well have included the perceptual, optical, and mathematical issues involved in the development of computer graphics. Indeed, the whole field of artificial intelligence would then fall under the scope of software as science. Given that AI was the context for seminal work in theoretical computer science, most notably McCarthy's formal semantics, there is certainly an argument for including it.

However, once one becomes that inclusive, it is hard to see how to distinguish the science of programs and programming from programming itself. As maligned as the dichotomy between "pure" and "applied" has recently become, it perhaps retains some value here. It is a question of focus and motivation. Does one want to understand the nature of programs and programming, or does one write programs for the purpose of achieving or understanding something else? In the latter case, the science applied offers no insight into computational questions. Indeed, as I argue in conclusion, the software more often becomes a means of investigating the science, even to the point of replacing the science's real-world objects.

Where one might be able to argue for the incorporation of science into software for the sake of the software is the recent use of biological models for "growing software". Albert Endres alluded to it in his comment on James Tomayko's paper.[49] However, here it is worth noting that the biological models themselves are the product of the science of software. Genetic algorithms are the outgrowth of research into complex adaptive systems, which themselves grow out of the study of cellular automata, which in turn stem from John von Neumann's "General and Logical Theory of Automata". If I may introduce yet another scheme of agendas, here is a sketch of the path that leads from von Neumann to Santa Fe. Note the fork that links theoretical computer science to cellular automata. Clearly, the scheme defines an agenda for the history of software, but it seems to me to be included in the complex of issues to which I was pointing in speaking of science as software.

49. "We certainly cannot expect that the scientific basis of software can come from physics as in the older engineering branches. It may come from mathematics or from biology. Mathematics has clearly brought some help already, be it in cryptology or in program proof automation. Biology is largely untapped." (Below, p.**)

Nature, Science, and Mathematics

In the course of the discussion, participants asserted several principles as indisputable, in particular that science is about nature, not about things of our own creation, and that mathematics is not a science, but a tool for doing science. From the first, some people concluded that, as an artifact, software could not be the subject of a science; from the second, that grounding software in mathematics did not make a science of it. Despite my own insistence on computers and programs as artifacts, neither of these principles is quite as firm as it seems. Much of the work in history of science over the past half-century speaks against them.

The radical change in natural philosophy that we refer to as the Scientific Revolution of the 17th century rested in large part on the rejection of the classical distinction between nature and artifact. The metaphor of the "clockwork universe", which the mechanical philosophy transformed into a metaphysics of matter in motion, placed nature (or rather, nature's God) and humans on a par as artisans. "Nature, to be commanded, must be obeyed," decreed Francis Bacon. While he meant thereby to deny supernatural powers to the magician, he also placed the laws of nature at the disposal of humans. Discovering them and then obeying them placed the world at our command. Insisting that "truth and utility are one and the same thing", he placed scientists under the obligation to be able to translate their knowledge into action. That is what enfranchised experiment as a method both of investigation and of demonstration. It made the laboratory a surrogate for nature, and it placed nature within the grasp of the instruments employed there. In the laboratory, scientists took nature apart to see how it works and then put a small part of nature back together to testify to their understanding. Is a nuclear bomb a natural phenomenon or a human creation? How about vaccines? How about synthetic drugs? How about gene therapy, or genetic engineering?[50]

50. Herbert Simon uses similar examples to set the theme of his Sciences of the Artificial: "So too we must be careful about equating 'biological' with 'natural'. A forest may be a phenomenon of nature; a farm certainly is not. The very species upon which we depend for our food -- our corn and our cattle -- are artifacts of our ingenuity. A plowed field is no more part of nature than an asphalted street -- and no less. These examples set the terms of our problem, for those things we call artifacts are not apart from nature. They have no dispensation to ignore or violate natural law. At the same time they are adapted to human goals and purposes. They are what they are in order to satisfy our desire to fly or to eat well. As our aims change, so too do our artifacts -- and vice versa. If science is to encompass these objects and phenomena in which human purpose as well as natural law are embodied, it must have means for relating these two disparate components."

A second feature of the mechanical philosophy is pertinent here. It was closely tied to mathematics and thus shared in its certainty. As the new science of machines was transformed into the more universal science of mechanics, couched in the language of analysis (algebra and the calculus), it made Newton's Principia the touchstone for all the sciences, establishing a mathematically structured hierarchy that persists to this day.[51] By the early twentieth century, theoretical physics had all but disappeared into mathematics, leading Eugene Wigner to wonder about "The Unreasonable Effectiveness of Mathematics in the Natural Sciences", and recent philosophy of mathematics has revived the question of the subject's empirical origins.[52]

51. See Michael Mahoney, "The Mathematical Realm of Nature", The Cambridge History of Seventeenth-Century Philosophy, ed. Daniel Garber and Michael Ayers (Cambridge: Cambridge U.P.), Vol 1, 702-755.

52. Wigner's article, based on the Richard Courant Lecture at NYU in 1959, appeared in Communications on Pure and Applied Mathematics 13(1960), 1-14; cf. Mark Steiner's recent book, The Applicability of Mathematics as a Philosophical Problem (Cambridge, MA: Harvard U.P., 1998).

These two features come together in thermodynamics, the mathematical laws of which originated in the steam engine, abstracted by Carnot to the concept of a heat engine. The first and second laws in effect say that we can't build a perpetual-motion machine of the first or second kind. They are negative laws that set limits, without saying much about what can be achieved within those limits. Shannon's mathematical theory of communication similarly built a science out of artifacts, in this case various communications technologies. The theory sets a limit on channel capacity and specifies the tradeoff between redundancy and accuracy. The fruitful interaction between thermodynamics and information theory in explaining physical phenomena, as in the work of Stephen Hawking, seems to work against any effort at distinguishing the natural from the artificial or the mathematical from the physical.
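To make the notion of such a limit concrete, here is a minimal sketch in Python for one textbook case, the binary symmetric channel, which flips each transmitted bit with probability p. The channel model, the function names, and the choice of a repetition code are illustrative assumptions; none of them comes from Shannon's paper or from the discussion above. The capacity 1 - H(p) bounds the rate of reliable transmission, while the repetition code shows one crude way of trading redundancy for accuracy.

```python
from math import comb, log2


def binary_entropy(p):
    """Entropy, in bits, of a binary source that emits 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bsc_capacity(p):
    """Capacity, in bits per channel use, of a binary symmetric channel
    that flips each bit with probability p (an assumed toy model)."""
    return 1.0 - binary_entropy(p)


def repetition_error(p, n):
    """Probability that a majority vote over n copies of a bit (n odd)
    decodes to the wrong value, i.e. that more than half the copies flip."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.1
    print(f"capacity at p={p}: {bsc_capacity(p):.3f} bits per use")
    for n in (1, 3, 5):
        # Sending each bit n times lowers the rate to 1/n.
        print(f"n={n}  rate={1 / n:.3f}  error={repetition_error(p, n):.5f}")
```

Repeating each bit lowers the error rate, but it also lowers the transmission rate to 1/n, which for n = 3 is already well below the capacity of about 0.53 bits per use at p = 0.1. The theory guarantees that better codes exist at any rate below capacity without saying how to construct them, which is the sense in which the law sets a limit rather than a recipe.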

So too with the theory of computation. Does it have laws? Surely Turing's halting theorem sets a limit on what can be computed by demonstrating what cannot be. Showing that a problem is equivalent to the halting problem relegates it to the realm of the incomputable. Similarly, the theory of computational complexity establishes through a variety of models of limited Turing machines what resources are required to compute classes of problems and the tradeoffs between time and space involved in doing so. Are these scientific results about nature? Well, there's a body of literature that says yes. It is a tenet of the new computational sciences that nature can't compute any better than a Turing machine. Or rather, anything nature can do, a computer can do too, given enough time and memory.[53]

53. See Robert Rosen, "Effective Processes and Natural Law", in The Universal Turing Machine: A Half-Century Survey, ed. Rolf Herken (Wien/New York: Springer-Verlag, 1994/5), 485-498; cf. his earlier article, "Church's Thesis and its Relation to the Concept of Realizability in Biology and Physics", Bulletin of Mathematical Biophysics 24(1962), 375-93. [Added 2005] Stephen Wolfram has since made an extensive case for this proposition in A New Kind of Science (Wolfram Media, 2002).
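The diagonal argument behind the halting theorem can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a proof: programs are modeled as generator functions that yield once per step, `claims_to_halt` stands for any purported halting decider, and all names are hypothetical. Because non-termination cannot be observed directly, the sketch checks only whether a program is still running after a bounded number of steps.

```python
def make_contrarian(claims_to_halt):
    """Build a program that does the opposite of what the purported
    decider `claims_to_halt` predicts about that very program."""

    def contrarian():
        if claims_to_halt(contrarian):
            while True:
                yield  # predicted to halt, so run forever
        # predicted to run forever, so halt at once

    return contrarian


def still_running_after(program, steps):
    """Run a generator-style program for at most `steps` steps and
    report whether it has not yet halted."""
    execution = program()
    for _ in range(steps):
        try:
            next(execution)
        except StopIteration:
            return False  # the program halted
    return True


def optimist(program):
    """A candidate decider that says every program halts."""
    return True


def pessimist(program):
    """A candidate decider that says no program halts."""
    return False


if __name__ == "__main__":
    for decider in (optimist, pessimist):
        program = make_contrarian(decider)
        print(
            decider.__name__,
            "predicts halting:", decider(program),
            "| still running after 1000 steps:",
            still_running_after(program, 1000),
        )
```

Whatever the candidate decider answers about the contrarian program, the program is built to behave the other way. The two trivial deciders here only make the pattern visible; Turing's argument shows that the same construction defeats every possible decider, which is why the result has the character of a negative law.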

Are these laws of software? In discussing the implications of software as science for software as engineering, I addressed the limits of theoretical computer science in dealing with the software development process. That process begins with the translation of a portion of the real world into the first of a series of computational models which culminate in a program running on a specific machine. Verification of the result involves two different issues: the adequacy of the computational model with respect to the world it is supposed to model and the accuracy of the translation of that model into the instruction set of the computer on which it is running. Whether the model itself is adequate is ultimately not a question of software but of the developer's understanding of the world. It may be a scientific question, but the science involved is not about software or computers. The science of software as I have construed it pertains to the second issue. It begins where the model becomes a program: how, and to what extent, can we assure ourselves that the program is doing what we have written it to do? How that question has been formulated and addressed is the subject of the history of software as science, or at least that is how I construed my charge.