Final Report: NSF Workshop on Billion-Transistor Systems
March 12-13, 1998
Sponsored by the
National Science Foundation
Introduction
This is the final report of the NSF Workshop on Billion-Transistor Systems, held March 12-13 at Princeton University. The workshop was funded by the National Science Foundation. The goal of the workshop was to study the problems inherent in the design of the huge integrated circuits that VLSI technology will allow us to manufacture over the next few years---we refer to billion-transistor VLSI chips as giga-transistor chips. The specific goals of the workshop were to:
Attendees included: Jacob Abraham (UT Austin), Jason Cong (UCLA), Al Dunlop (Lucent), Pamela Gillis (IBM), Randy Harr (Synopsys), Peter Kogge (Notre Dame), Jim Monzel (IBM), Kunle Olukotun (Stanford U.), Dave Newman (IBM), Lou Scheffer (Cadence), Ken Sheppard (Columbia U.), T. R. Viswanathan (TI), Wayne Wolf (Princeton U.). Observing was Robert Grafton of the NSF.
This report (and an executive summary version in PowerPoint format) is available on the World Wide Web at http://www.ee.princeton.edu/~wolf/nsf-workshop. The next section introduces the topic in more detail. Succeeding sections present a series of observations and recommendations about the design of billion-transistor VLSI systems.
Motivation: Giga-Transistor VLSI
Hundred-million transistor chips are already in the works and billion-transistor chips will be feasible soon. The successful design of such VLSI systems requires solving challenges ranging from the analog characteristics of interconnect to system architectures; many of these challenges at different levels of abstraction are often closely related.
Interdisciplinary approaches are at the heart of billion-transistor VLSI because these large chips will be complete systems. In addition to computing elements, they will contain significant amounts of on-chip memory. They will also require interfaces to the ‘real’ world (e.g. analog interfaces); on-chip processing can significantly improve the effective quality of the integrated analog interfaces. As a result, system designers must be more than logic designers---optimization of system-level issues such as stability, failure modes, input validation, etc. will occupy more of designers’ attention.
Billion-transistor chips will have a number of application areas, which we can broadly divide into two categories. Application-specific systems-on-silicon include application-specific systems with complex functionality; examples include single-chip cell phones, automobile control systems, etc. These systems may or may not be high-volume and therefore cost-sensitive: some application areas will support very high volumes at low cost; in other domains, customers may be willing to pay extra for the advantages accrued from higher levels of integration. However, in almost all cases, time-to-market will be critical. Most markets today move very fast and systems houses must be able to design billion-transistor systems on time scales measured in months, not years or decades. General-purpose high-performance computing systems form a second major application area for giga-transistor VLSI. Supercomputing systems of the future will rely on extremely high levels of integration, rather than the exotic packaging of MSI that was common in the 1970’s and 80’s. High levels of integration are the surest road to high performance for large-scale scientific and transaction-based computing systems. High-performance computing applications may lend themselves to more expensive design and manufacturing techniques (high levels of custom design, advanced packaging, etc.) than would be common in systems-on-silicon.
It is reasonable to ask why the world needs integrated circuits with one billion transistors. We believe that giga-transistor VLSI will provide the means to make significant advances in many application domains which will result in enhancements in the quality of life for many people. Complex integrated systems with on-chip sensing and actuation will be able to provide small, low-power devices which perform important functions. Cellular phones are one example of systems which make extensive use of signal processing to better utilize bandwidth; micromechanical systems (MEMS) can make even more extensive use of on-chip signal processing. Signal processing and computation can also be used to make up for poor external components. Data-intensive applications can run much faster when parallel processing elements are combined with significant amounts of memory on a single chip. As mentioned above, the future of high-performance computation lies in the exploitation of higher levels of integration.
Given this understanding of the importance of billion-transistor systems, it becomes clearer why both the Federal Government and industry should support research in giga-transistor VLSI system design. Many consumer applications can benefit from the larger amounts of memory, on-board signal processing, and overall higher levels of integration which result in lower power consumption. It is important to remember that billion-transistor chips will not just be used for consumer toys in developed economies. Developing countries can leap generations of technology by taking advantage of low-cost, high-performance systems on silicon: instant infrastructure for communications, agricultural monitoring, etc. can all benefit from advances in VLSI technology. Given the potential markets for billion-transistor systems, it is clear that someone will build fabs capable of giga-transistor chips. The United States is certainly competitive in semiconductor manufacturing, and it is clearly the world leader in the design of complex integrated circuits, as exemplified by U. S. companies’ domination of the world microprocessor market. However, current techniques probably will not scale, and we need new methods to cope with the design of future-generation integrated circuits. The U. S. can retain leadership in design technology by identifying appropriate research problems in billion-transistor systems and providing adequate funding.
Approaches to Giga-Transistor VLSI Design
In evaluating the field, we made several assumptions about design methodology. First, we assumed that design processes must be (mostly) hierarchical to cope with chip size. Hierarchical design of digital systems has been embraced since the 1970’s, and we see the use of hierarchical methods becoming more pervasive in the future. Going along with the general trend toward hierarchy, we assume that designers will use a billion transistors primarily by using pre-defined cores. A hard core is a logic design that has been reduced to layout, while a soft core is chosen on the basis of its logic design only. A firm core is in between the two---it has a layout template but can be modified by parameters and options. A variety of opinions exist on whether soft, firm, or hard cores will be most popular: hardness increases performance and reduces area, while softness makes it easier to port the design to a new process and amortize design costs. Cores represent medium-to-large-scale abstractions of useful, common functionality. In general, we believe that many varieties of intellectual property (IP), including cores, embedded software, bus standards, etc. will be used as elements of systems on silicon. Intellectual property, properly used, can reduce both design time and design risk. Finally, we assume that, as time goes on, process tuning will become harder to justify. The combination of huge fabrication costs, shorter design cycles, and niche markets for some systems-on-silicon applications makes it unlikely that, for example, tuning device thresholds will make sense.
There are some challenges to the introduction of billion-transistor VLSI which lie outside the design process; these are primarily, but not exclusively, non-technical issues. It is important to keep these show-stoppers in mind. One significant risk is that markets may not be big enough to support custom chips of this scale. There are significant barriers to acquiring intellectual property for building complex systems-on-silicon. Most systems houses will not be experts in all the components required to build a complete system-on-a-chip; furthermore, market pressures may require them to adopt components developed by other companies. It is quite possible that, in some key markets, no company will be able to acquire enough licenses to build a complete system-on-a-chip. Because components must be obtained from multiple sources, the failure to develop industry-accepted on-chip interfaces and standards can retard the development of the giga-transistor chip. One interesting potential problem with systems-on-silicon is that, unlike printed circuit boards, they are difficult to fix and to upgrade; industry must develop techniques which allow systems-on-silicon to be at least modestly changed after manufacture and in the field. CAD tools, compilers, and OS capabilities have not scaled fast enough in the recent past; the unavailability of operating systems which support very large address spaces is, in particular, a problem which must be resolved to keep on the design complexity curve. In general, complexity management is a major problem: algorithms, abstractions, and data management must all advance to be able to handle the huge volumes of design data. Verification is another technology which has not been keeping up with transistor count in recent years; breakthroughs in verification methodologies and algorithms will be required to ensure that billion-transistor chips work. Finally, power consumption may be excessive; power management and dissipation are issues which must be addressed.
There are some areas of design technology important to billion-transistor systems which have advanced significantly over the past several years. While these problems may not now be completely solved, we believe that they are well in hand. These areas include: noise, power, and timing-driven placement-and-routing; design for low energy and power dissipation; custom core and module development; the intellectual property business model; and interconnect analysis.
Observations and Recommendations
This section makes observations and recommendations on a number of specific areas related to billion-transistor VLSI.
Intellectual Property
As noted above, intellectual property of many varieties will need to be acquired and integrated when designing billion-transistor VLSI. A great deal of IP will be sold as soft macros due to the challenges of porting the modules. However, there will still be a significant place for hard IP. Microprocessors form one important market where the costs of IP porting will probably be sustained by the size of the market. When IP is ported to new processes for large markets, it is likely that the porting will be done by the IP vendor, who has the most expertise in the design of the module, just as most commercial software is ported by the software vendor, not by the user or a third party. The interfaces of IP modules are critical: designers must be able to understand and characterize the modules in order to use them. This implies that abstractions for design and verification are essential, including both low-level and high-level standards. Finally, we believe that business issues in IP management are as important as technical issues.
We believe that the most important research issues in intellectual property for giga-transistor chips are:
Some development issues are also important: the development of interfaces which can be used across a sufficiently broad range of IP products; and the development and maintenance of documentation which is consistent with the original design.
Specification and Algorithms
The proper choice of algorithms to implement as systems-on-silicon and the proper specification of these systems is essential to the successful creation of useful systems-on-chips. Most billion-transistor systems will have significant embedded software content, which will handle tasks ranging from core algorithms to error management. Algorithm selection and refinement are important to make effective use of on-chip computation, communication, and memory resources. Systems-on-silicon have much more complex specifications than traditional ICs. It is well-known that specification problems scale non-linearly: designing a system with 100 times more transistors is more than 100 times more difficult. Future generations of systems-on-silicon will be much more than simple controllers: they will in many cases have to handle both complex computations and user interfaces.
We recommend several research problems which are of use in a broad range of billion-transistor systems. The design community needs high-level specification tools that can handle performance, power, etc. as well as functionality. Designers also need better methodologies and algorithms for strong functional verification which takes advantage of design and application hierarchies---as noted above, verification techniques are not keeping pace with the growing complexity of integrated systems. Finally, we believe that effort should be put into the study of interactions between algorithms, architectures, and CAD, since system-level design tools often rely on at least some properties of the type of system being designed.
Architectures and CAD
The architecture of giga-transistor chips is closely related to the design tools for such large systems. We need novel architectures that take advantage of giga-transistors. In particular, successful architectures will not rely on excessive pinout and will make use of on-chip resources: memory, interconnect, and processing elements. We believe that most future systems-on-silicon will make significant use of embedded processors and memory; properly-designed embedded computing systems can improve both performance and size over hard-wired solutions. Given the pervasiveness of embedded processing, architecture and software must be designed together to ensure that computational resources are used effectively. System architecture impacts CAD in several ways: programmability and compilers, co-design, interconnect, etc. are all important CAD topics for giga-transistor VLSI. Finally, we believe that novel architectures can make use of available transistors to solve old problems. For example, fault tolerance and novel algorithms can be used to fight noise and interconnect problems.
We recommend the encouragement of research in architectures for giga-transistor designs. We encourage research efforts which couple architecture and CAD studies. In general, CAD research should be cognizant of architectural and algorithmic work-arounds, and vice versa. Tools for application-specific architectures are an important approach to the success of CAD for integrated systems. Both architects and CAD developers should make early consideration of interconnect-driven architectural designs. Finally, given the importance of updating systems-on-silicon after manufacture, both architects and CAD developers should consider techniques to make large chips reconfigurable.
Customization and Extensibility
As noted above, billion-transistor chips will almost certainly need to be updated after manufacture. Reconfigurability may be useful in making model variations on a shared product platform, bug fixes, and post-manufacture upgrades for products. Reconfigurability is important for bug fixes because late design changes can be made without excessively perturbing the design and because some bugs will inevitably make it past verification and into products. Reconfigurability may be added to I/O devices, interconnect, and to processing elements. The introduction of pervasive reconfigurability will require advances in both CAD and architecture.
In particular, we need architectures that allow customization and extensibility at reasonable cost in performance, power, etc. Appropriate research topics in on-chip reconfigurable systems include but are not limited to:
Possible techniques requiring study include configurable architectures and software downloading.
Study of architectures which make best use of customization and extensibility is recommended.
Interconnect
Future integrated circuit designs will be driven by interconnect, not transistors. Furthermore, interconnect technology is changing rapidly. As a result, it is critical to develop methodological, modeling, and algorithmic methods which can handle future generations of interconnect. Signal integrity is a major problem today and will become more of a problem. New interconnect technologies, such as copper and low-temperature interconnect, will introduce new problems.
Interconnect-centric design flow is one important area for research. Interconnect-centric design takes into account planning, synthesis, modeling, and verification. Interconnect-centric design flow has powerful implications for core-based design since interconnect can introduce significant coupling between the core and other components. Research in interconnect-driven architecture is appropriate to take advantage of the properties and minimize the limitations of future-generation interconnect. The design community needs ways to characterize and model noise. In general, interconnect design should take a holistic view, considering all relevant design metrics such as timing, power, and noise.
Further thoughts on the challenges and opportunities for interconnect design and optimization can be found in a recent report by Jason Cong as part of a collection of SRC working papers on "Frontiers in Semiconductor Research" at http://www.src.org/research/frontier.dgw.
Verification
Design verification is a topic of major importance in the design of giga-transistor VLSI. Functional verification of giga-transistor chips is a major problem. Verification must include not only logical function, but noise, timing, etc. New techniques for verifying properties such as timing, noise, etc. can take advantage of techniques developed originally for combinational/sequential test and automatic test pattern generation (ATPG). Furthermore, verification of these properties---Boolean, timing, and noise analysis---must be coupled both to make the tools more efficient and to give designers more detailed debugging information. Hierarchical solutions are essential to tackling these problems. Current verification techniques do not adequately address non-digital technologies.
We believe that research which combines high-level models, behavioral models, timing, and signal integrity should be encouraged. Successful verification will require better techniques to deal with state space problems. Furthermore, methods combining formal and simulation techniques will be required.
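To make the state space problem concrete, the following is a minimal sketch, not a method proposed by the workshop, of explicit-state reachability on a toy design. The names `reachable_states`, `counter_next`, and `count_counter_states` are hypothetical and exist only for this illustration. An n-bit counter reaches 2^n states, so the work of enumerating states doubles with every added bit of state.

```python
from collections import deque


def reachable_states(initial, next_states):
    """Explicit-state reachability: breadth-first search from `initial`.

    `next_states` maps a state to an iterable of successor states.
    """
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        for succ in next_states(state):
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return seen


def counter_next(n_bits):
    """Hypothetical toy design: an n-bit counter that holds or increments."""
    mask = (1 << n_bits) - 1

    def step(state):
        # Either keep the current value or increment it (wrapping around).
        return [state, (state + 1) & mask]

    return step


def count_counter_states(n_bits):
    """Number of states reachable from 0 for the toy n-bit counter."""
    return len(reachable_states(0, counter_next(n_bits)))


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, "state bits ->", count_counter_states(n), "reachable states")
```

Enumeration of this kind is workable only for a few dozen state bits, which is one reason hierarchical and combined formal/simulation methods are of interest for chips with vastly more state.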
Design-for-Testability
Design-for-testability is a major problem and current solutions don’t scale well. The use of hierarchical design, in particular the introduction of hard hierarchical boundaries, introduces new complications (e.g. heterogeneous approach to test). Analog test is a challenge and future systems-on-silicon will make increasing use of analog components which must be tested along with the digital elements. In giga-transistor chips, the ratio of pins to gates is lower, making the logic harder to stimulate and observe. New fault mechanisms, such as crosstalk faults, may be important. Some existing technologies, such as IDDQ, may not scale to giga-transistor systems. Finally, most academic work is done on small examples of questionable relevance.
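The falling ratio of pins to gates can be illustrated with Rent's rule, an empirical relation that this report does not itself cite: pin count T is approximately t * G^p for G gates, with exponent p below 1. The sketch below is illustrative only; the function names and the default values t = 2.5 and p = 0.6 are assumptions chosen for the example, not measurements of any real chip.

```python
def rent_pins(gates, t=2.5, p=0.6):
    """Estimate pin count from gate count with Rent's rule: T = t * G**p.

    `t` and `p` are illustrative assumptions, not fitted values.
    """
    return t * gates ** p


def pins_per_gate(gates, t=2.5, p=0.6):
    """Pins available per gate; shrinks as G**(p - 1) when p < 1."""
    return rent_pins(gates, t, p) / gates


if __name__ == "__main__":
    for gates in (10**5, 10**7, 10**9):
        print(
            f"{gates:>13,d} gates: "
            f"~{rent_pins(gates):,.0f} pins, "
            f"{pins_per_gate(gates):.2e} pins per gate"
        )
```

Under these assumptions the pin count grows far more slowly than the gate count, so each pin must control and observe a growing amount of internal logic.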
We recommend that work on test of analog blocks in digital chips should be encouraged. Research in design-for-testability should support testing crosstalk, signal integrity, etc. One valuable avenue of study is the possible use of additional resources to help with the testing problem. Another is the use of on-chip controllers for test.
Novel Technologies
Future systems-on-silicon will include not only digital and analog electronic components but also micromechanical systems and other novel microstructures such as optoelectronic and fluidic components. These novel technologies can provide markets for systems on silicon, provide superior solutions for some subsystems (for example, micromechanical resonators for integrated radios), and can themselves be improved by taking advantage of on-chip signal processing. Alternative architectures which make use of non-electronic components may be able to provide low power, low cost, and integrated sensors and actuators.
Research is needed to develop CAD methods for complete systems, including mechanical, fluidic, optical, and other components. Such CAD methods will include but are not limited to:
Education and Training
Education and training will be essential to the development of giga-transistor VLSI systems and U. S. leadership in this critical technology. Industry needs both designers and CAD engineers to build such complex systems. Both designers and CAD developers must understand complexity to be able to cope with huge designs. As a result, industry will need trained people who understand more than strict specializations. While we expect designers to take specializations in which they are especially competent, successful designers and CAD engineers will not wear blinders---they will understand and be able to cope with problems engendered in other phases of the design process.
We encourage development of courses that teach the design of complete systems, not just digital logic. Such courses are also well-suited to the teaching of the management of complexity, which will be a survival skill for next-generation CAD engineers and designers. Students need experience with complex system designs to be able to learn these skills. They need access to reference designs that can teach them best practices. Education and training should encourage boundary-crossing---we should provide CAD people with design experience and architects with CAD experience. Finally, lifelong learning is essential. CAD developers and designers will need to learn new skills as their careers progress. Lifelong learning should be supported by both employers and by educational institutions.
Research Enablers
Research enablers are items which are important to research but are not themselves research products; in short, they are useful to many people but will get tenure for no assistant professor. There are several important items which benefit the community and require significant contributions from motivated individuals. Industrial-strength benchmarks are important. Documentation for designs and tools is also important---this includes not only detailed documentation but also more general descriptions such as effective architectures for particular types of systems. As systems grow larger, so will documentation, so the creation of document management tools and methodologies will be important; document management tools can be built on top of the Web.
In particular, we see the need for several types of research enablers which will aid in the development of new technologies and the education of trained engineers. First, we need realistic benchmarks, either university/research lab-designed or contributed by industry. Industrial sources which hope to make use of university research should be willing to contribute designs which can enable and guide that research. Second, we need mechanisms for benchmark creation. This can be done either by university/industry partnerships or by multi-university projects. Finally, we need predictions for future technology, including component parameters and sample designs.