Please check out my research group's webpage. Please also check out our new manycore microprocessor, the Princeton Piton Processor, which we have just open sourced as the OpenPiton Project, and the PriME Manycore Simulator.

I am a faculty member in the Electrical Engineering Department at Princeton University. I am looking to recruit Graduate and Undergraduate students interested in Computer Architecture, Operating Systems, Cloud Computing Systems, and Circuits. My thesis research was on how to scale up operating systems for multicore and cloud computers. I call this work a Factored Operating System (fos). I am also a co-founder of Tilera Corporation where I served as the Lead Architect and designed the architecture of the TILE64 and TILEPRO64 families of processors.


( Curriculum Vitae )

My primary research interest is in parallel computer architecture and operating systems for parallel computers. I like to take a holistic approach to my research and strive to co-optimize both software and hardware components of computer systems. I think it is through understanding the whole system and being prepared to modify any portion of the system that the largest gains can be made. By doing research across multiple layers of abstraction I believe that truly revolutionary discoveries can be made. I also believe that researchers should build prototype systems in order to test out their ideas. Through building experimental systems, I have gained insight into computing systems. Many times the real problem that needs to be solved is not apparent until a full system is constructed. Often, the true research is in the details of building working systems. This has motivated me in my career and is why I have contributed to the design of three fabricated microprocessors and one prototype operating system.

Multicore and Cloud Operating Systems

Cloud computers and multicore processors are two emerging classes of computational hardware that have the potential to provide unprecedented compute capacity to the average user. For the user to harness this computational power effectively, these new hardware platforms need operating systems (OSes) designed for them. Existing multicore operating systems do not scale to large numbers of cores and do not support clouds. Consequently, current-day cloud systems push much complexity onto the user, requiring the user to manage individual Virtual Machines (VMs) and deal with many system-level concerns. I am researching how to construct operating systems for future multicore processors and current-day Infrastructure as a Service (IaaS) cloud systems. I founded the Factored Operating System (fos) project at MIT. fos tackles OS scalability challenges by factoring the OS into its component system services. Each system service is further factored into a collection of Internet-inspired servers which communicate via messaging. We call a set of system servers that collaborate to provide a single system service a fleet. Fleets can grow and shrink in response to demand. fos takes into account the spatial placement of user and system services to optimize performance. fos provides a single system image across multiple physical and virtual machines. I designed fos's microkernel, messaging API, and remote memory access API, implemented our first inter-machine proxy server, deployed fos on Amazon's EC2, and am currently designing our distributed object model to facilitate the construction of operating system server fleets. I published our foundational ideas of fos in [ACM Operating System Review] and we elaborated on the cloud aspects of fos at [SoCC]. The fos team contains a very talented group of nine graduate students, postdocs, and undergraduates.
fos Overview Image
(fos project page)
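
To make the fleet idea concrete, here is a minimal Python sketch of a service whose servers receive requests as messages and whose server count grows and shrinks with demand. It is an illustration only, not fos code: the `Fleet` class, its method names, the least-loaded dispatch policy, and the backlog thresholds are all assumptions made for this example.

```python
from collections import deque


class Fleet:
    """Toy model of a fleet: servers that jointly provide one service.

    Every name and threshold here is hypothetical, not part of fos.
    """

    def __init__(self, min_servers=1, max_servers=8, high_water=4, low_water=1):
        self.min_servers = min_servers
        self.max_servers = max_servers
        self.high_water = high_water  # grow when average backlog exceeds this
        self.low_water = low_water    # shrink when average backlog falls below this
        # Each server is modeled only by its mailbox of pending messages.
        self.servers = [deque() for _ in range(min_servers)]

    def send(self, message):
        """Deliver a request message to the least-loaded server's mailbox."""
        min(self.servers, key=len).append(message)

    def average_backlog(self):
        """Mean number of pending messages per server."""
        return sum(len(q) for q in self.servers) / len(self.servers)

    def rebalance(self):
        """Grow or shrink the fleet by one server based on current demand."""
        load = self.average_backlog()
        if load > self.high_water and len(self.servers) < self.max_servers:
            self.servers.append(deque())
        elif load < self.low_water and len(self.servers) > self.min_servers:
            # Retire the last server and hand its pending messages to the rest.
            retired = self.servers.pop()
            for message in retired:
                self.send(message)

    def step(self):
        """Each server handles at most one pending message; return the replies."""
        return [("done", q.popleft()) for q in self.servers if q]
```

In this sketch, sending ten requests to a one-server fleet and then calling `rebalance()` adds a second server; once the backlog has drained, another `rebalance()` retires it.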

Tilera Multicore Processor

In December 2004 I co-founded Tilera Corporation. While at Tilera, I was Lead Architect and designed the architecture of the TILE64 and TILEPRO64 processor families. I also designed our instruction set architecture, exception architecture, and various other subsystems, and I had the opportunity to implement our main processor pipeline. Tilera is full of extremely talented engineers. Although this work was done at a small company, much of it broke new ground, as no one had built a multicore processor at the scale that we had. We explored ideas in designing scalable memory systems, implementing I/O hardware in software, and multicore virtualization. We announced the first TILE processor at [HotChips], I published our work in [IEEE Micro], and our chip implementation was presented at [ISSCC].
Packaged Tilera TILE64 Processor
(more Tilera photos)

Parallel and Parallelizing Dynamic Binary Translator

Using the Raw Microprocessor as a parallel fabric, I worked on creating a parallel dynamic binary translator. This translator takes in x86 Linux binaries and executes them on the Raw Microprocessor. The Raw Microprocessor has a MIPS-like ISA, so cross-architecture dynamic translation was needed. I used the x86 parser out of Valgrind as a basis for this translator. The translator executed the translation step in parallel and speculatively. By doing so, code which had not yet been executed could be pre-translated, thereby saving the first-time translation cost. The translator was also able to take advantage of the parallelism of the Raw Microprocessor. I also used the Raw Microprocessor spatially as a fabric to build different virtual microprocessor topologies. For example, the translator can dynamically trade off cores that are used as data cache for more translation resources. Finally, the translator provided virtual memory via sandboxing on an architecture without any virtual memory. I really enjoyed this project, as I learned how to build a best-of-breed dynamic binary translator and learned about the backend compiler optimizations I needed to implement to enable fast execution. I published this in [CGO].
Block Diagram of Dynamic Binary Translator from x86 to Raw
(slides from CGO)
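
The benefit of speculative pre-translation can be shown with a small Python sketch. It is a toy model under stated assumptions, not the Raw translator's design: the `PROGRAM` graph, the `Translator` class, and the `host_` naming are hypothetical, and speculation runs inline here, whereas the translator described above ran it in parallel on other cores.

```python
# Toy control-flow graph: block name -> (guest "instructions", successor blocks).
# Everything here is hypothetical; it is not the Raw translator's design.
PROGRAM = {
    "entry": (["mov", "cmp"], ["loop", "exit"]),
    "loop": (["add", "jmp"], ["loop", "exit"]),
    "exit": (["ret"], []),
}


class Translator:
    def __init__(self, program):
        self.program = program
        self.code_cache = {}    # block name -> translated host code
        self.demand_misses = 0  # translations that stalled execution

    def _translate(self, block):
        """Translate one guest block and store the result in the code cache."""
        guest_code, _ = self.program[block]
        self.code_cache[block] = [f"host_{op}" for op in guest_code]

    def speculate(self, block):
        """Pre-translate the not-yet-translated successors of `block`."""
        for successor in self.program[block][1]:
            if successor not in self.code_cache:
                self._translate(successor)

    def run(self, path, speculative):
        """Execute blocks along `path`; return how many translations stalled it."""
        for block in path:
            if block not in self.code_cache:
                self.demand_misses += 1  # execution must wait for translation
                self._translate(block)
            if speculative:
                self.speculate(block)    # modeled inline; assumed parallel in practice
        return self.demand_misses
```

For the path `["entry", "loop", "loop", "exit"]`, running without speculation stalls on the first use of each of the three blocks, while running with speculation stalls only on `entry`, because `loop` and `exit` are already in the code cache when execution reaches them.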

Master's Thesis - Comparison of Multicores, ASICs, and FPGAs

I have always been interested in why different computational fabrics are better or worse for particular applications. I have also wondered just how much better an ASIC is than an FPGA and how much better an FPGA is than a microprocessor for implementing a particular algorithm. I have heard many rules of thumb, but have never seen hard numbers. To that end, for my master's thesis I did a quantitative study of the difference in performance and area of implementing bit-level communication algorithms on a microprocessor, tiled multicore, FPGA, and ASIC. The results were a little bit surprising. I found that ASICs provided a 2-3x absolute performance improvement over an FPGA, and FPGAs provided a 2-3x absolute performance improvement over a microprocessor. ASICs and FPGAs really shone when it came to silicon area improvements. I found that ASIC designs used 5-6 orders of magnitude less area than software on a microprocessor, and FPGAs used 2-3 orders of magnitude less area than software on a microprocessor. My [Master's Thesis] has more detailed results, and this work appeared in a shortened form at [FCCM].
Mapping of 802.11a convolutional encoder onto Raw
(Master's Thesis)

Raw Multicore Microprocessor

The Raw Multicore Microprocessor is a 16-core homogeneous multicore processor designed at MIT. The Raw Microprocessor explored many ideas in how to design large-scale multicore processors, different parallel programming paradigms, different parallel compilation techniques, and how best to have multiple cores communicate. I designed the Raw Microprocessor's dynamic networks, many of the testing structures, and parts of the ALU, including a novel population count unit. I contributed to chip verification and, after the chip was fabricated, to chip bringup. I also designed much of the FPGA support logic around the chip, including our FPGA synthesis methodology, our FPGA network interface logic, and the initial test harness, and I contributed to board design. I really enjoyed the Raw design experience, as I learned what it takes to fabricate real chips in academia. I also learned what it is like to work in a group with a large goal. We evaluated the Raw Microprocessor in [ISCA, IEEE Micro, ISSCC], presented Raw at [HotChips], and wrote many more publications.
Die Micrograph of the Raw Multicore Microprocessor
(more Raw photos)

Other Projects

In addition to my primary research, I have had the opportunity to conduct research in my graduate coursework. Several of my graduate courses included semester long research projects:
  • MIT 6.836 - Kickbot: A Spherical Autonomous Robot [paper]
  • MIT 6.892 (6.899) - Keyword Search for Freenet [paper]
Kickbot Spherical Robot
(kickbot paper)


I have been involved in teaching during my graduate career in both formal and informal capacities. I enjoy helping students and find it fulfilling to see the moment of inspiration when they understand a new concept. I have taught recitation sections for the introductory digital logic design course at MIT. I was informally the teaching assistant for the parallel computer architecture course. I helped teach a student-run, month-long robotics course. I have co-supervised four MIT Master's students. I guest lecture on dynamic binary translation in MIT's computer architecture course. I also teach people how to hike, climb, and camp during MIT Outing Club's (hiking club) Winter School.

6.004 Computation Structures

Teaching Assistant, Spring 2009
In this course, we teach sophomore undergraduates digital logic design and introductory computer architecture. I am very passionate about this course because a similar course during my undergraduate studies got me hooked on computer architecture. I always find it amazing how combinational logic can be put together with state-holding elements to produce fully functional state machines, and how state machines can then be put together with a datapath and storage to create programmable computers. I was a Teaching Assistant for this course under the guidance of Professor Steve Ward. In 6.004, I taught two recitation sections which met twice a week, held laboratory hours, and graded exams.

MIT Outing Club (hiking club) Winter School

Instructor 1/2008, 1/2009, 1/2010; Lead Organizer 1/2009
MIT's Outing Club annually teaches a course during Independent Activities Period (January Term) about hiking, camping, climbing, skiing, and enjoying winter in the outdoors. I have taught selected lectures for three years in front of a class of 180 students. Also, during the weekends in January, as a Winter School leader, I lead groups of 8 students on hiking, camping, and climbing trips in White Mountain National Forest. In 2009, I organized the logistics for Winter School.
(photos from the first weekend of Winter School 2010)

6.846 Parallel Computing

Informal Teaching Assistant, Spring 2002
6.846 teaches parallel computer architecture and how to map parallel programs onto different parallel architectures. I was an informal Teaching Assistant for this course in Spring 2002. I created and organized the final project for the course, which involved using the Raw Microprocessor environment to build a parallel implementation of acoustic beamforming. I also gave selected guest lectures in this course on dynamic network design for multicore processors.

6.186 Mobile Autonomous Systems Laboratory

Instructor, 2001
In 2001, I was part of a small team of graduate students who ran an autonomous robotics course. This course is run during MIT's month-long Independent Activities Period. In this course students learn to build autonomous vision-based robots. There is a competition at the end of the month where the robots compete. One of the reasons I got involved with 6.186 is that I was a participant in the Jerry Sanders robotics competition during my undergraduate studies. I thought this was a great experience and I wanted to share my knowledge of building robots and system design.
MIT 6.186 Robot
(class website)


  • Massachusetts Institute of Technology, Cambridge, MA
    Ph.D. in Electrical Engineering and Computer Science, February 2012
    • Thesis: "fos: A Factored Operating System for 1000+ Core Microprocessors"
  • Massachusetts Institute of Technology, Cambridge, MA
    M.S. in Electrical Engineering and Computer Science, September 2002
  • University of Illinois at Urbana-Champaign, Urbana, IL
    B.S. in Electrical Engineering, May 2000
    • Minor: Computer Science

Research Sponsors


A passion of mine is climbing mountains. I enjoy hiking, skiing, rock climbing, camping, and most other outdoor endeavors. I am slowly learning how to be a competent mountaineer. I teach at the MIT Outing Club's (hiking club) annual Winter School where we teach approximately 150-180 students how to hike, climb, camp, and enjoy the outdoors in winter. In January 2009 I was co-organizer of the whole course, but these days, I typically just help out by teaching lectures and leading trips. Following are some selected adventures.

Grand Teton

In July 2010, I climbed the Grand Teton. We took the Owen-Spalding route with the Wittich Crack (5.6) variation. My climbing partner for this trip was Patrick Lam. On our first summit attempt day, a thunderstorm rolled in while we were on the upper mountain. We successfully retreated, but others were not as lucky. We successfully summited two days later.
Dave standing near the Black Dike on the Grand Teton with Middle Teton in the background
(more photos)

Winter Katahdin

In the first week of March 2010, four other intrepid souls and I decided we should climb Mt. Katahdin in Maine. Unfortunately, we decided to do this during the end of a Nor'easter (the winter version of a tropical storm). The approach to Mt. Katahdin in the winter is very long and required us to cross-country ski in while pulling sleds (pulks). Due to the fresh 30 inches of snow on top of the 100 inches of pre-existing snow, the avalanche danger was quite high, so we ended up sticking to ridge routes, which was not our original plan. We successfully summited Hamlin and Baxter Peaks (Baxter is the high point) of the Katahdin Massif. We also enjoyed some downhill skiing, ice climbing, and backcountry winter camping.
Group photo in front of the Mt. Katahdin, Baxter Peak sign
(more photos)

Mt. Rainier

In July 2009 Aaron Yahr, Patrick Lam, and I successfully summited Mt. Rainier via the Emmons Glacier route. This was the second attempt for me after an unsuccessful summit attempt due to weather in the summer of 2008. Mt. Rainier is a great mountaineering adventure that I highly recommend. It is heavily glaciated, and the Emmons Glacier route requires good route-finding skills in order to not get lost. One lesson we learned from this trip is not to try to climb too high too fast. We tried to summit the day after we arrived at high camp. This was a bad idea, as we were not acclimatized. We turned back to high camp and summited the next day.
Dave looking lost at altitude on Mt. Rainier (approx. 11,500 ft)
(more photos)

New Hampshire 4000 Footers

A common goal for avid hikers in New England is to hike all forty-eight mountains in New Hampshire which are taller than 4000 ft. above sea level. The AMC even has a club devoted to this. I completed hiking all of the 4000 footers in July 2007. I think such clubs or games are a good idea because they encourage avid hikers to hike mountains besides the "beautiful" ones, which evens out trail wear. I am now working on hiking all of the NH 4000 footers in the winter, a much harder goal.
Dave on the summit of Mt. Waumbek, his last 4000 ft. peak
(more photos)


In Fall 2008 I became interested in answering a simple question: "How much is a click on a web advertisement really worth?" To that end, I have created several tourism websites to explore Internet advertising and to understand how search engines operate. I am also the website maintainer for the MIT Outing Club (hiking club).