Category Archives: Computer

Systems In today’s computers can predict chemical reaction products

When organic chemists identify a useful chemical compound — a new drug, for instance — it’s up to chemical engineers to determine how to mass-produce it.

There could be 100 different sequences of reactions that yield the same end product. But some of them use cheaper reagents and lower temperatures than others, and perhaps most importantly, some are much easier to run continuously, with technicians occasionally topping up reagents in different reaction chambers.

Historically, determining the most efficient and cost-effective way to produce a given molecule has been as much art as science. But MIT researchers are trying to put this process on a more secure empirical footing, with a computer system that’s trained on thousands of examples of experimental reactions and that learns to predict what a reaction’s major products will be.

The researchers’ work appears in the American Chemical Society’s journal Central Science. Like all machine-learning systems, theirs presents its results in terms of probabilities. In tests, the system was able to predict a reaction’s major product 72 percent of the time; 87 percent of the time, it ranked the major product among its three most likely results.

“There’s clearly a lot understood about reactions today,” says Klavs Jensen, the Warren K. Lewis Professor of Chemical Engineering at MIT and one of four senior authors on the paper, “but it’s a highly evolved, acquired skill to look at a molecule and decide how you’re going to synthesize it from starting materials.”

With the new work, Jensen says, “the vision is that you’ll be able to walk up to a system and say, ‘I want to make this molecule.’ The software will tell you the route you should make it from, and the machine will make it.”

With a 72 percent chance of identifying a reaction’s chief product, the system is not yet ready to anchor the type of completely automated chemical synthesis that Jensen envisions. But it could help chemical engineers more quickly converge on the best sequence of reactions — and possibly suggest sequences that they might not otherwise have investigated.

Jensen is joined on the paper by first author Connor Coley, a graduate student in chemical engineering; William Green, the Hoyt C. Hottel Professor of Chemical Engineering, who, with Jensen, co-advises Coley; Regina Barzilay, the Delta Electronics Professor of Electrical Engineering and Computer Science; and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science.

Acting locally

A single organic molecule can consist of dozens and even hundreds of atoms. But a reaction between two such molecules might involve only two or three atoms, which break their existing chemical bonds and form new ones. Thousands of reactions between hundreds of different reagents will often boil down to a single, shared reaction between the same pair of “reaction sites.”

A large organic molecule, however, might have multiple reaction sites, and when it meets another large organic molecule, only one of the several possible reactions between them will actually take place. This is what makes automatic reaction-prediction so tricky.

In the past, chemists have built computer models that characterize reactions in terms of interactions at reaction sites. But they frequently require the enumeration of exceptions, which have to be researched independently and coded by hand. The model might declare, for instance, that if molecule A has reaction site X, and molecule B has reaction site Y, then X and Y will react to form group Z — unless molecule A also has reaction sites P, Q, R, S, T, U, or V.

It’s not uncommon for a single model to require more than a dozen enumerated exceptions. And discovering these exceptions in the scientific literature and adding them to the models is a laborious task, which has limited the models’ utility.

One of the chief goals of the MIT researchers’ new system is to circumvent this arduous process. Coley and his co-authors began with 15,000 empirically observed reactions reported in U.S. patent filings. However, because the machine-learning system had to learn what reactions wouldn’t occur, as well as those that would, examples of successful reactions weren’t enough.

Negative examples

So for every pair of molecules in one of the listed reactions, Coley also generated a battery of additional possible products, based on the molecules’ reaction sites. He then fed descriptions of reactions, together with his artificially expanded lists of possible products, to an artificial intelligence system known as a neural network, which was tasked with ranking the possible products in order of likelihood.

From this training, the network essentially learned a hierarchy of reactions — which interactions at what reaction sites tend to take precedence over which others — without the laborious human annotation.

Other characteristics of a molecule can affect its reactivity. The atoms at a given reaction site may, for instance, have different charge distributions, depending on what other atoms are around them. And the physical shape of a molecule can render a reaction site difficult to access. So the MIT researchers’ model also includes numerical measures of both these features.

According to Richard Robinson, a chemical-technologies researcher at the drug company Novartis, the MIT researchers’ system “offers a different approach to machine learning within the field of targeted synthesis, which in the future could transform the practice of experimental design to targeted molecules.”

“Currently we rely heavily on our own retrosynthetic training, which is aligned with our own personal experiences and augmented with reaction-database search engines,” Robinson says. “This serves us well but often still results in a significant failure rate. Even highly experienced chemists are often surprised. If you were to add up all the cumulative synthesis failures as an industry, this would likely relate to a significant time and cost investment. What if we could improve our success rate?”

The MIT researchers, Robinson says, “have cleverly demonstrated a novel approach to achieve higher predictive reaction performance over conventional approaches. By augmenting the reported literature with negative reaction examples, the data set has more value.”

3-D Chip Has combined computing and data storage in Computer

As embedded intelligence is finding its way into ever more areas of our lives, fields ranging from autonomous driving to personalized medicine are generating huge amounts of data. But just as the flood of data is reaching massive proportions, the ability of computer chips to process it into useful information is stalling.

Now, researchers at Stanford University and MIT have built a new chip to overcome this hurdle. The results are published today in the journal Nature, by lead author Max Shulaker, an assistant professor of electrical engineering and computer science at MIT. Shulaker began the work as a PhD student alongside H.-S. Philip Wong and his advisor Subhasish Mitra, professors of electrical engineering and computer science at Stanford. The team also included professors Roger Howe and Krishna Saraswat, also from Stanford.

Computers today comprise different chips cobbled together. There is a chip for computing and a separate chip for data storage, and the connections between the two are limited. As applications analyze increasingly massive volumes of data, the limited rate at which data can be moved between different chips is creating a critical communication “bottleneck.” And with limited real estate on the chip, there is not enough room to place them side-by-side, even as they have been miniaturized (a phenomenon known as Moore’s Law).

To make matters worse, the underlying devices, transistors made from silicon, are no longer improving at the historic rate that they have for decades.

The new prototype chip is a radical change from today’s chips. It uses multiple nanotechnologies, together with a new computer architecture, to reverse both of these trends.

Instead of relying on silicon-based devices, the chip uses carbon nanotubes, which are sheets of 2-D graphene formed into nanocylinders, and resistive random-access memory (RRAM) cells, a type of nonvolatile memory that operates by changing the resistance of a solid dielectric material. The researchers integrated over 1 million RRAM cells and 2 million carbon nanotube field-effect transistors, making the most complex nanoelectronic system ever made with emerging nanotechnologies.

The RRAM and carbon nanotubes are built vertically over one another, making a new, dense 3-D computer architecture with interleaving layers of logic and memory. By inserting ultradense wires between these layers, this 3-D architecture promises to address the communication bottleneck.

However, such an architecture is not possible with existing silicon-based technology, according to the paper’s lead author, Max Shulaker, who is a core member of MIT’s Microsystems Technology Laboratories. “Circuits today are 2-D, since building conventional silicon transistors involves extremely high temperatures of over 1,000 degrees Celsius,” says Shulaker. “If you then build a second layer of silicon circuits on top, that high temperature will damage the bottom layer of circuits.”

The key in this work is that carbon nanotube circuits and RRAM memory can be fabricated at much lower temperatures, below 200 C. “This means they can be built up in layers without harming the circuits beneath,” Shulaker says.

This provides several simultaneous benefits for future computing systems. “The devices are better: Logic made from carbon nanotubes can be an order of magnitude more energy-efficient compared to today’s logic made from silicon, and similarly, RRAM can be denser, faster, and more energy-efficient compared to DRAM,” Wong says, referring to a conventional memory known as dynamic random-access memory.

“In addition to improved devices, 3-D integration can address another key consideration in systems: the interconnects within and between chips,” Saraswat adds.

“The new 3-D computer architecture provides dense and fine-grained integration of computating and data storage, drastically overcoming the bottleneck from moving data between chips,” Mitra says. “As a result, the chip is able to store massive amounts of data and perform on-chip processing to transform a data deluge into useful information.”

To demonstrate the potential of the technology, the researchers took advantage of the ability of carbon nanotubes to also act as sensors. On the top layer of the chip they placed over 1 million carbon nanotube-based sensors, which they used to detect and classify ambient gases.

Due to the layering of sensing, data storage, and computing, the chip was able to measure each of the sensors in parallel, and then write directly into its memory, generating huge bandwidth, Shulaker says.

Three-dimensional integration is the most promising approach to continue the technology scaling path set forth by Moore’s laws, allowing an increasing number of devices to be integrated per unit volume, according to Jan Rabaey, a professor of electrical engineering and computer science at the University of California at Berkeley, who was not involved in the research.

“It leads to a fundamentally different perspective on computing architectures, enabling an intimate interweaving of memory and logic,” Rabaey says. “These structures may be particularly suited for alternative learning-based computational paradigms such as brain-inspired systems and deep neural nets, and the approach presented by the authors is definitely a great first step in that direction.”

“One big advantage of our demonstration is that it is compatible with today’s silicon infrastructure, both in terms of fabrication and design,” says Howe.

“The fact that this strategy is both CMOS [complementary metal-oxide-semiconductor] compatible and viable for a variety of applications suggests that it is a significant step in the continued advancement of Moore’s Law,” says Ken Hansen, president and CEO of the Semiconductor Research Corporation, which supported the research. “To sustain the promise of Moore’s Law economics, innovative heterogeneous approaches are required as dimensional scaling is no longer sufficient. This pioneering work embodies that philosophy.”

The team is working to improve the underlying nanotechnologies, while exploring the new 3-D computer architecture. For Shulaker, the next step is working with Massachusetts-based semiconductor company Analog Devices to develop new versions of the system that take advantage of its ability to carry out sensing and data processing on the same chip.

So, for example, the devices could be used to detect signs of disease by sensing particular compounds in a patient’s breath, says Shulaker.

“The technology could not only improve traditional computing, but it also opens up a whole new range of applications that we can target,” he says. “My students are now investigating how we can produce chips that do more than just computing.”

“This demonstration of the 3-D integration of sensors, memory, and logic is an exceptionally innovative development that leverages current CMOS technology with the new capabilities of carbon nanotube field–effect transistors,” says Sam Fuller, CTO emeritus of Analog Devices, who was not involved in the research. “This has the potential to be the platform for many revolutionary applications in the future.”

This work was funded by the Defense Advanced Research Projects Agency, the National Science Foundation, Semiconductor Research Corporation, STARnet SONIC, and member companies of the Stanford SystemX Alliance.

Using chip memory is more efficient and Faster

For decades, computer chips have increased efficiency by using “caches,” small, local memory banks that store frequently used data and cut down on time- and energy-consuming communication with off-chip memory.

Today’s chips generally have three or even four different levels of cache, each of which is more capacious but slower than the last. The sizes of the caches represent a compromise between the needs of different kinds of programs, but it’s rare that they’re exactly suited to any one program.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have designed a system that reallocates cache access on the fly, to create new “cache hierarchies” tailored to the needs of particular programs.

The researchers tested their system on a simulation of a chip with 36 cores, or processing units. They found that, compared to its best-performing predecessors, the system increased processing speed by 20 to 30 percent while reducing energy consumption by 30 to 85 percent.

“What you would like is to take these distributed physical memory resources and build application-specific hierarchies that maximize the performance for your particular application,” says Daniel Sanchez, an assistant professor in the Department of Electrical Engineering and Computer Science (EECS), whose group developed the new system.

“And that depends on many things in the application. What’s the size of the data it accesses? Does it have hierarchical reuse, so that it would benefit from a hierarchy of progressively larger memories? Or is it scanning through a data structure, so we’d be better off having a single but very large level? How often does it access data? How much would its performance suffer if we just let data drop to main memory? There are all these different tradeoffs.”

Sanchez and his coauthors — Po-An Tsai, a graduate student in EECS at MIT, and Nathan Beckmann, who was an MIT graduate student when the work was done and is now an assistant professor of computer science at Carnegie Mellon University — presented the new system, dubbed Jenga, at the International Symposium on Computer Architecture last week.

Staying local

For the past 10 years or so, improvements in computer chips’ processing power have come from the addition of more cores. The chips in most of today’s desktop computers have four cores, but several major chipmakers have announced plans to move to six cores in the next year or so, and 16-core processors are not uncommon in high-end servers. Most industry watchers assume that the core count will continue to climb.

Each core in a multicore chip usually has two levels of private cache. All the cores share a third cache, which is actually broken up into discrete memory banks scattered around the chip. Some new chips also include a so-called DRAM cache, which is etched into a second chip that is mounted on top of the first.

For a given core, accessing the nearest memory bank of the shared cache is more efficient than accessing more distant cores. Unlike today’s cache management systems, Jenga distinguishes between the physical locations of the separate memory banks that make up the shared cache. For each core, Jenga knows how long it would take to retrieve information from any on-chip memory bank, a measure known as “latency.”

Jenga builds on an earlier system from Sanchez’s group, called Jigsaw, which also allocated cache access on the fly. But Jigsaw didn’t build cache hierarchies, which makes the allocation problem much more complex.

For every task running on every core, Jigsaw had to calculate a latency-space curve, which indicated how much latency the core could expect with caches of what size. It then had to aggregate all those curves to find a space allocation that minimized latency for the chip as a whole.

Curves to surfaces

But Jenga has to evaluate the tradeoff between latency and space for two layers of cache simultaneously, which turns the two-dimensional latency-space curve into a three-dimensional surface. Fortunately, that surface turns out to be fairly smooth: It may undulate, but it usually won’t have sudden, narrow spikes and dips.

That means that sampling points on the surface will give a pretty good sense of what the surface as a whole looks like. The researchers developed a clever sampling algorithm tailored to the problem of cache allocation, which systematically increases the distances between sampled points. “The insight here is that caches with similar capacities — say, 100 megabytes and 101 megabytes — usually have similar performance,” Tsai says. “So a geometrically increased sequence captures the full picture quite well.”

Once it has deduced the shape of the surface, Jenga finds the path across it that minimizes latency. Then it extracts the component of that path contributed by the first level of cache, which is a 2-D curve. At that point, it can reuse Jigsaw’s space-allocation machinery.

In experiments, the researchers found that this approach yielded an aggregate space allocation that was, on average, within 1 percent of that produced by a full-blown analysis of the 3-D surface, which would be prohibitively time consuming. Adopting the computational short cut enables Jenga to update its memory allocations every 100 milliseconds, to accommodate changes in programs’ memory-access patterns.

End run

Jenga also features a data-placement procedure motivated by the increasing popularity of DRAM cache. Because they’re close to the cores accessing them, most caches have virtually no bandwidth restrictions: They can deliver and receive as much data as a core needs. But sending data longer distances requires more energy, and since DRAM caches are off-chip, they have lower data rates.

If multiple cores are retrieving data from the same DRAM cache, this can cause bottlenecks that introduce new latencies. So after Jenga has come up with a set of cache assignments, cores don’t simply dump all their data into the nearest available memory bank. Instead, Jenga parcels out the data a little at a time, then estimates the effect on bandwidth consumption and latency. Thus, even within the 100-millisecond intervals between chip-wide cache re-allocations, Jenga adjusts the priorities that each core gives to the memory banks allocated to it.

“There’s been a lot of work over the years on the right way to design a cache hierarchy,” says David Wood, a professor of computer science at the University of Wisconsin at Madison. “There have been a number of previous schemes that tried to do some kind of dynamic creation of the hierarchy. Jenga is different in that it really uses the software to try to characterize what the workload is and then do an optimal allocation of the resources between the competing processes. And that, I think, is fundamentally more powerful than what people have been doing before. That’s why I think it’s really interesting.”

Digital design is very wide

Virtually any modern information-capture device — such as a camera, audio recorder, or telephone — has an analog-to-digital converter in it, a circuit that converts the fluctuating voltages of analog signals into strings of ones and zeroes.

Almost all commercial analog-to-digital converters (ADCs), however, have voltage limits. If an incoming signal exceeds that limit, the ADC either cuts it off or flatlines at the maximum voltage. This phenomenon is familiar as the pops and skips of a “clipped” audio signal or as “saturation” in digital images — when, for instance, a sky that looks blue to the naked eye shows up on-camera as a sheet of white.

Last week, at the International Conference on Sampling Theory and Applications, researchers from MIT and the Technical University of Munich presented a technique that they call unlimited sampling, which can accurately digitize signals whose voltage peaks are far beyond an ADC’s voltage limit.

The consequence could be cameras that capture all the gradations of color visible to the human eye, audio that doesn’t skip, and medical and environmental sensors that can handle both long periods of low activity and the sudden signal spikes that are often the events of interest.

The paper’s chief result, however, is theoretical: The researchers establish a lower bound on the rate at which an analog signal with wide voltage fluctuations should be measured, or “sampled,” in order to ensure that it can be accurately digitized. Their work thus extends one of the several seminal results from longtime MIT Professor Claude Shannon’s groundbreaking 1948 paper “A Mathematical Theory of Communication,” the so-called Nyquist-Shannon sampling theorem.

Ayush Bhandari, a graduate student in media arts and sciences at MIT, is the first author on the paper, and he’s joined by his thesis advisor, Ramesh Raskar, an associate professor of media arts and sciences, and Felix Krahmer, an assistant professor of mathematics at the Technical University of Munich.


The researchers’ work was inspired by a new type of experimental ADC that captures not the voltage of a signal but its “modulo.” In the case of the new ADCs, the modulo is the remainder produced when the voltage of an analog signal is divided by the ADC’s maximum voltage.

“The idea is very simple,” Bhandari says. “If you have a number that is too big to store in your computer memory, you can take the modulo of the number. The act of taking the modulo is just to store the remainder.”

“The modulo architecture is also called the self-reset ADC,” Bhandari explains. “By self-reset, what it means is that when the voltage crosses some threshold, it resets, which is actually implementing a modulo. The self-reset ADC sensor was proposed in electronic architecture a couple years back, and ADCs that have this capability have been prototyped.”

One of those prototypes was designed to capture information about the firing of neurons in the mouse brain. The baseline voltage across a neuron is relatively low, and the sudden voltage spikes when the neuron fires are much higher. It’s difficult to build a sensor that is sensitive enough to detect the baseline voltage but won’t saturate during spikes.

When a signal exceeds the voltage limit of a self-reset ADC, it’s cut off, and it starts over again at the circuit’s minimum voltage. Similarly, if the signal drops below the circuit’s minimum voltage, it’s reset to the maximum voltage. If the signal’s peak voltage is several times the voltage limit, the signal can thus wrap around on itself again and again.

This poses a problem for digitization. Digitization is the process of sampling an analog signal — essentially, making many discrete measurements of its voltage. The Nyquist-Shannon theorem establishes the number of measurements required to ensure that the signal can be accurately reconstructed.

But existing sampling algorithms assume that the signal varies continuously up and down. If, in fact, the signal from a self-reset ADC is sampled right before it exceeds the maximum, and again right after the circuit resets, it looks to the standard sampling algorithm like a signal whose voltage decreases between the two measurements, rather than one whose voltage increases.

Big mistakes

Bhandari and his colleagues were interested in the theoretical question of how many samples are required to resolve that ambiguity, and the practical question of how to reconstruct the original signal. They found that the number of samples dictated by the Nyquist-Shannon theorem, multiplied by pi and by Euler’s number e, or roughly 8.5, would guarantee faithful reconstruction.

The researchers’ reconstruction algorithm relies on some clever mathematics. In a self-reset ADC, the voltage sampled after a reset is the modulo of the true voltage. Recovering the true voltage is thus a matter of adding some multiple of the ADC’s maximum voltage — call it M — to the sampled value. What that multiple should be, however — M, 2M, 5M, 10M — is unknown.

The most basic principle in calculus is that of the derivative, which provides a formula for calculating the slope of a curve at any given point. In computer science, however, derivatives are often approximated arithmetically. Suppose, for instance, that you have a series of samples from an analog signal. Take the difference between samples 1 and 2, and store it. Then take the difference between samples 2 and 3, and store that, then 3 and 4, and so on. The end result will be a string of values that approximate the derivative of the sampled signal.

The derivative of the true signal to a self-reset ADC is thus equal to the derivative of its modulo plus the derivative of a bunch of multiples of the threshold voltage — the Ms, 2Ms, 5Ms, and so on. But the derivative of the M-multiples is itself always a string of M-multiples, because taking the difference between two consecutive M-multiples will always yield another M-multiple.

Now, if you take the modulo of both derivatives, all the M-multiples disappear, since they leave no remainder when divided by M. The modulo of the derivative of the true signal is thus equivalent to the modulo of the derivative of the modulo signal.

Inverting the derivative is also one of the most basic operations in calculus, but deducing the original signal does require adding in an M-multiple whose value has to be inferred. Fortunately, using the wrong M-multiple will yield signal voltages that are wildly implausible. The researchers’ proof of their theoretical result involved an argument about the number of samples necessary to guarantee that the correct M-multiple can be inferred.

“If you have the wrong constant, then the constant has to be wrong by a multiple of M,” Krahmer says. “So if you invert the derivative, that adds up very quickly. One sample will be correct, the next sample will be wrong by M, the next sample will be wrong by 2M, and so on. We need to set the number of samples to make sure that if we have the wrong answer in the previous step, our reconstruction would grow so large that we know it can’t be correct.”

“Unlimited sampling is an intriguing concept that addresses the important and real issue of saturation in analog-to-digital converters,” says Richard Baraniuk, a professor of electrical and computer engineering at Rice University and one of the co-inventors of the single-pixel camera. “It is promising that the computations required to recover the signal from modulo measurements are practical with today’s hardware. Hopefully this concept will spur the development of the kind of sampling hardware needed to make unlimited sampling a reality.”

Designing microstructures

Today’s 3-D printers have a resolution of 600 dots per inch, which means that they could pack a billion tiny cubes of different materials into a volume that measures just 1.67 cubic inches.

Such precise control of printed objects’ microstructure gives designers commensurate control of the objects’ physical properties — such as their density or strength, or the way they deform when subjected to stresses. But evaluating the physical effects of every possible combination of even just two materials, for an object consisting of tens of billions of cubes, would be prohibitively time consuming.

So researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new design system that catalogues the physical properties of a huge number of tiny cube clusters. These clusters can then serve as building blocks for larger printable objects. The system thus takes advantage of physical measurements at the microscopic scale, while enabling computationally efficient evaluation of macroscopic designs.

“Conventionally, people design 3-D prints manually,” says Bo Zhu, a postdoc at CSAIL and first author on the paper. “But when you want to have some higher-level goal — for example, you want to design a chair with maximum stiffness or design some functional soft [robotic] gripper — then intuition or experience is maybe not enough. Topology optimization, which is the focus of our paper, incorporates the physics and simulation in the design loop. The problem for current topology optimization is that there is a gap between the hardware capabilities and the software. Our algorithm fills that gap.”

Zhu and his MIT colleagues presented their work this week at Siggraph, the premier graphics conference. Joining Zhu on the paper are Wojciech Matusik, an associate professor of electrical engineering and computer science; Mélina Skouras, a postdoc in Matusik’s group; and Desai Chen, a graduate student in electrical engineering and computer science.

Points in space

The MIT researchers begin by defining a space of physical properties, in which any given microstructure will assume a particular location. For instance, there are three standard measures of a material’s stiffness: One describes its deformation in the direction of an applied force, or how far it can be compressed or stretched; one describes its deformation in directions perpendicular to an applied force, or how much its sides bulge when it’s squeezed or contract when it’s stretched; and the third measures its response to shear, or a force that causes different layers of the material to shift relative to each other.

Those three measures define a three-dimensional space, and any particular combination of them defines a point in that space.

In the jargon of 3-D printing, the microscopic cubes from which an object is assembled are called voxels, for volumetric pixels; they’re the three-dimensional analogue of pixels in a digital image. The building blocks from which Zhu and his colleagues assemble larger printable objects are clusters of voxels.

In their experiments, the researchers considered clusters of three different sizes — 16, 32, and 64 voxels to a face. For a given set of printable materials, they randomly generate clusters that combine those materials in different ways: a square of material A at the cluster’s center, a border of vacant voxels around that square, material B at the corners, or the like. The clusters must be printable, however; it wouldn’t be possible to print a cluster that, say, included a cube of vacant voxels with a smaller cube of material floating at its center.

For each new cluster, the researchers evaluate its physical properties using physics simulations, which assign it a particular point in the space of properties.

Gradually, the researchers’ algorithm explores the entire space of properties, through both random generation of new clusters and the principled modification of clusters whose properties are known. The end result is a cloud of points that defines the space of printable clusters.

Establishing boundaries

The next step is to calculate a function called the level set, which describes the shape of the point cloud. This enables the researchers’ system to mathematically determine whether a cluster with a particular combination of properties is printable or not.

The final step is the optimization of the object to be printed, using software custom-developed by the researchers. That process will result in specifications of material properties for tens or even hundreds of thousands of printable clusters. The researchers’ database of evaluated clusters may not contain exact matches for any of those specifications, but it will contain clusters that are extremely good approximations.

“The design and discovery of structures to produce materials and objects with exactly specified functional properties is central for a large number of applications where mechanical properties are important, such as in the automotive or aerospace industries,” says Bernd Bickel, an assistant professor of computer science at the Institute of Science and Technology Austria and head of the institute’s Computer Graphics and Digital Fabrication group. “Due to the complexity of these structures, which, in the case of 3-D printing, can consist of more than a trillion material droplets, exploring them manually is absolutely intractable.”

“The solution presented by Bo and colleagues addresses this problem in a very clever way, by reformulating it,” he says. “Instead of working directly on the scale of individual droplets, they first precompute the behavior of small structures and put it in a database. Leveraging this knowledge, they can perform the actual optimization on a coarser level, allowing them to very efficiently generate high-resolution printable structures with more than a trillion elements, even with just a regular computer. This opens up exciting new avenues for designing and optimizing structures at a resolution that was out of reach so far.”

The MIT researchers’ work was supported by the U.S. Defense Advanced Research Projects Agency’s SIMPLEX program.

Phytoplankton and chips on Computer

Microbes mediate the global marine cycles of elements, modulating atmospheric carbon dioxide and helping to maintain the oxygen we all breathe, yet there is much about them scientists still don’t understand. Now, an award from the Simons Foundation will give researchers from MIT’s Darwin Project access to bigger, better computing resources to model these communities and probe how they work.

The simulations of plankton populations made by Darwin Project researchers have become increasingly computationally demanding. MIT Professor Michael “Mick” Follows and Principal Research Engineer Christopher Hill, both affiliates of the Darwin Project, were therefore delighted to learn of their recent Simons Foundation award, providing them with enhanced compute infrastructure to help execute the simulations of ocean circulation, biogeochemical cycles, and microbial population dynamics that are the bread and butter of their research.

The Darwin Project, an alliance between oceanographers and microbiologists in the MIT Department of Earth, Atmospheric and Planetary Sciences (EAPS) and the Parsons Lab in the MIT Department of Civil and Environmental Engineering, was conceived as an initiative to “advance the development and application of novel models of marine microbes and microbial communities, identifying the relationships of individuals and communities to their environment, connecting cellular-scale processes to global microbial community structure” with the goal of coupling “state of the art physical models of global ocean circulation with biogeochemistry and genome-informed models of microbial processes.”

In response to increases in model complexity and resolution over the course of past decade since the project’s inception in 2007, computational demands have ballooned. Increased fidelity and algorithmic sophistication in both biological and fluid dynamical component models and forays into new statistical analysis approaches, leveraging big-data innovations to analyze the simulations and field data, have grown inexorably.

“The award allows us to grow our in-house computational and data infrastructure to accelerate and facilitate these new modeling capabilities,” says Hill, who specializes in Earth and planetary computational science.

The boost in computational infrastructure the award provides for will advance several linked areas of research, including the capacity to model marine microbial systems in more detail, enhanced fidelity of the modeled fluid dynamical environment, support for state of the art data analytics including machine learning techniques, and accelerating and extending genomic data processing capabilities.

High diversity is a ubiquitous aspect of marine microbial communities that is not fully understood and, to date, is rarely resolved in simulations. Darwin Project researchers have broken new ground and continue to push the envelope in modeling in this area: In addition to resolving a much larger number of phenotypes and interactions than has typically been attempted by other investigators, the Darwin Project team has also been increasing the fidelity of the underlying physiological sub-models which define traits and trade-offs.

“One thing we are doing is implementing simplified metabolic models which resolve additional constraints [electron and energy conservation] and higher fidelity [dynamic representations of macro-molecular and elemental composition],” says Fellows. “These advances require more state variables per phenotype. We have also an explicit radiative transfer model that allows us to better exploit satellite remote sensing data but both come at a greater computational expense.” Darwin researchers are also expanding their models to resolve not only phototrophic and grazer communities in the surface ocean, but to include heterotrophic and chemo-autotrophic populations throughout the water column.

Follows and Hill believe these advances will provide better fidelity to real world observations, a more dynamic and fundamental description of marine microbial communities and biogeochemical cycles, and the potential to examine the underlying drivers and significance of diversity in the system.

“Much of the biological action in the surface ocean occurs at scales currently unresolved in most biogeochemical simulations,” Follows explains. “Numerical models and recent observations show that the sub-mesoscale motions in the ocean have a profound impact on the supply of resources to the surface and the dispersal and communication between different populations. The integral impact of this, and how to properly parameterize it, is not yet clear, but one approach, that is within reach, is to resolve these scales of motion nested within global simulations,”

Hill and Follows hope such advances will allow them to examine both local and regionally integrated effects of fine-scale physical drivers. “We have already completed a full annual cycle numerical simulation that resolves physical processes down to kilometer scales globally,” says Hill. “Such simulations provide a basis for driving targeted modeling of, for example, the role of fronts that may involve fully non-hydrostatic dynamics and that could help explain in-situ measurements that suggest enhanced growth rates under such conditions.” Such work is strongly complementary to another Simons Foundation sponsored project, the Simons Collaboration on Ocean Processes and Ecology (SCOPE). As an initiative to advance our understanding of the biology, ecology, and biogeochemistry of microbial processes that dominate Earth’s largest biome — the global ocean — SCOPE seeks to measure, model, and conduct experiments at a model ecosystem site located 100 km north of the Hawaiian island of Oahu that is representative of a large portion of the North Pacific Ocean.

The team has also already implemented algorithms to enable explicit modeling of the relevant fluid dynamics, but here too, the approaches are computationally demanding. “The improved facilities this award provides will enable these extremely demanding experiments to proceed,” says Follows.

Enhanced computer resources will also allow Darwin Project researchers to more effectively utilize data analytics. “We are adopting multiple statistical approaches for classifying fluid dynamical and ecosystem features in observations and in simulations which we plan to apply to biogeochemical problems,” says Hill. “One current direction, which employs random forest classification to identify features corresponding to training sets, is showing particular promise for objectively quantifying links between biogeochemical event occurrence and physical environment phenomena.”

Not only will these methods provide useful analysis tools for their simulations, the pair also see them bridging to real world interpretations of, for example, metagenomics surveys in the ocean. Follows and Hill see this direction as a route by which to bring simulations and observations closer in new and meaningful ways. The growth in computational infrastructure the Simons award allows for, creates the potential for making much larger queries across more realistic datasets.

The Darwin Project is part of a long and fruitful collaboration with Institute Professor Sally “Penny” Chisholm of MIT’s Department of Civil and Environmental Engineering. Steady growth in available large-scale metagenomic and single-cell genomic data resulting from genetics data activities in the Chisholm Lab are also driving additional computational processing resource needs.

With the new Simons-supported enhancements in computational infrastructure, Darwin Project collaborators in the Chisholm Lab will be able to tackle assembly from larger metagenomic libraries and single-cell genome phylogenies using maximum likelihood and/or Bayesian algorithms. Currently, some large metagenomics assembly activities require compute resources with more memory than this team has readily had available. “Single-cell genome phylogeny activities are computationally demanding and require dedicating compute resources for weeks or months at a time, Hill explains. “This creates a bottleneck for other work. To accelerate work in these areas additional compute resources, some with larger memory than current resources and some with GPU accelerators are going to be hugely beneficial. The new systems will permit larger metagenomics library assembly than is currently possible.”

The new AI algorithm monitors with radio waves

More than 50 million Americans suffer from sleep disorders, and diseases including Parkinson’s and Alzheimer’s can also disrupt sleep. Diagnosing and monitoring these conditions usually requires attaching electrodes and a variety of other sensors to patients, which can further disrupt their sleep.

To make it easier to diagnose and study sleep problems, researchers at MIT and Massachusetts General Hospital have devised a new way to monitor sleep stages without sensors attached to the body. Their device uses an advanced artificial intelligence algorithm to analyze the radio signals around the person and translate those measurements into sleep stages: light, deep, or rapid eye movement (REM).

“Imagine if your Wi-Fi router knows when you are dreaming, and can monitor whether you are having enough deep sleep, which is necessary for memory consolidation,” says Dina Katabi, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science, who led the study. “Our vision is developing health sensors that will disappear into the background and capture physiological signals and important health metrics, without asking the user to change her behavior in any way.”

Katabi worked on the study with Matt Bianchi, chief of the Division of Sleep Medicine at MGH, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science and a member of the Institute for Data, Systems, and Society at MIT. Mingmin Zhao, an MIT graduate student, is the paper’s first author, and Shichao Yue, another MIT graduate student, is also a co-author.

Remote sensing

Katabi and members of her group in MIT’s Computer Science and Artificial Intelligence Laboratory have previously developed radio-based sensors that enable them to remotely measure vital signs and behaviors that can be indicators of health. These sensors consist of a wireless device, about the size of a laptop computer, that emits low-power radio frequency (RF) signals. As the radio waves reflect off of the body, any slight movement of the body alters the frequency of the reflected waves. Analyzing those waves can reveal vital signs such as pulse and breathing rate.

“It’s a smart Wi-Fi-like box that sits in the home and analyzes these reflections and discovers all of these changes in the body, through a signature that the body leaves on the RF signal,” Katabi says.

Katabi and her students have also used this approach to create a sensor called WiGait that can measure walking speed using wireless signals, which could help doctors predict cognitive decline, falls, certain cardiac or pulmonary diseases, or other health problems.

After developing those sensors, Katabi thought that a similar approach could also be useful for monitoring sleep, which is currently done while patients spend the night in a sleep lab hooked up to monitors such as electroencephalography (EEG) machines.

“The opportunity is very big because we don’t understand sleep well, and a high fraction of the population has sleep problems,” says Zhao. “We have this technology that, if we can make it work, can move us from a world where we do sleep studies once every few months in the sleep lab to continuous sleep studies in the home.”

To achieve that, the researchers had to come up with a way to translate their measurements of pulse, breathing rate, and movement into sleep stages. Recent advances in artificial intelligence have made it possible to train computer algorithms known as deep neural networks to extract and analyze information from complex datasets, such as the radio signals obtained from the researchers’ sensor. However, these signals have a great deal of information that is irrelevant to sleep and can be confusing to existing algorithms. The MIT researchers had to come up with a new AI algorithm based on deep neural networks, which eliminates the irrelevant information.

“The surrounding conditions introduce a lot of unwanted variation in what you measure. The novelty lies in preserving the sleep signal while removing the rest,” says Jaakkola. Their algorithm can be used in different locations and with different people, without any calibration.

Using this approach in tests of 25 healthy volunteers, the researchers found that their technique was about 80 percent accurate, which is comparable to the accuracy of ratings determined by sleep specialists based on EEG measurements.

“Our device allows you not only to remove all of these sensors that you put on the person, and make it a much better experience that can be done at home, it also makes the job of the doctor and the sleep technologist much easier,” Katabi says. “They don’t have to go through the data and manually label it.”

Sleep deficiencies

Other researchers have tried to use radio signals to monitor sleep, but these systems are accurate only 65 percent of the time and mainly determine whether a person is awake or asleep, not what sleep stage they are in. Katabi and her colleagues were able to improve on that by training their algorithm to ignore wireless signals that bounce off of other objects in the room and include only data reflected from the sleeping person.

The researchers now plan to use this technology to study how Parkinson’s disease affects sleep.

“When you think about Parkinson’s, you think about it as a movement disorder, but the disease is also associated with very complex sleep deficiencies, which are not very well understood,” Katabi says.

The sensor could also be used to learn more about sleep changes produced by Alzheimer’s disease, as well as sleep disorders such as insomnia and sleep apnea. It may also be useful for studying epileptic seizures that happen during sleep, which are usually difficult to detect.

40 years of internet, the world changed forever

Towards the end of the summer of 1969 – a few weeks after the moon landings, a few days after Woodstock, and a month before the first broadcast of Monty Python’s Flying Circus – a large grey metal box was delivered to the office of Leonard Kleinrock, a professor at the University of California in Los Angeles. It was the same size and shape as a household refrigerator, and outwardly, at least, it had about as much charm. But Kleinrock was thrilled: a photograph from the time shows him standing beside it, in requisite late-60s brown tie and brown trousers, beaming like a proud father.

Had he tried to explain his excitement to anyone but his closest colleagues, they probably wouldn’t have understood. The few outsiders who knew of the box’s existence couldn’t even get its name right: it was an IMP, or “interface message processor”, but the year before, when a Boston company had won the contract to build it, its local senator, Ted Kennedy, sent a telegram praising its ecumenical spirit in creating the first “interfaith message processor”. Needless to say, though, the box that arrived outside Kleinrock’s office wasn’t a machine capable of fostering understanding among the great religions of the world. It was much more important than that.

It’s impossible to say for certain when the internet began, mainly because nobody can agree on what, precisely, the internet is. (This is only partly a philosophical question: it is also a matter of egos, since several of the people who made key contributions are anxious to claim the credit.) But 29 October 1969 – 40 years ago next week – has a strong claim for being, as Kleinrock puts it today, “the day the infant internet uttered its first words”. At 10.30pm, as Kleinrock’s fellow professors and students crowded around, a computer was connected to the IMP, which made contact with a second IMP, attached to a second computer, several hundred miles away at the Stanford Research Institute, and an undergraduate named Charley Kline tapped out a message. Samuel Morse, sending the first telegraph message 125 years previously, chose the portentous phrase: “What hath God wrought?” But Kline’s task was to log in remotely from LA to the Stanford machine, and there was no opportunity for portentousness: his instructions were to type the command LOGIN.

To say that the rest is history is the emptiest of cliches – but trying to express the magnitude of what began that day, and what has happened in the decades since, is an undertaking that quickly exposes the limits of language. It’s interesting to compare how much has changed in computing and the internet since 1969 with, say, how much has changed in world politics. Consider even the briefest summary of how much has happened on the global stage since 1969: the Vietnam war ended; the cold war escalated then declined; the Berlin Wall fell; communism collapsed; Islamic fundamentalism surged. And yet nothing has quite the power to make people in their 30s, 40s or 50s feel very old indeed as reflecting upon the growth of the internet and the world wide web. Twelve years after Charley Kline’s first message on the Arpanet, as it was then known, there were still only 213 computers on the network; but 14 years after that, 16 million people were online, and email was beginning to change the world; the first really usable web browser wasn’t launched until 1993, but by 1995 we had Amazon, by 1998 Google, and by 2001, Wikipedia, at which point there were 513 million people online. Today the figure is more like 1.7 billion.

Unless you are 15 years old or younger, you have lived through the dotcom bubble and bust, the birth of Friends Reunited and Craigslist and eBay and Facebook andTwitter, blogging, the browser wars, Google Earth, filesharing controversies, the transformation of the record industry, political campaigning, activism and campaigning, the media, publishing, consumer banking, the pornography industry, travel agencies, dating and retail; and unless you’re a specialist, you’ve probably only been following the most attention-grabbing developments. Here’s one of countless statistics that are liable to induce feelings akin to vertigo: on New Year’s Day 1994 – only yesterday, in other words – there were an estimated 623 websites. In total. On the whole internet. “This isn’t a matter of ego or crowing,” says Steve Crocker, who was present that day at UCLA in 1969, “but there has not been, in the entire history of mankind, anything that has changed so dramatically as computer communications, in terms of the rate of change.”

Looking back now, Kleinrock and Crocker are both struck by how, as young computer scientists, they were simultaneously aware that they were involved in something momentous and, at the same time, merely addressing a fairly mundane technical problem. On the one hand, they were there because of the Russian Sputnik satellite launch, in 1957, which panicked the American defence establishment, prompting Eisenhower to channel millions of dollars into scientific research, and establishing Arpa, the Advanced Research Projects Agency, to try to win the arms technology race. The idea was “that we would not get surprised again,” said Robert Taylor, the Arpa scientist who secured the money for the Arpanet, persuading the agency’s head to give him a million dollars that had been earmarked for ballistic missile research. With another pioneer of the early internet, JCR Licklider, Taylor co-wrote the paper, “The Computer As A Communication Device”, which hinted at what was to come. “In a few years, men will be able to communicate more effectively through a machine than face to face,” they declared. “That is rather a startling thing to say, but it is our conclusion.”

On the other hand, the breakthrough accomplished that night in 1969 was a decidedly down-to-earth one. The Arpanet was not, in itself, intended as some kind of secret weapon to put the Soviets in their place: it was simply a way to enable researchers to access computers remotely, because computers were still vast and expensive, and the scientists needed a way to share resources. (The notion that the network was designed so that it would survive a nuclear attack is an urban myth, though some of those involved sometimes used that argument to obtain funding.) The technical problem solved by the IMPs wasn’t very exciting, either. It was already possible to link computers by telephone lines, but it was glacially slow, and every computer in the network had to be connected, by a dedicated line, to every other computer, which meant you couldn’t connect more than a handful of machines without everything becoming monstrously complex and costly. The solution, called “packet switching” – which owed its existence to the work of a British physicist, Donald Davies – involved breaking data down into blocks that could be routed around any part of the network that happened to be free, before getting reassembled at the other end.

“I thought this was important, but I didn’t really think it was as challenging as what I thought of as the ‘real research’,” says Crocker, a genial Californian, now 65, who went on to play a key role in the expansion of the internet. “I was particularly fascinated, in those days, by artificial intelligence, and by trying to understand how people think. I thought that was a much more substantial and respectable research topic than merely connecting up a few machines. That was certainly useful, but it wasn’t art.”

Still, Kleinrock recalls a tangible sense of excitement that night as Kline sat down at the SDS Sigma 7 computer, connected to the IMP, and at the same time made telephone contact with his opposite number at Stanford. As his colleagues watched, he typed the letter L, to begin the word LOGIN.

“Have you got the L?” he asked, down the phone line. “Got the L,” the voice at Stanford responded.

Kline typed an O. “Have you got the O?”

“Got the O,” Stanford replied.

Kline typed a G, at which point the system crashed, and the connection was lost. The G didn’t make it through, which meant that, quite by accident, the first message ever transmitted across the nascent internet turned out, after all, to be fittingly biblical:


Frenzied visions of a global conscious brain

One of the most intriguing things about the growth of the internet is this: to a select group of technological thinkers, the surprise wasn’t how quickly it spread across the world, remaking business, culture and politics – but that it took so long to get off the ground. Even when computers were mainly run on punch-cards and paper tape, there were whispers that it was inevitable that they would one day work collectively, in a network, rather than individually. (Tracing the origins of online culture even further back is some people’s idea of an entertaining game: there are those who will tell you that the Talmud, the book of Jewish law, contains a form of hypertext, the linking-and-clicking structure at the heart of the web.) In 1945, the American presidential science adviser, Vannevar Bush, was already imagining the “memex”, a device in which “an individual stores all his books, records, and communications”, which would be linked to each other by “a mesh of associative trails”, like weblinks. Others had frenzied visions of the world’s machines turning into a kind of conscious brain. And in 1946, an astonishingly complete vision of the future appeared in the magazine Astounding Science Fiction. In a story entitled A Logic Named Joe, the author Murray Leinster envisioned a world in which every home was equipped with a tabletop box that he called a “logic”:

“You got a logic in your house. It looks like a vision receiver used to, only it’s got keys instead of dials and you punch the keys for what you wanna get . . . you punch ‘Sally Hancock’s Phone’ an’ the screen blinks an’ sputters an’ you’re hooked up with the logic in her house an’ if somebody answers you got a vision-phone connection. But besides that, if you punch for the weather forecast [or] who was mistress of the White House durin’ Garfield’s administration . . . that comes on the screen too. The relays in the tank do it. The tank is a big buildin’ full of all the facts in creation . . . hooked in with all the other tanks all over the country . . . The only thing it won’t do is tell you exactly what your wife meant when she said, ‘Oh, you think so, do you?’ in that peculiar kinda voice “

Despite all these predictions, though, the arrival of the internet in the shape we know it today was never a matter of inevitability. It was a crucial idiosyncracy of the Arpanet that its funding came from the American defence establishment – but that the millions ended up on university campuses, with researchers who embraced an anti-establishment ethic, and who in many cases were committedly leftwing; one computer scientist took great pleasure in wearing an anti-Vietnam badge to a briefing at the Pentagon. Instead of smothering their research in the utmost secrecy – as you might expect of a cold war project aimed at winning a technological battle against Moscow – they made public every step of their thinking, in documents known as Requests For Comments.

Deliberately or not, they helped encourage a vibrant culture of hobbyists on the fringes of academia – students and rank amateurs who built their own electronic bulletin-board systems and eventually FidoNet, a network to connect them to each other. An argument can be made that these unofficial tinkerings did as much to create the public internet as did the Arpanet. Well into the 90s, by the time the Arpanet had been replaced by NSFNet, a larger government-funded network, it was still the official position that only academic researchers, and those affiliated to them, were supposed to use the network. It was the hobbyists, making unofficial connections into the main system, who first opened the internet up to allcomers.

What made all of this possible, on a technical level, was simultaneously the dullest-sounding and most crucial development since Kleinrock’s first message. This was the software known as TCP/IP, which made it possible for networks to connect to other networks, creating a “network of networks”, capable of expanding virtually infinitely – which is another way of defining what the internet is. It’s for this reason that the inventors of TCP/IP, Vint Cerf and Bob Kahn, are contenders for the title of fathers of the internet, although Kleinrock, understandably, disagrees. “Let me use an analogy,” he says. “You would certainly not credit the birth of aviation to the invention of the jet engine. The Wright Brothers launched aviation. Jet engines greatly improved things.”

The spread of the internet across the Atlantic, through academia and eventually to the public, is a tale too intricate to recount here, though it bears mentioning that British Telecom and the British government didn’t really want the internet at all: along with other European governments, they were in favour of a different networking technology, Open Systems Interconnect. Nevertheless, by July 1992, an Essex-born businessman named Cliff Stanford had opened Demon Internet, Britain’s first commercial internet service provider. Officially, the public still wasn’t meant to be connecting to the internet. “But it was never a real problem,” Stanford says today. “The people trying to enforce that weren’t working very hard to make it happen, and the people working to do the opposite were working much harder.” The French consulate in London was an early customer, paying Demon £10 a month instead of thousands of pounds to lease a private line to Paris from BT.

After a year or so, Demon had between 2,000 and 3,000 users, but they weren’t always clear why they had signed up: it was as if they had sensed the direction of the future, in some inchoate fashion, but hadn’t thought things through any further than that. “The question we always got was: ‘OK, I’m connected – what do I do now?'” Stanford recalls. “It was one of the most common questions on our support line. We would answer with ‘Well, what do you want to do? Do you want to send an email?’ ‘Well, I don’t know anyone with an email address.’ People got connected, but they didn’t know what was meant to happen next.”

Fortunately, a couple of years previously, a British scientist based at Cern, the physics laboratory outside Geneva, had begun to answer that question, and by 1993 his answer was beginning to be known to the general public. What happened next was the web.

The birth of the web

I sent my first email in 1994, not long after arriving at university, from a small, under-ventilated computer room that smelt strongly of sweat. Email had been in existence for decades by then – the @ symbol was introduced in 1971, and the first message, according to the programmer who sent it, Ray Tomlinson, was “something like QWERTYUIOP”. (The test messages, Tomlinson has said, “were entirely forgettable, and I have, therefore, forgotten them”.) But according to an unscientific poll of friends, family and colleagues, 1994 seems fairly typical: I was neither an early adopter nor a late one. A couple of years later I got my first mobile phone, which came with two batteries: a very large one, for normal use, and an extremely large one, for those occasions on which you might actually want a few hours of power. By the time I arrived at the Guardian, email was in use, but only as an add-on to the internal messaging system, operated via chunky beige terminals with green-on-black screens. It took for ever to find the @ symbol on the keyboard, and I don’t remember anything like an inbox, a sent-mail folder, or attachments. I am 34 years old, but sometimes I feel like Methuselah.

I have no recollection of when I first used the world wide web, though it was almost certainly when people still called it the world wide web, or even W3, perhaps in the same breath as the phrase “information superhighway”, made popular by Al Gore. (Or “infobahn”: did any of us really, ever, call the internet the “infobahn”?) For most of us, though, the web is in effect synonymous with the internet, even if we grasp that in technical terms that’s inaccurate: the web is simply a system that sits on top of the internet, making it greatly easier to navigate the information there, and to use it as a medium of sharing and communication. But the distinction rarely seems relevant in everyday life now, which is why its inventor, Tim Berners-Lee, has his own legitimate claim to be the progenitor of the internet as we know it. The first ever website was his own, at CERN:

The idea that a network of computers might enable a specific new way of thinking about information, instead of just allowing people to access the data on each other’s terminals, had been around for as long as the idea of the network itself: it’s there in Vannevar Bush’s memex, and Murray Leinster’s logics. But the grandest expression of it was Project Xanadu, launched in 1960 by the American philosopher Ted Nelson, who imagined – and started to build – a vast repository for every piece of writing in existence, with everything connected to everything else according to a principle he called “transclusion”. It was also, presciently, intended as a method for handling many of the problems that would come to plague the media in the age of the internet, automatically channelling small royalties back to the authors of anything that was linked. Xanadu was a mind-spinning vision – and at least according to an unflattering portrayal by Wired magazine in 1995, over which Nelson threatened to sue, led those attempting to create it into a rabbit-hole of confusion, backbiting and “heart-slashing despair”. Nelson continues to develop Xanadu today, arguing that it is a vastly superior alternative to the web. “WE FIGHT ON,” the Xanadu website declares, sounding rather beleaguered, not least since the declaration is made on a website.

Web browsers crossed the border into mainstream use far more rapidly than had been the case with the internet itself: Mosaic launched in 1993 and Netscape followed soon after, though it was an embarrassingly long time before Microsoft realised the commercial necessity of getting involved at all. Amazon and eBay were online by 1995. And in 1998 came Google, offering a powerful new way to search the proliferating mass of information on the web. Until not too long before Google, it had been common for search or directory websites to boast about how much of the web’s information they had indexed – the relic of a brief period, hilarious in hindsight, when a user might genuinely have hoped to check all the webpages that mentioned a given subject. Google, and others, saw that the key to the web’s future would be helping users exclude almost everything on any given topic, restricting search results to the most relevant pages.

Without most of us quite noticing when it happened, the web went from being a strange new curiosity to a background condition of everyday life: I have no memory of there being an intermediate stage, when, say, half the information I needed on a particular topic could be found online, while the other half still required visits to libraries. “I remember the first time I saw a web address on the side of a truck, and I thought, huh, OK, something’s happening here,” says Spike Ilacqua, who years beforehand had helped found The World, the first commercial internet service provider in the US. Finally, he stopped telling acquaintances that he worked in “computers”, and started to say that he worked on “the internet”, and nobody thought that was strange.

It is absurd – though also unavoidable here – to compact the whole of what happened from then onwards into a few sentences: the dotcom boom, the historically unprecedented dotcom bust, the growing “digital divide”, and then the hugely significant flourishing, over the last seven years, of what became known as Web 2.0. It is only this latter period that has revealed the true capacity of the web for “generativity”, for the publishing of blogs by anyone who could type, for podcasting and video-sharing, for the undermining of totalitarian regimes, for the use of sites such as Twitter and Facebook to create (and ruin) friendships, spread fashions and rumours, or organise political resistance. But you almost certainly know all this: it’s part of what these days, in many parts of the world, we call “just being alive”.

The most confounding thing of all is that in a few years’ time, all this stupendous change will probably seem like not very much change at all. As Crocker points out, when you’re dealing with exponential growth, the distance from A to B looks huge until you get to point C, whereupon the distance between A and B looks like almost nothing; when you get to point D, the distance between B and C looks similarly tiny. One day, presumably, everything that has happened in the last 40 years will look like early throat-clearings — mere preparations for whatever the internet is destined to become. We will be the equivalents of the late-60s computer engineers, in their horn-rimmed glasses, brown suits, and brown ties, strange, period-costume characters populating some dimly remembered past.

The Terminal Bloomberg Professional Services

Creation Myth by Malcolm Gladwell

Xerox PARC, Apple, and the truth about innovation.


In late 1979, a twenty-four-year-old entrepreneur paid a visit to a research center in Silicon Valley called Xerox PARC. He was the co-founder of a small computer startup down the road, in Cupertino. His name was Steve Jobs.

Xerox PARC was the innovation arm of the Xerox Corporation. It was, and remains, on Coyote Hill Road, in Palo Alto, nestled in the foothills on the edge of town, in a long, low concrete building, with enormous terraces looking out over the jewels of Silicon Valley. To the northwest was Stanford University’s Hoover Tower. To the north was Hewlett-Packard’s sprawling campus. All around were scores of the other chip designers, software firms, venture capitalists, and hardware-makers. A visitor to PARC, taking in that view, could easily imagine that it was the computer world’s castle, lording over the valley below—and, at the time, this wasn’t far from the truth. In 1970, Xerox had assembled the world’s greatest computer engineers and programmers, and for the next ten years they had an unparalleled run of innovation and invention. If you were obsessed with the future in the seventies, you were obsessed with Xerox PARC—which was why the young Steve Jobs had driven to Coyote Hill Road.

Apple was already one of the hottest tech firms in the country. Everyone in the Valley wanted a piece of it. So Jobs proposed a deal: he would allow Xerox to buy a hundred thousand shares of his company for a million dollars—its highly anticipated I.P.O. was just a year away—if PARC would “open its kimono.” A lot of haggling ensued. Jobs was the fox, after all, and PARC was the henhouse. What would he be allowed to see? What wouldn’t he be allowed to see? Some at PARC thought that the whole idea was lunacy, but, in the end, Xerox went ahead with it. One PARC scientist recalls Jobs as “rambunctious”—a fresh-cheeked, caffeinated version of today’s austere digital emperor. He was given a couple of tours, and he ended up standing in front of a Xerox Alto, PARC’s prized personal computer.

An engineer named Larry Tesler conducted the demonstration. He moved the cursor across the screen with the aid of a “mouse.” Directing a conventional computer, in those days, meant typing in a command on the keyboard. Tesler just clicked on one of the icons on the screen. He opened and closed “windows,” deftly moving from one task to another. He wrote on an elegant word-processing program, and exchanged e-mails with other people at PARC, on the world’s first Ethernet network. Jobs had come with one of his software engineers, Bill Atkinson, and Atkinson moved in as close as he could, his nose almost touching the screen. “Jobs was pacing around the room, acting up the whole time,” Tesler recalled. “He was very excited. Then, when he began seeing the things I could do onscreen, he watched for about a minute and started jumping around the room, shouting, ‘Why aren’t you doing anything with this? This is the greatest thing. This is revolutionary!’”

Xerox began selling a successor to the Alto in 1981. It was slow and underpowered—and Xerox ultimately withdrew from personal computers altogether. Jobs, meanwhile, raced back to Apple, and demanded that the team working on the company’s next generation of personal computers change course. He wanted menus on the screen. He wanted windows. He wanted a mouse. The result was the Macintosh, perhaps the most famous product in the history of Silicon Valley.

“If Xerox had known what it had and had taken advantage of its real opportunities,” Jobs said, years later, “it could have been as big as I.B.M. plus Microsoft plus Xerox combined—and the largest high-technology company in the world.”

This is the legend of Xerox PARC. Jobs is the Biblical Jacob and Xerox is Esau, squandering his birthright for a pittance. In the past thirty years, the legend has been vindicated by history. Xerox, once the darling of the American high-technology community, slipped from its former dominance. Apple is now ascendant, and the demonstration in that room in Palo Alto has come to symbolize the vision and ruthlessness that separate true innovators from also-rans. As with all legends, however, the truth is a bit more complicated.


After Jobs returned from PARC, he met with a man named Dean Hovey, who was one of the founders of the industrial-design firm that would become known as IDEO. “Jobs went to Xerox PARC on a Wednesday or a Thursday, and I saw him on the Friday afternoon,” Hovey recalled. “I had a series of ideas that I wanted to bounce off him, and I barely got two words out of my mouth when he said, ‘No, no, no, you’ve got to do a mouse.’ I was, like, ‘What’s a mouse?’ I didn’t have a clue. So he explains it, and he says, ‘You know, [the Xerox mouse] is a mouse that cost three hundred dollars to build and it breaks within two weeks. Here’s your design spec: Our mouse needs to be manufacturable for less than fifteen bucks. It needs to not fail for a couple of years, and I want to be able to use it on Formica and my bluejeans.’ From that meeting, I went to Walgreens, which is still there, at the corner of Grant and El Camino in Mountain View, and I wandered around and bought all the underarm deodorants that I could find, because they had that ball in them. I bought a butter dish. That was the beginnings of the mouse.”

I spoke with Hovey in a ramshackle building in downtown Palo Alto, where his firm had started out. He had asked the current tenant if he could borrow his old office for the morning, just for the fun of telling the story of the Apple mouse in the place where it was invented. The room was the size of someone’s bedroom. It looked as if it had last been painted in the Coolidge Administration. Hovey, who is lean and healthy in a Northern California yoga-and-yogurt sort of way, sat uncomfortably at a rickety desk in a corner of the room. “Our first machine shop was literally out on the roof,” he said, pointing out the window to a little narrow strip of rooftop, covered in green outdoor carpeting. “We didn’t tell the planning commission. We went and got that clear corrugated stuff and put it across the top for a roof. We got out through the window.”

He had brought a big plastic bag full of the artifacts of that moment: diagrams scribbled on lined paper, dozens of differently sized plastic mouse shells, a spool of guitar wire, a tiny set of wheels from a toy train set, and the metal lid from a jar of Ralph’s preserves. He turned the lid over. It was filled with a waxlike substance, the middle of which had a round indentation, in the shape of a small ball. “It’s epoxy casting resin,” he said. “You pour it, and then I put Vaseline on a smooth steel ball, and set it in the resin, and it hardens around it.” He tucked the steel ball underneath the lid and rolled it around the tabletop. “It’s a kind of mouse.”

The hard part was that the roller ball needed to be connected to the housing of the mouse, so that it didn’t fall out, and so that it could transmit information about its movements to the cursor on the screen. But if the friction created by those connections was greater than the friction between the tabletop and the roller ball, the mouse would skip. And the more the mouse was used the more dust it would pick up off the tabletop, and the more it would skip. The Xerox PARC mouse was an elaborate affair, with an array of ball bearings supporting the roller ball. But there was too much friction on the top of the ball, and it couldn’t deal with dust and grime.

At first, Hovey set to work with various arrangements of ball bearings, but nothing quite worked. “This was the ‘aha’ moment,” Hovey said, placing his fingers loosely around the sides of the ball, so that they barely touched its surface. “So the ball’s sitting here. And it rolls. I attribute that not to the table but to the oldness of the building. The floor’s not level. So I started playing with it, and that’s when I realized: I want it to roll. I don’t want it to be supported by all kinds of ball bearings. I want to just barely touch it.”

The trick was to connect the ball to the rest of the mouse at the two points where there was the least friction—right where his fingertips had been, dead center on either side of the ball. “If it’s right at midpoint, there’s no force causing it to rotate. So it rolls.”

Hovey estimated their consulting fee at thirty-five dollars an hour; the whole project cost perhaps a hundred thousand dollars. “I originally pitched Apple on doing this mostly for royalties, as opposed to a consulting job,” he recalled. “I said, ‘I’m thinking fifty cents apiece,’ because I was thinking that they’d sell fifty thousand, maybe a hundred thousand of them.” He burst out laughing, because of how far off his estimates ended up being. ‘s pretty savvy. He said no. Maybe if I’d asked for a nickel, I would have been fine.”


Here is the first complicating fact about the Jobs visit. In the legend of Xerox PARC, Jobs stole the personal computer from Xerox. But the striking thing about Jobs’s instructions to Hovey is that he didn’t want to reproduce what he saw at PARC. “You know, there were disputes around the number of buttons—three buttons, two buttons, one-button mouse,” Hovey went on. “The mouse at Xerox had three buttons. But we came around to the fact that learning to mouse is a feat in and of itself, and to make it as simple as possible, with just one button, was pretty important.”

So was what Jobs took from Xerox the idea of the mouse? Not quite, because Xerox never owned the idea of the mouse. The PARC researchers got it from the computer scientist Douglas Engelbart, at Stanford Research Institute, fifteen minutes away on the other side of the university campus. Engelbart dreamed up the idea of moving the cursor around the screen with a stand-alone mechanical “animal” back in the mid- nineteen-sixties. His mouse was a bulky, rectangular affair, with what looked like steel roller-skate wheels. If you lined up Engelbart’s mouse, Xerox’s mouse, and Apple’s mouse, you would not see the serial reproduction of an object. You would see the evolution of a concept.

The same is true of the graphical user interface that so captured Jobs’s imagination. Xerox PARC’s innovation had been to replace the traditional computer command line with onscreen icons. But when you clicked on an icon you got a pop-up menu: this was the intermediary between the user’s intention and the computer’s response. Jobs’s software team took the graphical interface a giant step further. It emphasized “direct manipulation.” If you wanted to make a window bigger, you just pulled on its corner and made it bigger; if you wanted to move a window across the screen, you just grabbed it and moved it. The Apple designers also invented the menu bar, the pull-down menu, and the trash can—all features that radically simplified the original Xerox PARC idea.

The difference between direct and indirect manipulation—between three buttons and one button, three hundred dollars and fifteen dollars, and a roller ball supported by ball bearings and a free-rolling ball—is not trivial. It is the difference between something intended for experts, which is what Xerox PARC had in mind, and something that’s appropriate for a mass audience, which is what Apple had in mind. PARC was building a personal computer. Apple wanted to build apopular computer.

In a recent study, “The Culture of Military Innovation,” the military scholar Dima Adamsky makes a similar argument about the so-called Revolution in Military Affairs. R.M.A. refers to the way armies have transformed themselves with the tools of the digital age—such as precision-guided missiles, surveillance drones, and real-time command, control, and communications technologies—and Adamsky begins with the simple observation that it is impossible to determine who invented R.M.A. The first people to imagine how digital technology would transform warfare were a cadre of senior military intellectuals in the Soviet Union, during the nineteen-seventies. The first country to come up with these high-tech systems was the United States. And the first country to use them was Israel, in its 1982 clash with the Syrian Air Force in Lebanon’s Bekaa Valley, a battle commonly referred to as “the Bekaa Valley turkey shoot.” Israel coördinated all the major innovations of R.M.A. in a manner so devastating that it destroyed nineteen surface-to-air batteries and eighty-seven Syrian aircraft while losing only a handful of its own planes.

That’s three revolutions, not one, and Adamsky’s point is that each of these strands is necessarily distinct, drawing on separate skills and circumstances. The Soviets had a strong, centralized military bureaucracy, with a long tradition of theoretical analysis. It made sense that they were the first to understand the military implications of new information systems. But they didn’t do anything with it, because centralized military bureaucracies with strong intellectual traditions aren’t very good at connecting word and deed.

The United States, by contrast, has a decentralized, bottom-up entrepreneurial culture, which has historically had a strong orientation toward technological solutions. The military’s close ties to the country’ high-tech community made it unsurprising that the U.S. would be the first to invent precision-guidance and next-generation command-and-control communications. But those assets also meant that Soviet-style systemic analysis wasn’t going to be a priority. As for the Israelis, their military culture grew out of a background of resource constraint and constant threat. In response, they became brilliantly improvisational and creative. But, as Adamsky points out, a military built around urgent, short-term “fire extinguishing” is not going to be distinguished by reflective theory. No one stole the revolution. Each party viewed the problem from a different perspective, and carved off a different piece of the puzzle.

In the history of the mouse, Engelbart was the Soviet Union. He was the visionary, who saw the mouse before anyone else did. But visionaries are limited by their visions. “Engelbart’s self-defined mission was not to produce a product, or even a prototype; it was an open-ended search for knowledge,” Matthew Hiltzik writes, in “Dealers of Lightning” (1999), his wonderful history of Xerox PARC. “Consequently, no project in his lab ever seemed to come to an end.” Xerox PARC was the United States: it was a place where things got made. “Xerox created this perfect environment,” recalled Bob Metcalfe, who worked there through much of the nineteen-seventies, before leaving to found the networking company 3Com. “There wasn’t any hierarchy. We built out our own tools. When we needed to publish papers, we built a printer. When we needed to edit the papers, we built a computer. When we needed to connect computers, we figured out how to connect them. We had big budgets. Unlike many of our brethren, we didn’t have to teach. We could just research. It was heaven.”

But heaven is not a good place to commercialize a product. “We built a computer and it was a beautiful thing,” Metcalfe went on. “We developed our computer language, our own display, our own language. It was a gold-plated product. But it cost sixteen thousand dollars, and it needed to cost three thousand dollars.” For an actual product, you need threat and constraint—and the improvisation and creativity necessary to turn a gold-plated three-hundred-dollar mouse into something that works on Formica and costs fifteen dollars. Apple was Israel.

Xerox couldn’t have been I.B.M. and Microsoft combined, in other words. “You can be one of the most successful makers of enterprise technology products the world has ever known, but that doesn’t mean your instincts will carry over to the consumer market,” the tech writer Harry McCracken recently wrote. “They’re really different, and few companies have ever been successful in both.” He was talking about the decision by the networking giant Cisco System, this spring, to shut down its Flip camera business, at a cost of many hundreds of millions of dollars. But he could just as easily have been talking about the Xerox of forty years ago, which was one of the most successful makers of enterprise technology the world has ever known. The fair question is whether Xerox, through its research arm in Palo Alto, found a better way to be Xerox—and the answer is that it did, although that story doesn’t get told nearly as often.


One of the people at Xerox PARC when Steve Jobs visited was an optical engineer named Gary Starkweather. He is a solid and irrepressibly cheerful man, with large, practical hands and the engineer’s gift of pretending that what is impossibly difficult is actually pretty easy, once you shave off a bit here, and remember some of your high-school calculus, and realize that the thing that you thought should go in left to right should actually go in right to left. Once, before the palatial Coyote Hill Road building was constructed, a group that Starkweather had to be connected to was moved to another building, across the Foothill Expressway, half a mile away. There was no way to run a cable under the highway. So Starkweather fired a laser through the air between the two buildings, an improvised communications system that meant that, if you were driving down the Foothill Expressway on a foggy night and happened to look up, you might see a mysterious red beam streaking across the sky. When a motorist drove into the median ditch, “we had to turn it down,” Starkweather recalled, with a mischievous smile.

Lasers were Starkweather’s specialty. He started at Xerox’s East Coast research facility in Webster, New York, outside Rochester. Xerox built machines that scanned a printed page of type using a photographic lens, and then printed a duplicate. Starkweather’s idea was to skip the first step—to run a document from a computer directly into a photocopier, by means of a laser, and turn the Xerox machine into a printer. It was a radical idea. The printer, since Gutenberg, had been limited to the function of re-creation: if you wanted to print a specific image or letter, you had to have a physical character or mark corresponding to that image or letter. What Starkweather wanted to do was take the array of bits and bytes, ones and zeros that constitute digital images, and transfer them straight into the guts of a copier. That meant, at least in theory, that he could print anything.

“One morning, I woke up and I thought, Why don’t we just print something out directly?” Starkweather said. “But when I flew that past my boss he thought it was the most brain-dead idea he had ever heard. He basically told me to find something else to do. The feeling was that lasers were too expensive. They didn’t work that well. Nobody wants to do this, computers aren’t powerful enough. And I guess, in my naïveté, I kept thinking, He’s just not right—there’s something about this I really like. It got to be a frustrating situation. He and I came to loggerheads over the thing, about late 1969, early 1970. I was running my experiments in the back room behind a black curtain. I played with them when I could. He threatened to lay off my people if I didn’t stop. I was having to make a decision: do I abandon this, or do I try and go up the ladder with it?”

Then Starkweather heard that Xerox was opening a research center in Palo Alto, three thousand miles away from its New York headquarters. He went to a senior vice-president of Xerox, threatening to leave for I.B.M. if he didn’t get a transfer. In January of 1971, his wish was granted, and, within ten months, he had a prototype up and running.

Starkweather is retired now, and lives in a gated community just north of Orlando, Florida. When we spoke, he was sitting at a picnic table, inside a screened-in porch in his back yard. Behind him, golfers whirred by in carts. He was wearing white chinos and a shiny black short-sleeved shirt, decorated with fluorescent images of vintage hot rods. He had brought out two large plastic bins filled with the artifacts of his research, and he spread the contents on the table: a metal octagonal disk, sketches on lab paper, a black plastic laser housing that served as the innards for one of his printers.

“There was still a tremendous amount of opposition from the Webster group, who saw no future in computer printing,” he went on. “They said, ‘I.B.M. is doing that. Why do we need to do that?’ and so forth. Also, there were two or three competing projects, which I guess I have the luxury of calling ridiculous. One group had fifty people and another had twenty. I had two.” Starkweather picked up a picture of one of his in-house competitors, something called an “optical carriage printer.” It was the size of one of those modular Italian kitchen units that you see advertised in fancy design magazines. “It was an unbelievable device,” he said, with a rueful chuckle. “It had a ten-inch drum, which turned at five thousand r.p.m., like a super washing machine. It had characters printed on its surface. I think they only ever sold ten of them. The problem was that it was spinning so fast that the drum would blow out and the characters would fly off. And there was only this one lady in Troy, New York, who knew how to put the characters on so that they would stay.

“So we finally decided to have what I called a fly-off. There was a full page of text—where some of them were non-serif characters, Helvetica, stuff like that—and then a page of graph paper with grid lines, and pages with pictures and some other complex stuff—and everybody had to print all six pages. Well, once we decided on those six pages, I knew I’d won, because I knew there wasn’t anything I couldn’t print. Are you kidding? If you can translate it into bits, I can print it. Some of these other machines had to go through hoops just to print a curve. A week after the fly-off, they folded those other projects. I was the only game in town.” The project turned into the Xerox 9700, the first high-speed, cut-paper laser printer in the world.


In one sense, the Starkweather story is of a piece with the Steve Jobs visit. It is an example of the imaginative poverty of Xerox management. Starkweather had to hide his laser behind a curtain. He had to fight for his transfer to PARC. He had to endure the indignity of the fly-off, and even then Xerox management remained skeptical. The founder of PARC, Jack Goldman, had to bring in a team from Rochester for a personal demonstration. After that, Starkweather and Goldman had an idea for getting the laser printer to market quickly: graft a laser onto a Xerox copier called the 7000. The 7000 was an older model, and Xerox had lots of 7000s sitting around that had just come off lease. Goldman even had a customer ready: the Lawrence Livermore laboratory was prepared to buy a whole slate of the machines. Xerox said no. Then Starkweather wanted to make what he called a photo-typesetter, which produced camera-ready copy right on your desk. Xerox said no. “I wanted to work on higher-performance scanners,” Starkweather continued. “In other words, what if we print something other than documents? For example, I made a high-resolution scanner and you could print on glass plates.” He rummaged in one of the boxes on the picnic table and came out with a sheet of glass, roughly six inches square, on which a photograph of a child’s face appeared. The same idea, he said, could have been used to make “masks” for the semiconductor industry—the densely patterned screens used to etch the designs on computer chips. “No one would ever follow through, because Xerox said, ‘Now you’re in Intel’s market, what are you doing that for?’ They just could not seem to see that they were in the information business. This”—he lifted up the plate with the little girl’s face on it— “is a copy. It’s just not a copy of an office document.” But he got nowhere. “Xerox had been infested by a bunch of spreadsheet experts who thought you could decide every product based on metrics. Unfortunately, creativity wasn’t on a metric.”

A few days after that afternoon in his back yard, however, Starkweather e-mailed an addendum to his discussion of his experiences at PARC. “Despite all the hassles and risks that happened in getting the laser printer going, in retrospect the journey was that much more exciting,” he wrote. “Often difficulties are just opportunities in disguise.” Perhaps he felt that he had painted too negative a picture of his time at Xerox, or suffered a pang of guilt about what it must have been like to be one of those Xerox executives on the other side of the table. The truth is that Starkweather was a difficult employee. It went hand in hand with what made him such an extraordinary innovator. When his boss told him to quit working on lasers, he continued in secret. He was disruptive and stubborn and independent-minded—and he had a thousand ideas, and sorting out the good ideas from the bad wasn’t always easy. Should Xerox have put out a special order of laser printers for Lawrence Livermore, based on the old 7000 copier? In “Fumbling the Future: How Xerox Invented, Then Ignored, the First Personal Computer” (1988)—a book dedicated to the idea that Xerox was run by the blind—Douglas Smith and Robert Alexander admit that the proposal was hopelessly impractical: “The scanty Livermore proposal could not justify the investment required to start a laser printing business…. How and where would Xerox manufacture the laser printers? Who would sell and service them? Who would buy them and why?” Starkweather, and his compatriots at Xerox PARC, weren’t the source of disciplined strategic insights. They were wild geysers of creative energy.

The psychologist Dean Simonton argues that this fecundity is often at the heart of what distinguishes the truly gifted. The difference between Bach and his forgotten peers isn’t necessarily that he had a better ratio of hits to misses. The difference is that the mediocre might have a dozen ideas, while Bach, in his lifetime, created more than a thousand full-fledged musical compositions. A genius is a genius, Simonton maintains, because he can put together such a staggering number of insights, ideas, theories, random observations, and unexpected connections that he almost inevitably ends up with something great. “Quality,” Simonton writes, is “a probabilistic function of quantity.”

Simonton’s point is that there is nothing neat and efficient “The more successes there are,” he says, “the more failures there are as well” — meaning that the person who had far more ideas than the rest of us will have far more bad ideas than the rest of us, too. This is why managing the creative process is so difficult. The making of the classic Rolling Stones album “Exile on Main Street” was an ordeal, Keith Richards writes in his new memoir, because the band had too many ideas. It had to fight from under an avalanche of mediocrity: “Head in the Toilet Blues,” “Leather Jackets,” “Windmill,” “I Was Just a Country Boy,” “Bent Green Needles,” “Labour Pains,” and “Pommes de Terre”—the last of which Richards explains with the apologetic, “Well, we were in France at the time.”

At one point, Richards quotes a friend, Jim Dickinson, remembering the origins of the song “Brown Sugar”:

I watched Mick write the lyrics. . . . He wrote it down as fast as he could move his hand. I’d never seen anything like it. He had one of those yellow legal pads, and he’d write a verse a page, just write a verse and then turn the page, and when he had three pages filled, they started to cut it. It was amazing.

Richards goes on to marvel, “It’s unbelievable how prolific he was.” Then he writes, “Sometimes you’d wonder how to turn the fucking tap off. The odd times he would come out with so many lyrics, you’re crowding the airwaves, boy.” Richards clearly saw himself as the creative steward of the Rolling Stones (only in a rock-and-roll band, by the way, can someone like Keith Richards perceive himself as the responsible one), and he came to understand that one of the hardest and most crucial parts of his job was to “turn the fucking tap off,” to rein in Mick Jagger’s incredible creative energy.

The more Starkweather talked, the more apparent it became that his entire career had been a version of this problem. Someone was always trying to turn his tap off. But someone had to turn his tap off: the interests of the innovator aren’t perfectly aligned with the interests of the corporation. Starkweather saw ideas on their own merits. Xerox was a multinational corporation, with shareholders, a huge sales force, and a vast corporate customer base, and it needed to consider every new idea within the context of what it already had.

Xerox’s managers didn’t always make the right decisions when they said no to Starkweather. But he got to PARC, didn’t he? And Xerox, to its great credit, had a PARC—a place where, a continent away from the top managers, an engineer could sit and dream, and get every purchase order approved, and fire a laser across the Foothill Expressway if he was so inclined. Yes, he had to pit his laser printer against lesser ideas in the contest. But he won the contest. And, the instant he did, Xerox cancelled the competing projects and gave him the green light.

“I flew out there and gave a presentation to them on what I was looking at,” Starkweather said of his first visit to PARC. “They really liked it, because at the time they were building a personal computer, and they were beside themselves figuring out how they were going to get whatever was on the screen onto a sheet of paper. And when I showed them how I was going to put prints on a sheet of paper it was a marriage made in heaven.” The reason Xerox invented the laser printer, in other words, is that it invented the personal computer. Without the big idea, it would never have seen the value of the small idea. If you consider innovation to be efficient and ideas precious, that is a tragedy: you give the crown jewels away to Steve Jobs, and all you’re left with is a printer. But in the real, messy world of creativity, giving away the thing you don’t really understand for the thing that you do is an inevitable tradeoff.

“When you have a bunch of smart people with a broad enough charter, you will always get something good out of it,” Nathan Myhrvold, formerly a senior executive at Microsoft, argues. “It’s one of the best investments you could possibly make—but only if you chose to value it in terms of successes. If you chose to evaluate it in terms of how many times you failed, or times you could have succeeded and didn’t, then you are bound to be unhappy. Innovation is an unruly thing. There will be some ideas that don’t get caught in your cup. But that’s not what the game is about. The game is what you catch, not what you spill.”

In the nineteen-nineties, Myhrvold created a research laboratory at Microsoft modelled in part on what Xerox had done in Palo Alto in the nineteen-seventies, because he considered PARC a triumph, not a failure. “Xerox did research outside their business model, and when you do that you should not be surprised that you have a hard time dealing with it—any more than if some bright guy at Pfizer wrote a word processor. Good luck to Pfizer getting into the word-processing business. Meanwhile, the thing that they invented that was similar to their own business—a really big machine that spit paper —they made a lot of money on it.” And so they did. Gary Starkweather’s laser printer made billions for Xerox. It paid for every other single project at Xerox PARC, many times over.


In 1988, Starkweather got a call from the head of one of Xerox’s competitors, trying to lure him away. It was someone whom he had met years ago. “The decision was painful,” he said. “I was a year from being a twenty-five-year veteran of the company. I mean, I’d done enough for Xerox that unless I burned the building down they would never fire me. But that wasn’t the issue. It’s about having ideas that are constantly squashed. So I said, ‘Enough of this,’ and I left.”

He had a good many years at his new company, he said. It was an extraordinarily creative place. He was part of decision-making at the highest level. “Every employee from technician to manager was hot for the new, exciting stuff,” he went on. “So, as far as buzz and daily environment, it was far and away the most fun I’ve ever had.” But it wasn’t perfect. “I remember I called in the head marketing guy and I said, ‘I want you to give me all the information you can come up with on when people buy one of our products—what software do they buy, what business are they in—so I can see the model of how people are using the machines.’ He looked at me and said, ‘I have no idea about that.’ ” Where was the rigor? Then Starkweather had a scheme for hooking up a high-resolution display to one of his new company’s computers. “I got it running and brought it into management and said, ‘Why don’t we show this at the tech expo in San Francisco? You’ll be able to rule the world.’ They said, ‘I don’t know. We don’t have room for it.’ It was that sort of thing. It was like me saying I’ve discovered a gold mine and you saying we can’t afford a shovel.”