The computing industry is at a crossroads. Moore’s law, the engine that powered the computing revolution, has finally run out of steam.
Against this pessimistic backdrop, we are entering a period of great optimism as machine learning systems replace algorithms and human expertise. We are only scratching the surface of what machine learning can do, and the possibilities seem endless.
How do we make progress on the exponential trajectories we have grown accustomed to when the underlying computing hardware performance has flatlined? We have to start by breaking out of a computing model that dates back to 1945 — reassessing core assumptions behind the internal organization of a computer and rethinking the fundamental representations of code inside the machine, as algorithms are replaced by machine learning models.
Gordon Moore, who would go on to co-found Intel, observed in 1965 that integrated circuits were getting cheaper, faster, and smaller at an exponential rate (roughly doubling every 18 to 24 months). This phenomenon created the modern microprocessor (a computer on a single integrated circuit chip), which very quickly became powerful enough to displace its larger rivals: first minicomputers, then mainframes, and eventually supercomputers.
On the back of Moore’s law, we came to think of computing hardware as a commodity. Almost like clockwork, the hardware got exponentially better, and with it the software.
No exponential force can last forever and today we have reached the limits of scaling Moore’s law. It was never a law but an observation that held up surprisingly well for 50 years, eventually succumbing to real physical laws that cannot be rewritten.
Moore’s law today looks like just another technology S-curve that has plateaued. The era of automatic improvements is over. We will have to find new creative ways to keep the progress going.
One of the legacies of Moore’s law is that it created a dominant model for computing based on a general-purpose architecture first proposed by John von Neumann in 1945. Modern machines today are still based on the von Neumann architecture — a remarkable and enduring feat for an industry that is used to so much change and disruption.
There were many attempts to break out of the von Neumann mold. Cray Research pioneered vector-processing supercomputers well suited to numerically intensive applications such as weather forecasting and scientific simulation. Thinking Machines Corporation, a startup that spun out of MIT, came to market with the Connection Machine, a computer based on a radical massively parallel architecture loosely modeled after the human brain. There were other notable attempts around dataflow architectures that never made it past academic projects.
All these ideas failed because of Moore’s law. Moore’s law put conventional microprocessors on a trajectory of exponential growth making it impossible for alternative designs to compete on equal footing. Even the most promising new designs quickly found their advantage erased as microprocessors got better at a faster rate. Great new technical designs were no match for the economics of mass production and commoditization.
By the mid-1990s, even the world’s fastest supercomputers were assembled from commodity microprocessors. This came to be known as “the attack of the killer micros” (a phrase from a talk by Eugene Brooks of Lawrence Livermore National Laboratory).
Eventually, everyone just gave up, and a whole generation of engineers came to believe in the infallibility of Moore’s law. Computer architects shifted their focus to micro-architecture optimizations, further reinforcing the status quo. Exploration of new ideas gave way to the exploitation of Moore’s law. No company exemplified this mindset more than Intel, with the entire company on a mission to keep the Moore’s law gravy train running smoothly and on time.
The real tragedy of Moore’s law is that it bred a homogeneity in design, promoting inside-the-box thinking and rewarding good engineering more than great design.
With Moore’s law now history, we are back to exploration. How do we continue to make progress building ever faster and cheaper machines? A good starting point would be to study the failed computing designs of the past (of which there are many). Like the Renaissance artists who rediscovered the classics, the next generation of computer architects and programmers will have to draw inspiration from old ideas.
Nowhere is the need for a new architecture more apparent than in machine learning where the growing computational needs are exposing the limits of current microprocessors that aren’t getting any faster.
Machine learning is increasingly replacing traditional algorithms. Companies like Google are well on their way to remaking themselves into machine learning factories, systematically transforming human-crafted code into machine learning based models. Jeff Dean, a highly respected early Google engineer, has gone as far as to say that if Google were built today, “much of it would not be coded but learned.”
As machine learning techniques get better, we may soon reach a point where we trust a machine learning based system more than human-crafted software. A few among us are already there!
Recent academic work by Tim Kraska and colleagues at MIT shows that conventional data structures (B-trees, hash maps, etc.) can be rewritten as deep learning based models. These “learned” data structures can be as much as 70% faster and up to 10x more space efficient than their traditional counterparts. This is an exciting area of research and provides a glimpse of things to come. It is entirely possible that in the near future core parts of a database or an operating system will be redesigned as machine learning systems.
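As a rough illustration of the idea (a toy sketch, not the implementation from the paper, and all names below are illustrative): a learned index over sorted keys replaces a tree traversal with a model that predicts a key’s position, plus a short bounded search that corrects the prediction so lookups stay exact.

```python
# A toy "learned index": a linear model predicts a key's position in a
# sorted array, and a short local search corrects the prediction.
import bisect

def fit_linear(keys):
    """Least-squares fit of position ~ a*key + b over the sorted keys."""
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    cov = sum((x - mean_x) * (y - mean_y) for y, x in enumerate(keys))
    var = sum((x - mean_x) ** 2 for x in keys)
    a = cov / var
    b = mean_y - a * mean_x
    # Record the worst prediction error so lookups stay exact;
    # the extra +1 guards against float truncation on unseen keys.
    err = max(abs(int(a * x + b) - y) for y, x in enumerate(keys)) + 1
    return a, b, err

def lookup(keys, model, key):
    """Return the index of key in keys, or -1 if absent."""
    a, b, err = model
    guess = int(a * key + b)
    lo = max(0, guess - err)
    hi = min(len(keys), guess + err + 1)
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else -1

keys = sorted(range(0, 1000, 7))   # a near-linear key set: easy to learn
model = fit_linear(keys)
assert lookup(keys, model, 49) == keys.index(49)
assert lookup(keys, model, 50) == -1
```

The published work uses hierarchies of models rather than a single linear fit; this stand-in just shows the shape of the idea, and why a well-predicted position can beat walking a tree.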
All this is challenging the traditional notion of software as code. Increasingly software may better be described as a set of machine learning models. Machine learning changes application execution into a numerical problem, the “algorithm” itself represented as a set of numerical weights. The act of learning is essentially a numerical optimization problem where the weights are adjusted based on new data; inference (deciding and acting) boils down to simple matrix multiplication.
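A minimal sketch of software-as-weights, with illustrative names and sizes: the entire “program” is the numeric arrays, inference is a couple of matrix multiplies, and learning is a gradient step that nudges the weights toward the data.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4)) * 0.5   # the "code": just arrays of numbers
W2 = rng.normal(size=(4, 1)) * 0.5

def infer(x):
    # Inference boils down to matrix multiplies plus a nonlinearity.
    h = np.maximum(x @ W1, 0.0)      # ReLU hidden layer
    return h @ W2

def learn_step(x, target, lr=0.01):
    # Learning: a numerical optimization step that adjusts the weights
    # to shrink the squared error on the data.
    global W1, W2
    h = np.maximum(x @ W1, 0.0)
    y = h @ W2
    grad_y = 2 * (y - target)
    grad_W2 = h.T @ grad_y
    grad_h = grad_y @ W2.T
    grad_h[h <= 0] = 0.0             # gradient does not flow through dead ReLUs
    grad_W1 = x.T @ grad_h
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
    return float(((y - target) ** 2).mean())

x = rng.normal(size=(8, 3))
target = rng.normal(size=(8, 1))
losses = [learn_step(x, target) for _ in range(200)]
# The error falls as the weights adapt to the data.
```

Nothing here resembles traditional control-flow-heavy code: the behavior of the “program” lives entirely in W1 and W2, and changing the program means changing numbers, not logic.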
If traditional software needs general-purpose microprocessors, machine learning systems need fast calculators that can multiply and add large sets of numbers in parallel. This key observation has led Google to build custom hardware for machine learning that it calls the Tensor Processing Unit (TPU). The TPU looks nothing like a general-purpose microprocessor; it looks more like a limited but super fast calculating engine. (Ironically, the earliest electronic computers were also specialized calculators: another example of history coming full circle.)
The Google TPU is interesting on many levels.
Today the TPU looks more like an accelerator to the microprocessor, with most of the application processing still executing inside the microprocessor. If machine learning does eat software, it is conceivable that this model gets flipped: the TPU (or something like it) at the center of the action, with most of the application running inside it as models, while the microprocessor is relegated to the role of a helper.
What will computing hardware look like in a post-Moore’s law world?
It was said that the great computer architect Seymour Cray began every new computer design on a blank sheet of paper: each new design unique, pulled together from first principles, unencumbered by the mistakes and legacies of the past. By the time he passed away in 1996, his approach seemed quaint and impractical, with the world firmly in the spell of Moore’s law and a procession of Intel microprocessors, every new generation an iterative improvement over the previous one. Today, 42 years after the first Cray computer came to market, we are entering a new phase of computing design that looks a lot like the early days of computing. It’s a new beginning, and we could use a Seymour Cray in our midst.