Cloud and the Coming Upheaval
What we are witnessing today with the shift to the cloud is truly unprecedented, and represents the largest threat to the established software players in recent memory.

By Ramu Arunachalam

Few will dispute that cloud computing represents a massive transformation and technology shift. Everyone seems to agree it’s a big deal, but it’s hard to find consensus on the winners and losers from this shift. As a venture investor, I find myself drawn into debates: Will Oracle prevail over newer database architectures? Can VMware stay relevant in the new cloud infrastructure market? Can Cisco continue to dominate the networking market? Are SaaS pioneers going to be disrupted by SaaS v2.0?

Clearly, these are tough questions—the answers to which only become apparent in hindsight. Part of the challenge is that not all innovation is created equal. Some innovations are disruptive, while others only end up reinforcing the status quo. Further, in many instances, incumbent firms do successfully innovate in response to market and competitive pressures.

One useful framework to reason about these changes is to tease out the nature of technological innovation and assess its impact on the incumbent firms. I am going to do just that, and argue that what we are witnessing today with the shift to the cloud is truly unprecedented and represents the largest threat to the established software players in recent memory. In fact, I can’t think of a previous era where there were so many existential threats to the prevailing order.

I’ll make my case with the following four points:

  1. The shift to the cloud represents an unprecedented wholesale architectural shift of the entire software stack.
  2. Distributed computing is at the core of this architectural shift.
  3. This computing paradigm represents a competence-destroying discontinuity for established firms.
  4. This architecture is quickly becoming the norm for all applications (and not just large web-scale services).

Massive and total shift of the entire software stack

If you look inside a modern cloud application (sometimes referred to as a cloud-native app), you will discover:

  • A new networking architecture
  • A new storage architecture
  • A new database architecture
  • A new set of tools for provisioning, systems management and monitoring
  • A new application architecture consisting of a large number of loosely coupled micro-services
  • Applications developed, tested and deployed using a whole new suite of tools and software development processes

It is a dramatic architectural overhaul, but to be clear you can’t characterize it as radical—that is to say it’s not new science or new inventions driving these changes. For instance, datacenter networking is seeing a shift to software-defined networking, but the core transport protocols are still based on the familiar TCP/IP and Ethernet standards. The dominant design is still the packet switched network invented in the 1960s. At the same time, it’s not incremental innovation either, such as going from a 1Gbps Cisco switch to a 10Gbps switch would be. A software-defined network with commodity Linux based white-box switches is clearly more than incremental—the brains of the network have been pulled out of the hardware and have moved into software sitting at the edges of the network.

So what is it? It’s not incremental and it’s not radical. It is innovation involving a new design that changes how individual components interact with each other while preserving the core building blocks. Economists call this architectural innovation.

Before we delve into this, it’s worth mentioning that similar architectural shifts are happening in storage (software-defined storage built from commodity disks using existing SCSI/iSCSI protocols), in databases (Hadoop and NoSQL, largely based on the RDBMS architecture invented in the 1970s), and in applications (micro-services based applications developed using standard programming languages and frameworks that run on a Linux-based operating system).

Architectural innovation is not new and is a topic that has been studied extensively. I’ll point to a seminal paper by Henderson and Clark [“Architectural Innovation: The Reconfiguration of Existing Product Technologies and the Failure of Established Firms”, 1990] that has helped broaden our understanding of architectural innovation and its impact on established firms. In the paper, the authors conclude:

… architectural innovations destroy the usefulness of the architectural knowledge of established firms, and that since architectural knowledge tends to become embedded in the structure and information-processing procedures of established organizations, this destruction is difficult for firms to recognize and hard to correct. Architectural innovation therefore presents established organizations with subtle challenges that may have significant competitive implications.

Xerox is presented as a case study. It was the market leader in plain paper copiers, having practically invented the category. When competitors introduced smaller and more reliable copiers, Xerox was slow to react because the smaller copiers were based on a new product architecture. Even though Xerox had all the technical capabilities and engineering expertise, it took eight years, after numerous failed attempts and missteps, for the company to finally bring a product to market. By then it had lost half its market share.

I’ll summarize by noting:

  • Architectural innovation requires “out-of-the-box” thinking that is hard for established firms to come up with. The core competency of these firms lies in refining the existing architecture, not in overturning it.
  • Architectural innovation often looks deceptively simple and can easily be conflated with incremental/modular innovation. At inception it is often the case that the benefits from the new architecture can be provided by shoehorning changes into an existing framework or architecture.
  • Even when confronted with architectural innovation, established firms have a hard time reacting. The new architecture requires a new skill set that is not easily found inside the organization. Further, building out the new competency is hard inside an organization that is steeped in the old way of doing things.

Hopefully I’ve established both the wholesale architectural shift we are witnessing with the move to the cloud and the disruptive nature of this shift.

This wholesale architectural shift is truly unprecedented. In fact, it’s hard to find another time when so much was changing so quickly. The shift from client/server to the web was profound and revolutionary, but I would argue that it was less disruptive to incumbents than the shift to the cloud is turning out to be.

The following table contrasts the client/server-to-web platform shift with the web-to-cloud platform shift across the enterprise software stack.

Distributed systems are at the core of this architectural shift

The unifying theme across this architectural shift is the widespread use of a computing paradigm called distributed systems.

A distributed system is a collection of autonomous computers that work in concert to get a job done (a job could be any task, such as returning web search results to a user, providing movie recommendations, or executing an e-commerce transaction). Distributed systems represent a radical departure from the traditional software model of a monolithic application that runs on a single computer.

Ask any computer scientist and they will tell you how hard it is to build a distributed system. First, building a distributed system requires careful algorithm design and intricate software engineering to make sure all independent computers stay coordinated and work in lockstep. Second, the only way to build a large distributed system cost-effectively is with cheap commodity hardware components, but cheap commodity hardware is prone to failure and so you now have the additional complexity of building reliable software from unreliable hardware. In a distributed system, you have to assume failure as the norm – computing nodes can fail, network links can fail, messages between computers can get lost or corrupted – and in the midst of all this chaos, the system has to drive consensus among all the communicating entities to provide correct results. For a programmer, it requires shifting to a new frame of reasoning in a strange universe where everything is probabilistic and nothing can be measured with certainty.
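To make "failure as the norm" a little more concrete, here is a minimal Python sketch. It is a toy illustration, not how any of the companies mentioned in this article build their systems. The `Replica` class, its failure probabilities, and the `quorum_read` helper are all hypothetical names invented for this example. The first function shows why a large fleet of individually reliable machines still sees failures constantly; the rest shows one simple way to get a trustworthy answer out of untrustworthy nodes, by accepting only a value that a strict majority of replicas agree on.

```python
import random
from collections import Counter


def prob_any_failure(p_node: float, n_nodes: int) -> float:
    """Chance that at least one of n independent nodes fails,
    given each fails with probability p_node."""
    return 1.0 - (1.0 - p_node) ** n_nodes


class Replica:
    """Hypothetical storage node that may drop a request or answer with stale data."""

    def __init__(self, value, p_drop=0.0, p_stale=0.0, rng=None):
        self.value = value
        self.p_drop = p_drop      # probability the request or reply is lost
        self.p_stale = p_stale    # probability the node answers with old data
        self.rng = rng or random.Random()

    def read(self):
        r = self.rng.random()     # uniform in [0, 1)
        if r < self.p_drop:
            return None           # node down or message lost
        if r < self.p_drop + self.p_stale:
            return "stale"        # wrong answer from an out-of-date node
        return self.value


def quorum_read(replicas):
    """Return the value reported by a strict majority of all replicas, else None."""
    replies = [v for v in (rep.read() for rep in replicas) if v is not None]
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count > len(replicas) // 2 else None


if __name__ == "__main__":
    # Assumed numbers: each node has a 0.1% chance of failing on a given day.
    print(prob_any_failure(0.001, 10_000))
    cluster = [Replica("v2", p_drop=0.1, p_stale=0.1) for _ in range(5)]
    print(quorum_read(cluster))
```

With the assumed figures, a 10,000-node cluster is all but certain to lose at least one node every day, which is why the software, not the hardware, has to carry the burden of reliability. Real systems use far more sophisticated consensus protocols than this majority vote, but the change in mindset is the same.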

If building these systems is hard, so is testing, managing, and maintaining them. It has taken companies like Google, Facebook, and Netflix years of experimentation, both in development and operations, to get to a stable architecture that can handle the growing needs of their business. The innovation has not only been in the form of new software but also in new quality assurance/operational practices and even organizational design – all with the goal of efficiently and seamlessly delivering a reliable, high quality service to their users.

It should be obvious at this point that no sane organization moves to a distributed systems architecture unless there’s a compelling reason to do so. As it turns out, distributed systems are the only way to cost-effectively harness the supercomputing power required to run modern web services that support millions of users and operate on petabytes of data. It’s no exaggeration to say that distributed systems have made Google, Amazon and Facebook possible. It is an architectural innovation that has made these companies economically viable and continues to be a source of their competitive advantage.

Distributed computing: a competence-destroying discontinuity for established firms

If you talk to the engineers at Nicira (a pioneer in software-defined networking) and ask them about what they’re really good at, the answer may surprise you. They will tell you their core competency is in distributed systems and not networking. Nicira took networking and reduced it to a distributed systems problem. Obviously the team needed to know enough networking to be dangerous, but their competitive advantage lay in their expertise in designing and building distributed systems.

Strikingly, if you speak to the engineers at Nutanix (a pioneer in software-defined storage) you will get a similar answer highlighting their novel distributed system architecture as a competitive advantage.

It is almost inconceivable to imagine any of these technologies being developed inside Cisco or EMC. The new architecture would be considered heresy and the organizational antibodies would waste no time in rejecting the new design as flawed and inconsequential. Over the years, both these firms have accumulated an unparalleled expertise in networking and storage respectively. Indeed, you can’t find a better place to learn about classical networking or storage. However, distributed systems are not in their wheelhouse. Even if someone gave them a blueprint to the new architecture, they wouldn’t have the necessary skill sets and knowledge to quickly develop a product from it.

We can see a similar pattern emerging on the server side. VMware has built a very profitable business selling its virtualization software. Its core hypervisor technology has been instrumental in helping enterprises consolidate application workloads and increase efficiency in their datacenters, but its technology was built for traditional monolithic applications. The new application architectures, inspired by the blueprints of the web-scale pioneers, are built as a set of micro-services using containers.

Containerization provides a simpler and more efficient alternative to hypervisors. VMware is thus faced with two challenges. First, its bread-and-butter hypervisor technology is at risk of obsolescence. Second, the new applications are distributed systems that require a new management and system stack to operate. VMware’s current management stack, built to serve the needs of siloed monolithic applications, is ill-equipped to address the needs of these new applications.

The new architecture is quickly becoming the norm for all applications

Given the complexity inherent in distributed systems and the expertise required in building and managing them, it is natural to ask: Will this computing paradigm only be needed by a few large web companies (with millions of users to support)? In other words, are we talking about changes here that affect only a small percentage of the application software ecosystem?

The answer is no. The new architecture is here to stay and represents a fundamental shift in how all software will get developed going forward. Whether you are building a consumer web service for millions of users, or an enterprise application for tens of users, you will be pushed in the direction of a micro-services based distributed system. The reasons for this are twofold. The first is technological. Moore’s law has shifted from delivering faster microprocessors to delivering more processor cores packed into a single chip (referred to as multi-core). As a result, to take advantage of the newer hardware, application developers have to re-architect their software as a distributed system.
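What that re-architecting looks like in its simplest form is sketched below: instead of one loop over all the data, the work is split into independent pieces, each piece is handed to a separate worker, and the partial results are merged at the end. This is only an illustration under stated assumptions. The function names (`split`, `partial_sum_of_squares`, `scatter_gather`) are made up for this example, and threads are used purely to keep the sketch short and self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Partition items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done by one worker, with no dependence on any other chunk."""
    return sum(x * x for x in chunk)


def scatter_gather(items, workers=4):
    """Scatter chunks to workers, then gather and merge the partial results."""
    chunks = split(items, workers)
    # Note: in CPython with the global interpreter lock, threads do not run
    # CPU-bound Python code on several cores at once. ProcessPoolExecutor
    # offers the same map() interface when real multi-core speedup is needed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(scatter_gather(list(range(1_000_000)), workers=8))
```

The point is less the arithmetic than the shape of the program: once the work is expressed as independent chunks plus a merge step, it no longer matters much whether the workers are cores on one chip or machines in a datacenter.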

Second, in the era of big data, enterprise applications are increasingly not just about automating a set of workflows (payroll, resource planning, HR, customer relationship management, etc.), but also about using data to uncover insights and drive smarter business decisions. We have already seen this on the consumer side. Imagine Netflix or Amazon without their recommendation engines, unable to draw inferences from your past purchases and from what others like you purchased in order to suggest new items.

A similar transformation is happening in the enterprise. For example, the CRM system has gone from managing your customer contacts to a system that can now tell you who your best customers are and who is likely to churn from your service.

The finance system is moving from just bookkeeping and reporting the numbers to providing benchmarking and recommendations on ways to increase efficiency and optimize resources.

If there is anything we have learned from web-scale companies, it is that these data-driven services require a new database architecture that can process large amounts of data and run complex analytic models. If you end up changing the database architecture, you will need to re-architect the storage and networking as well to accommodate the heavy data processing needs of the application. As you work your way through all the ripple effects, you’ll end up with a completely new architecture from top to bottom that looks a whole lot like the architecture pioneered by existing web-scale companies.

The interesting question is: Can the existing application software vendors successfully adapt to these changes? I think it is going to be very challenging for them. For one, unlike in the past, when new features could easily be accommodated within the existing architecture, we are talking about adding product capabilities that will force an architectural rethink on a legacy software code base, and for all the reasons we discussed previously, that will be hard if not impossible to do.

The disruption of on-premise software by software-as-a-service (SaaS) provides an interesting historical parallel. SaaS was a business model innovation but also an architectural innovation. Unlike on-premise software companies that had to manage multiple software versions and customizations across their customer base, SaaS was about using one version of the software hosted in the cloud to serve all customers.

The simplicity that came from developing and maintaining one code base allowed SaaS companies to innovate on features much faster than their on-premise counterparts. To compete, the on-premise vendors would have had to change not only their architecture but also all of their software development and QA processes to match the speed and agility of SaaS. We all know how that story ended, but if SaaS was disruptive, I think the new architecture we are seeing with the cloud is at least an order of magnitude more disruptive. The changes are more fundamental and run deeper through the entire stack than we ever saw with SaaS.

Conclusion

The shift to the cloud is shaping up to be unlike any previous platform shift. New applications are driving a fundamentally new architecture powered by a distributed computing paradigm. The changes are profound and pervasive across the entire software and hardware stack. These architectural innovations are going to make life very hard for the incumbents while providing plenty of opportunities for startups to differentiate. It’s shaping up to be an exciting time, and I for one can’t think of a better time to be investing in startups!