Untangling the data center from complexity and human oversight

Our investment thesis at Khosla Ventures is that simplicity through abstraction and automation through autonomic behavior will rule in the enterprise’s “New Stack,” a concept that embraces several industry changes:

  • The move to distributed, open-source-centric, web-era stacks and architectures for new applications. New Stack examples include Apache web stacks, NoSQL engines, and Hadoop/Spark, deployed on open source infrastructure such as Docker and Linux/KVM/OpenStack.

  • The emergence of DevOps (a role that didn’t even exist 10 years ago) and of general “developer velocity” as a priority: giving developers better control of infrastructure and the ability to rapidly build, deploy, and manage services.

  • Cloud-style hardware infrastructure that provides the cost and flexibility advantages of commodity compute pools in both private data centers and public cloud services, giving enterprises the same benefits that Google and Facebook have gained through in-house efforts.

The most profound New Stack efficiency gains will come from radically streamlining developer and operator interactions with the entire application/infrastructure stack, and from embracing new abstractions and automation concepts that hide complexity. The point isn’t to remove humans from IT; it’s to remove them from overseeing areas that are beyond human reasoning, and to simplify their interactions with complex systems.

The operation of today’s enterprise data centers is inefficient and unnecessarily complex because we have standardized on manual oversight. For example, in spite of vendors’ promises of automation, most applications and services today are manually placed on specific machines, as human operators reason across the entire infrastructure and address dynamic constraints like failure events, upgrades, traffic surges, resource contention and service levels.
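
To make concrete the kind of reasoning this pushes onto operators, here is a minimal sketch (in Python, with hypothetical service names, resource figures, and a simple greedy strategy; not any particular vendor’s scheduler) of the placement decision that automated schedulers take over from humans:

    # Minimal sketch: greedy first-fit placement of services onto machines by
    # resource fit. Names and numbers are hypothetical; a real scheduler would
    # also react to failures, upgrades, surges, and service-level constraints.
    from dataclasses import dataclass, field

    @dataclass
    class Machine:
        name: str
        cpu_free: float                      # cores still available
        mem_free: float                      # GB still available
        services: list = field(default_factory=list)

    @dataclass
    class Service:
        name: str
        cpu: float
        mem: float

    def place(services, machines):
        """Place each service on the machine with the most free CPU that fits it."""
        for svc in services:
            candidates = [m for m in machines
                          if m.cpu_free >= svc.cpu and m.mem_free >= svc.mem]
            if not candidates:
                raise RuntimeError(f"no machine can host {svc.name}")
            target = max(candidates, key=lambda m: m.cpu_free)
            target.cpu_free -= svc.cpu
            target.mem_free -= svc.mem
            target.services.append(svc.name)
        return machines

    if __name__ == "__main__":
        cluster = [Machine("node-1", cpu_free=16, mem_free=64),
                   Machine("node-2", cpu_free=8, mem_free=32)]
        workload = [Service("web", cpu=4, mem=8),
                    Service("cache", cpu=2, mem=16),
                    Service("analytics", cpu=10, mem=40)]
        for m in place(workload, cluster):
            print(m.name, m.services)

Even this toy version has to re-evaluate every machine for every service; doing the same by hand, continuously, across thousands of nodes and in the face of failures and traffic surges, is exactly the work that lies beyond human reasoning.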

The best practice in data center optimization for the last 10 years has been to take physical machines and carve them into virtual machines. This made sense when servers were big and applications were small and static: virtual machines let us squeeze a lot of applications onto larger machines. But today’s applications have outgrown servers and now run across multitudes of nodes, on premises or in the cloud. That means more machines and more partitions for humans to reason about as they manage their growing pool of services. The automation that enterprises try to script over this environment amounts to linear acceleration of existing manual processes, and it adds fragility on top of abstractions that are misfits for these new applications and the underlying cloud hardware model. Similarly, typical “cloud orchestration” vendor products increase complexity rather than simplifying management, layering on more management components that themselves need to be managed.