Cloud-First is Over: The Rise of Compute Everywhere

You can see this kind of edge/cloud split discussed in live video analytics work, including Microsoft Research’s Rocket project.

Devices now decide

Across industrial and retail environments, devices that once forwarded raw measurements now filter, classify and act locally. Central systems still matter for aggregation, long-term analysis and retraining — but they’re no longer in the critical path for every decision the system makes.

Operational complexity

Here’s where the “compute everywhere” pitch gets fuzzy. The tooling evolved fast. Operating it is slower, harder work.

Deployment isn’t continuous anymore

Cloud deployments assume constant connectivity. Edge devices cannot rely on it. Some synchronize once a day. Others disappear for weeks.

Updating software or models turns into a logistics problem: staged rollouts, health checks and the ability to stop or roll back when things go wrong. Those patterns show up explicitly in job-based fleet update mechanisms — for example, AWS IoT jobs.
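As a rough illustration of the staged-rollout pattern described above, here is a minimal Python sketch. It is not the AWS IoT jobs API or any vendor's mechanism. The names `staged_rollout`, `apply_update`, `is_healthy`, `rollback` and the `max_failure_rate` threshold are all hypothetical, and a real fleet would also have to cope with devices that are offline when their stage begins.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RolloutResult:
    updated: list[str]      # devices left on the new version
    rolled_back: list[str]  # devices returned to the old version
    aborted: bool           # True if the rollout stopped early


def staged_rollout(
    devices: Sequence[str],
    stage_fractions: Sequence[float],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_rate: float = 0.1,
) -> RolloutResult:
    """Update the fleet in growing stages (hypothetical sketch).

    stage_fractions are cumulative, e.g. (0.1, 0.5, 1.0) means
    10% of the fleet, then up to 50%, then everyone.
    """
    updated: list[str] = []
    rolled_back: list[str] = []
    start = 0
    for fraction in stage_fractions:
        end = max(start, min(len(devices), round(len(devices) * fraction)))
        stage = list(devices[start:end])
        start = end
        if not stage:
            continue

        for device in stage:
            apply_update(device)

        # Health check after the whole stage has been updated.
        failures = [d for d in stage if not is_healthy(d)]
        if len(failures) / len(stage) > max_failure_rate:
            # Too many failures: undo this stage and stop the rollout.
            for device in stage:
                rollback(device)
            rolled_back.extend(stage)
            return RolloutResult(updated, rolled_back, aborted=True)

        # Acceptable failure rate: roll back only the unhealthy devices.
        for device in failures:
            rollback(device)
        rolled_back.extend(failures)
        updated.extend(d for d in stage if d not in failures)

    return RolloutResult(updated, rolled_back, aborted=False)
```

The point of the sketch is the shape of the control flow: each stage is a checkpoint where the rollout can stop before a bad update reaches the rest of the fleet.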

Partial failures are normal

In fleets of thousands of devices, something is always broken. Power issues, network partitions, hardware variation and firmware bugs create a steady state of partial failure.

Observability is harder, too. A silent device might be offline — or dead. Distinguishing between the two requires explicit design, often based on heartbeats and deadlines rather than continuous metrics.
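One way to make the offline-versus-dead distinction explicit is to compare a device's silence against its own expected sync interval. The Python sketch below assumes each device has a known `sync_interval`. The `DeviceStatus` names, the `grace_factor` multiplier and the five-minute `online_window` are illustrative assumptions, not values from any particular platform.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class DeviceStatus(Enum):
    ONLINE = "online"                      # heard from very recently
    EXPECTED_OFFLINE = "expected_offline"  # silent, but within its deadline
    OVERDUE = "overdue"                    # missed its deadline; investigate


def classify_device(
    last_heartbeat: datetime,
    sync_interval: timedelta,
    now: datetime,
    grace_factor: float = 2.0,
    online_window: timedelta = timedelta(minutes=5),
) -> DeviceStatus:
    """Classify a device by how long it has been silent (hypothetical sketch).

    A device that syncs daily is not alarming after 30 hours of silence,
    while the same silence from a device that reports every minute is.
    """
    silence = now - last_heartbeat
    if silence <= online_window:
        return DeviceStatus.ONLINE
    # The deadline scales with the device's own expected sync interval.
    if silence <= sync_interval * grace_factor:
        return DeviceStatus.EXPECTED_OFFLINE
    return DeviceStatus.OVERDUE


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    status = classify_device(
        last_heartbeat=now - timedelta(hours=30),
        sync_interval=timedelta(days=1),
        now=now,
    )
    print(status.value)
```

A status of overdue still does not prove a device is dead. It only marks the point at which silence stops being normal for that device and someone should look.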

Fleet diversity

Over time, edge fleets drift. Hardware revisions, firmware versions and configuration exceptions accumulate. A model that works on most devices fails on a minority due to subtle differences no one documented.

Maintaining homogeneity becomes an operational necessity, not an aesthetic preference.
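Drift is easier to manage once it is measured. As a small sketch, assuming the fleet inventory records a hardware revision and a firmware version per device, the Python below counts distinct variants and lists the ones that make up only a small share of the fleet. The `DeviceInfo` fields and the 10% threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class DeviceInfo(NamedTuple):
    device_id: str
    hardware_rev: str
    firmware: str


def variant_counts(fleet: Iterable[DeviceInfo]) -> Counter:
    """Count devices per (hardware_rev, firmware) combination."""
    return Counter((d.hardware_rev, d.firmware) for d in fleet)


def minority_variants(
    fleet: Iterable[DeviceInfo], threshold: float = 0.1
) -> list[tuple[str, str]]:
    """Return variants that make up less than `threshold` of the fleet.

    These are the combinations most likely to be missed in testing.
    """
    counts = variant_counts(fleet)
    total = sum(counts.values())
    return sorted(v for v, n in counts.items() if n / total < threshold)


if __name__ == "__main__":
    fleet = [DeviceInfo(f"dev-{i}", "revB", "2.1.0") for i in range(19)]
    fleet.append(DeviceInfo("dev-19", "revA", "1.9.3"))
    for variant in minority_variants(fleet):
        print(variant)
```

Running a report like this before a model rollout shows which device combinations need a dedicated test, or a firmware update, before the fleet can be treated as uniform.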

How teams actually decide what runs where

The teams that navigate this transition well don’t start with an “edge strategy.” They start by asking uncomfortable questions about their workload.

  • Where does the data originate, and what does it cost to move? Data gravity usually matters more than latency. If data is generated at the edge, shipping models outward is often cheaper and simpler than pulling raw data back to the cloud.
  • What constraints are non-negotiable? Physics sets latency floors. Regulations restrict data movement. Power and connectivity shape what you can assume about availability. When one of these forces compute outward, it’s better to accept it early than fight it later.
  • What are you actually optimizing for? I’ve seen teams push inference to the edge in the name of “latency” when their application could tolerate hundreds of milliseconds. The result was a large increase in operational complexity with no user-visible benefit. Measure what actually matters before you distribute anything.
  • Can you operate it? This is the question teams skip. Running edge infrastructure requires skills many cloud-native organizations don’t have: embedded systems experience, fleet management and tolerance for intermittent connectivity. If you can’t reliably update devices or reason about partial failures, keeping workloads centralized is often the safer choice.
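The data-gravity question in the first bullet can be made concrete with back-of-the-envelope arithmetic. The Python sketch below compares the daily network volume of shipping raw data to the cloud against pushing a model outward and sending back only results. All function names, parameters and example figures are hypothetical, and the comparison ignores egress pricing, compression and retries.

```python
def centralized_bytes_per_day(devices: int, raw_bytes_per_device: int) -> float:
    """Daily volume if every device uploads its raw data."""
    return float(devices * raw_bytes_per_device)


def edge_bytes_per_day(
    devices: int,
    model_bytes: int,
    model_update_interval_days: int,
    result_bytes_per_device: int,
) -> float:
    """Daily volume if models are pushed out and only results come back.

    The model download is amortized over the days between updates.
    """
    per_device = model_bytes / model_update_interval_days + result_bytes_per_device
    return devices * per_device


if __name__ == "__main__":
    # Illustrative numbers only: 1,000 devices, 2 GB of raw data per device
    # per day, a 50 MB model refreshed every 10 days, 1 MB of results per day.
    central = centralized_bytes_per_day(1_000, 2_000_000_000)
    edge = edge_bytes_per_day(1_000, 50_000_000, 10, 1_000_000)
    print(f"centralized: {central / 1e9:.0f} GB/day")
    print(f"edge:        {edge / 1e9:.0f} GB/day")
```

With real measurements in place of these placeholders, the same arithmetic can just as easily favor keeping the workload centralized, which is why it is worth doing before committing to either placement.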

The new default

Compute everywhere isn’t a new layer you bolt onto an existing architecture. It’s a change in what teams assume by default.

The cloud didn’t become irrelevant. It stopped being the reflexive answer to every placement question.

Organizations that navigate this well don’t frame the problem as edge versus cloud. They treat the device-to-cloud continuum as a design space and make explicit choices within it. Inference runs close to where data is generated. Training and coordination stay centralized, where aggregation pays off. Analytics lives where global visibility actually adds value.

What surprised me wasn’t that teams moved compute out of the cloud. It was how rarely they did it because they wanted to — and how often they did it because they had to.

This article is published as part of the Foundry Expert Contributor Network.