
Towards an AI-First Architecture: Nanoservices

· Mart van der Jagt

The non-deterministic nature of LLMs

Every previous evolution in software abstraction preserved one property: determinism. Bytecode, programming languages, frameworks: each raised the level at which the programmer thinks; none changed the fundamental contract that the same input produces the same output. Martin Fowler observed that LLMs break this pattern. They do not just raise the abstraction; they shift it sideways into non-determinism.

Non-determinism in the builder also creates a fork, depending on whether you accept it in the output. Accepting it gives you autonomy: agents that operate in ambiguity, where probabilistic behavior is the point. Rejecting it gives you acceleration: a non-deterministic tool harnessed to produce deterministic systems. Most current discourse blurs them, but they require different architectures. This article zooms in on the architecture required for developing deterministic systems, accelerated by AI.

Why the unit must shrink

The question we need to answer is not whether AI can build software, but what is the right unit for AI to build.

Eric Evans argues that non-deterministic generation must be separated from deterministic execution. That separation is a design principle. The constraint that follows is practical: the unit of generation must be small enough to verify completely.

If AI amplifies existing patterns, as the 2025 DORA Report found, and the existing pattern in software is decomposition into microservices, the structural consequence is that services get even smaller. Prigogine received the 1977 Nobel Prize for showing that systems driven far from equilibrium reach bifurcation points: thresholds where the existing structure can no longer hold and the system reorganizes into something qualitatively different. Software construction is approaching such a threshold.

When AI is used not just to speed up the existing process but to change who does the building, that changes the constraints the architecture must satisfy. The flux (demand for autonomous building) is increasing. The constraints (non-determinism, limited context windows, error compounding at large scope) make the microservice too large to verify autonomously. Recent research confirms this: AI performs better in small, bounded codebases, which is also what we see in practice every day. The system bifurcates toward a smaller unit. That unit is the nanoservice.

The nanoservice as anti-pattern

A microservice owns a business capability: a cohesive set of related operations. A nanoservice owns a single operation within that capability. Around fifteen years ago, nanoservices were popularized as an anti-pattern: services whose overhead outweighs their utility through poor performance, fragmented logic, and development and management overhead. The granularity was technically possible but economically irrational.

Three constraints have been shifting. Network latency and bandwidth have improved dramatically since then, reducing the performance penalty that made fine-grained calls irrational. Monitoring and maintenance are increasingly automated; maturing platform engineering capabilities absorb operational overhead across services. And now development overhead is collapsing: AI generates the boilerplate and the conversion layers that would have dominated nanoservice codebases.

When the overhead drops below the utility threshold, the anti-pattern becomes a pattern.

The architecture of permanent absence

A framework for AI autonomy has recently been forming: human in, on, or out of the loop. AI-first covers the last of these, with one addition: it requires designing for permanent absence, not temporary delegation. If you are out of the loop, the system must not need you back in.

A nanoservice owns one responsibility behind a contract, one boundary, one deployment unit. The contract defines what goes in and what comes out; the tests verify that the behavior holds. Implementation is internal; no human needs to inspect it. The boundary also constrains the agent’s scope. Because AI amplifies existing patterns, it amplifies the risk of boundary violations in any larger unit. A nanoservice limits what the agent can reach.
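As a minimal sketch of what such a contract can look like (the stock-reservation domain and all names here are hypothetical, not from the article), typed input and output plus contract tests pin the behavior down while leaving the implementation internal:

```python
from dataclasses import dataclass

# Hypothetical contract for a single-operation nanoservice:
# "reserve stock for an order line".
@dataclass(frozen=True)
class ReserveStockRequest:
    sku: str
    quantity: int

@dataclass(frozen=True)
class ReserveStockResponse:
    reserved: bool
    remaining: int

def reserve_stock(req: ReserveStockRequest, available: int) -> ReserveStockResponse:
    """The behavior the contract pins down; how it is implemented is internal."""
    if req.quantity <= 0 or req.quantity > available:
        return ReserveStockResponse(reserved=False, remaining=available)
    return ReserveStockResponse(reserved=True, remaining=available - req.quantity)

# Contract tests: the only artifact a human needs to read.
assert reserve_stock(ReserveStockRequest("sku-1", 2), available=5) == ReserveStockResponse(True, 3)
assert reserve_stock(ReserveStockRequest("sku-1", 9), available=5).reserved is False
```

The request and response types are the boundary; everything inside `reserve_stock` is the agent’s to own.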

Nanoservice templates make this work at scale. A template provides the scaffolding every nanoservice inherits: project structure, build pipeline, deployment configuration, observability, error handling. The agent operates within that scaffolding and fills in the behavior. Meanwhile, templates enforce consistency across hundreds of services.
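One way to picture such a template, as a hypothetical sketch rather than a real tool: the scaffolding is shared and stamped out identically for every service, and only the behavior slot is left to the agent.

```python
# Hypothetical template: the scaffolding every nanoservice inherits.
TEMPLATE = {
    "structure": ["src/handler.py", "tests/test_contract.py"],
    "pipeline": ["lint", "test", "build", "deploy"],
    "observability": {"logs": "structured", "traces": "enabled"},
}

def scaffold(name: str) -> dict:
    """Stamp out a new service; the agent only fills in src/handler.py."""
    return {"name": name, **TEMPLATE}

# Every service stamped from the template shares the same pipeline.
assert scaffold("reserve-stock")["pipeline"] == scaffold("get-stock")["pipeline"]
```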

Verification replaces review. You specify the contract and the tests; the agent owns everything behind that boundary: implementation, build, deployment. You never read the code. You only verify the behavior. If a nanoservice fails its contract, you do not debug it; you replace it.
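The verify-then-replace loop can be sketched as follows; `verify` and `build_until_verified` are illustrative names for the idea, not an established API. The human supplies the cases; the generated service is treated as a black box:

```python
def verify(service, cases):
    """Black-box check: behavior only, never the code."""
    return all(service(inp) == expected for inp, expected in cases)

def build_until_verified(generate, cases, max_attempts=3):
    """Regenerate until the contract holds; otherwise replace, never debug."""
    for _ in range(max_attempts):
        candidate = generate()           # non-deterministic builder
        if verify(candidate, cases):     # deterministic gate
            return candidate
    raise RuntimeError("contract not satisfied: replace the service")

# Toy usage: the "generated" implementation is just an addition function.
cases = [((2, 3), 5), ((0, 0), 0)]
service = build_until_verified(lambda: (lambda pair: pair[0] + pair[1]), cases)
assert service((1, 1)) == 2
```

The point of the sketch is the asymmetry: the cases are written once by a human; the candidate is disposable.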

Nanoservices architecture (CQRS): each handler, command API, query API, and event publisher is an independently deployable nanoservice.

The role of the software engineer

At the same time, the complexity does not disappear. It moves from inside the layers to between the services. The engineer’s output shifts from writing code to three things: nanoservice specifications, the nanoservices architecture, and nanoservice templates.

At the specification level, the primary artifacts are the behavioral contracts and the tests that enforce them. They are the constraints that make a non-deterministic builder produce deterministic results; the quality of AI’s output is bounded by the precision of the specification it receives. At the architecture level, the engineer designs how services combine as a whole, how data flows between them, how failures propagate, and where the system recovers. At the template level, the architecture template matters more than any individual service, because the template determines whether the system behaves consistently, and templates provide the boilerplate context on which the LLM operates.

AI can assist with each, but the responsibility stays with you: managing the fragmented logic through specifications and architecture. These activities cost time, and the expertise is learned through deliberate cognitive work.

Conclusion

When the pressure on a system exceeds what its structure can absorb, the system reorganizes. The demand for autonomous construction is that pressure. Simultaneously, the overhead that made nanoservices an anti-pattern before has collapsed. The architecture that was too expensive for human builders is precisely right for AI builders: non-deterministic builders requiring small scope.

With this shift, software complexity does not disappear, but it moves. The hours that used to go into writing and debugging implementations now go into designing the specifications, the architecture and the templates. That design work is harder than it was before, because the system is more distributed and needs to be optimized for building by a probabilistic tool. This is where the real engineering effort will live, and it demands deliberate investment.

For deterministic systems built by a non-deterministic builder, the architecture must make autonomy safe. Nanoservices are no longer the anti-pattern they were once described as. They are the architecture that AI-first development requires.

Frequently Asked Questions

Should everyone do AI-first?

No, but if you do, and want to keep your codebase maintainable, then a nanoservices architecture is the way forward.

Can I let AI design the nanoservices architecture?

AI can help, but this is a human-in-the-loop activity. You remain responsible for the artifacts provided to the agent, and those artifacts need to be optimized for implementing the nanoservices.

Are nanoservices preferred over (nano)modular monoliths?

The nanoservices architecture described here is first and foremost a logical architecture, and one that is also compatible with a modular monolith repository. Tradeoffs in favour of modular monoliths, such as context availability, latency, overhead, and cost, all relate to the operational boundary.

Can a nanoservice contain multiple aggregates?

A nanoservice cannot contain multiple aggregates. An aggregate is the largest unit that guarantees transactional consistency without coordination. If a nanoservice spans two aggregates, it must coordinate consistency across their boundaries. That reintroduces the complexity that the decomposition was meant to eliminate, and makes the unit harder to verify in isolation.
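A minimal sketch of the boundary, using a hypothetical `Order` aggregate (the domain and names are illustrative, not from the article):

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    """Hypothetical aggregate: the largest unit that guarantees
    transactional consistency without coordination."""
    order_id: str
    lines: list = field(default_factory=list)

    def add_line(self, sku: str, qty: int) -> None:
        # The invariant is enforced entirely inside the aggregate.
        if qty <= 0:
            raise ValueError("quantity must be positive")
        self.lines.append((sku, qty))

def add_order_line(order: Order, sku: str, qty: int) -> Order:
    """One nanoservice, one operation, one aggregate. Reaching into a
    second aggregate here (say, Customer credit) would reintroduce the
    cross-boundary coordination the decomposition is meant to avoid."""
    order.add_line(sku, qty)
    return order

assert add_order_line(Order("o-1"), "sku-1", 2).lines == [("sku-1", 2)]
```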

How granular should a single nanoservice be?

A domain nanoservice does not extend beyond an aggregate. Some aggregates are small enough to be a single nanoservice. Others decompose into per-operation nanoservices, where each command or query becomes its own unit. The right granularity is where each nanoservice represents one coherent, independently verifiable behavior. Smaller units are easier to verify, but some operations only make sense together.

What’s the difference between nanoservices and microservices?

A microservice owns a business capability: a cohesive set of related operations behind a single boundary. A nanoservice owns a single operation within that capability. Microservices were sized for human teams to own and reason about. Nanoservices are sized for AI agents to build and verify autonomously. The decomposition is driven by the constraint that a non-deterministic builder must work within a scope small enough to verify completely.

What does agentic software development require architecturally?

Agentic software development requires units small enough for an AI agent to own end-to-end: implementation, build, and deployment behind a behavioral contract. The architecture must constrain the agent’s scope to prevent error compounding across boundaries. Nanoservices provide this constraint. Each service has one responsibility, one contract, one deployment unit. The agent operates within scaffolding provided by templates; verification replaces review. If a nanoservice fails its contract, you replace it rather than debug it.

How do CQRS patterns work with nanoservices?

In a CQRS nanoservice architecture, the command and query sides are separated at the service boundary, not just at the code level. Each command handler, query handler, and event publisher becomes an independently deployable nanoservice. This maps naturally to the nanoservice model: each unit owns one operation (a single command or query), has a clear contract, and can be built, tested, and replaced in isolation. CQRS also simplifies the behavioral contracts that AI agents must satisfy, because commands and queries have distinct, well-defined responsibilities.
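A toy sketch of that split, with in-memory stand-ins for the event bus and read model (all names hypothetical): each function below would be an independently deployable nanoservice with its own contract.

```python
# Stand-ins for infrastructure that would sit between the services.
EVENTS = []   # event bus
STATE = {}    # read model

def create_item_command(item_id: str, name: str) -> None:
    """Command nanoservice: one write operation, emits an event."""
    EVENTS.append({"type": "ItemCreated", "id": item_id, "name": name})

def item_projection() -> None:
    """Event-consuming nanoservice: replays events into the read model."""
    for e in EVENTS:
        if e["type"] == "ItemCreated":
            STATE[e["id"]] = e["name"]

def get_item_query(item_id: str):
    """Query nanoservice: one read operation against the read model."""
    return STATE.get(item_id)

create_item_command("i-1", "widget")
item_projection()
assert get_item_query("i-1") == "widget"
```

Each unit can be contract-tested, rebuilt, or replaced in isolation: the command side only needs to emit the right event, the projection only needs to fold events correctly, and the query side only reads.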