
Context Engineering: Should You Bother?

February 18th 2026 · Mart van der Jagt

AI coding assistants now offer multiple ways to customize how the LLM behaves. Cursor calls them Rules, Commands, Skills, and Subagents. Other tools have similar concepts: Claude Code has CLAUDE.md and slash commands, Windsurf has rules files, GitHub Copilot has custom instructions. The names differ, but the patterns are the same.

There’s no shortage of articles explaining what each feature does and when to use which. Agent Skills vs. Rules vs. Commands is an excellent example if you are interested. But there is a question nobody is asking: should you invest time in any of them?

In The Future of Context Engineering, we mapped LLM limitations to the three dimensions that drive context engineering: context window constraints, reasoning gaps, and memory absence. We traced how each is being addressed through scaling and architectural innovation. This article applies that framework to current practices: which are worth investing in, and which are workarounds with expiration dates?

This article uses Cursor’s terminology (it’s our go-to IDE), but the analysis applies to any AI coding tool built on the same LLM foundations.

One Standard, Four Implementations

To make this concrete, let’s follow a single example through all four features.

Your company runs on Azure with strict authentication standards: Managed Identity for services within Azure, User-Assigned when the identity needs to outlive the lifecycle of a single resource, and all secrets retrieved from Key Vault. Never hardcode credentials, never use connection strings with embedded keys.
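
In code, that standard looks roughly like this. A minimal Python sketch using the Azure SDK; the vault name and secret name are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# DefaultAzureCredential resolves to Managed Identity when running in Azure
# and to your developer login locally, so no keys end up in config files.
credential = DefaultAzureCredential()

client = SecretClient(
    vault_url="https://<your-vault-name>.vault.azure.net",  # placeholder
    credential=credential,
)
database_password = client.get_secret("database-password").value  # placeholder name
```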

This is exactly the kind of institutional knowledge you’d want the LLM to respect. And you could implement it with any of the four features:

As a Rule: An always-on instruction that applies to every request. The LLM knows your auth patterns whether you’re writing a new service or modifying an existing one.

As a Command: An /azure-auth prompt template you invoke when working on authentication code. Explicit, on-demand.

As a Skill: A packaged capability with auth pattern instructions, reference docs, and maybe scripts to scaffold Key Vault integration. Loaded when the agent detects you’re working on auth-related code.

As a Subagent: A “security reviewer” that runs in its own context, analyzes your authentication implementation, and reports back.

The same requirement can have four different implementations. The question is: which one is worth building?

The Landscape of Context Engineering

Many practices can be considered context engineering: starting fresh conversations to clear irrelevant context, connecting MCPs for external capabilities, IDE-driven retrieval (RAG, codebase indexing), and established engineering practices (linting, CI). These are valuable and their investment case is clear. Rules, Commands, Skills, and Subagents, however, are artifacts you need to author. They require more craftsmanship, which brings more complex tradeoffs around how long they will endure.

As argued in The Future of Context Engineering, context engineering exists to address specific LLM limitations. That article establishes five fundamental limitations across three dimensions: context window (finite space, imperfect attention), reasoning (essential vs. accidental complexity, confirmation bias), and memory (no continuous, incremental learning between sessions). Context engineering compensates for these limitations:

| LLM limitation | Dimension | Compensating practices |
| --- | --- | --- |
| Finite space | Context window | Skills, Subagents, Rules |
| Imperfect attention | Context window | Subagents |
| Essential vs. accidental complexity | Reasoning | Rules (make unstated intent explicit) |
| Confirmation bias | Reasoning | Subagents (notably thin coverage) |
| No incremental learning | Memory | Rules, Skills |

LLM limitations mapped to compensating context engineering practices, grouped by dimension.

Not all context engineering is the same activity. Two fundamentally different kinds of work hide under the same label.

Tooling-side context engineering is what the IDE does on your behalf: positioning instructions within the context window, compressing the context window, running tool calls in parallel, and, in Cursor’s case, indexing your codebase and running semantic search to surface relevant code. This work is executed not by you but by the tool, and every release automates another task that early adopters used to do by hand.

Human-side context engineering is what you do: writing rules, defining commands, packaging skills, structuring agent delegation. This is the part where the investment question matters. And it splits further into compensatory practices (working around LLM limitations) and intentional practices (expressing workflow needs that exist regardless of model capability).

The next section maps each practice to where it’s heading: what stays with engineers, what the IDE absorbs, and what the LLM will handle as models improve.

The Four Features: Where to Invest

Rules: Keep Essential, Skip Elaborate

Rules address knowledge gaps. You can write instructions once and apply them everywhere. But “everywhere” means they consume tokens everywhere — including when you’re editing CSS or writing documentation. And their position in the middle of context puts them in the attention degradation zone where the model attends least reliably.

As models improve, this pressure eases from two directions. Better attention mechanisms make rules more reliable. Better reasoning makes them less necessary: a capable model, given access to your codebase, can infer your authentication patterns from existing code.

But there’s a distinction that matters more for investment decisions than model capability: some knowledge lives in the code, and some lives behind it. Key Vault usage is a pattern the model can infer from the code. But if the pattern doesn’t exist in the code yet, the model has to rely on your policies to take the right turn.
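
For example, a provisioning policy like the one below cannot be read from code that doesn’t exist yet. A sketch in Cursor’s rule format; the specifics are hypothetical:

```
---
description: Identity provisioning policy for new services
alwaysApply: true
---

- New services use the user-assigned identity provisioned by the platform team, one per environment; do not create system-assigned identities ad hoc.
- Request Key Vault access through the platform repository, not the Azure portal.
```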

Recommendation: Keep rules for constraints that aren’t expressed in your code: policies, requirements, standards the model can’t infer. Skip elaborate rule libraries that encode patterns it can already read.

Commands: Named Intent, Not Static Replay

Commands solve a different problem entirely: human intent, not LLM limitation. They don’t show up in the table of compensating practices above, because commands operate outside the LLM’s scope. When you type /azure-auth, you’re expressing explicit intent to trigger a specific workflow.

Shell aliases and IDE macros already provide named, repeatable workflows, which arguably narrows the niche for commands. What AI commands add is contextual generation: /azure-auth doesn’t paste a template; it generates code adapted to the specific service you’re building, using your project’s existing patterns as context. A macro replays; a command reasons. That’s the difference between static output and adaptive behavior driven by a named intent.
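
As a sketch, the /azure-auth command could be little more than a named prompt template (assuming your tool stores commands as markdown files, e.g. .cursor/commands/azure-auth.md; check your tool’s docs for the exact location and format):

```markdown
Add authentication to the service I’m currently working on, following our standards:

1. Inspect the service’s configuration and the Azure resources it talks to.
2. Use Managed Identity via DefaultAzureCredential; never introduce connection strings.
3. Move any existing secrets to Key Vault references and list the secrets that still need to be created.
4. Summarize what changed and what needs manual follow-up (role assignments, identity provisioning).
```

The template stays static; what the model generates from it adapts to the service in front of it.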

This extends beyond individual convenience. Teams benefit from shared named operations that every developer invokes the same way. Processes need to be auditable and reproducible. These are organizational needs that exist independently of model capability.

Recommendation: Invest in commands for workflows you repeat. They’re the most model-agnostic feature, useful regardless of how capable the underlying LLM becomes.

Skills: Capabilities Over Instructions

Skills solve two problems: token efficiency (content enters context only when relevant) and capability packaging (bundling instructions with executable scripts). As context windows grow and tooling-side retrieval improves, the pressure to manually scope what enters context decreases. But capability packaging is different. A skill that gives the model something to execute, not just reason about, persists regardless of model intelligence.

Skills that package capabilities (executable scripts, multi-step procedures, reusable scaffolding) are most likely to endure.
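
A sketch of what such a packaged capability might include: alongside its instructions, the skill ships a script the agent can execute rather than reason about. The layout (a SKILL.md plus supporting scripts) follows the Agent Skills convention; the script below is illustrative:

```python
#!/usr/bin/env python3
"""Flag embedded credentials in Azure config files.

Shipped with an azure-auth skill so the agent can run the check
instead of reasoning about it. Patterns and paths are illustrative.
"""
import re
import sys
from pathlib import Path

# Settings files worth scanning, and patterns that violate the standard.
CONFIG_FILES = {"local.settings.json", "appsettings.json"}
INSECURE = [
    re.compile(r"AccountKey=", re.IGNORECASE),
    re.compile(r"SharedAccessSignature=", re.IGNORECASE),
    re.compile(r"ConnectionString", re.IGNORECASE),
]


def scan(root: Path) -> list[str]:
    findings = []
    for path in root.rglob("*.json"):
        if path.name not in CONFIG_FILES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        findings += [
            f"{path}: matches '{p.pattern}'" for p in INSECURE if p.search(text)
        ]
    return findings


if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    issues = scan(root)
    print("\n".join(issues) or "No embedded credentials found.")
    sys.exit(1 if issues else 0)
```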

Recommendation: Build skills for genuinely reusable capabilities involving scripts or complex procedures that are unique to your domain. If your “skill” would just be text, use a command or a rule instead.

Subagents: The IDE’s Job, Not Yours

Subagents solve context pollution: spawning a sub-agent creates a fresh context window within the agent session and it limits the output that goes back into the orchestrator agent. They also enable parallel execution and specialization, running five agents simultaneously on different subtasks, or having a focused “explore agent” analyzing codebases. These are benefits that map to the analysis of cognitive offloading: the brain never solved confirmation bias internally. External corrective structures are the enduring pattern.

When building agentic products, subagent architecture is core design work. You should view subagents as tools in their own right, with powerful reasoning capabilities of their own; that is how you would set things up when building your own agent. This article, however, is about AI-assisted software engineering, and in that context subagent design is mostly tooling-side context engineering. The use case for hand-crafted subagents is thinner there, because most of them cover cross-cutting concerns (analyzing codebases, running shell commands, browser testing), and covering those belongs to the IDE.

Recommendation: Although powerful, don’t invest actively in crafting subagents yourself. Express your domain knowledge through rules, commands, and skills. Let the IDE decide how to distribute that knowledge across agents.

The Code As Context

Rules, commands, skills, and subagents are all mechanisms to manage context on top of your code. But it is also worth emphasizing that the code itself is the primary context. The model reads your codebase before it reads your rules. Consistent, up-to-date patterns give the model clean signals to build on. Inconsistent code gives it contradictory signals that no amount of rules can fully compensate for.

Here’s an example that served as inspiration for this article. We needed to add a Table Storage connection to an existing Azure Function. The codebase followed our authentication standards, but not consistently, and the LLM added a ConnectionString parameter to local.settings.json. We added the rule below to close the gap:

---
description: Enforce secure authentication standards across all Azure services
alwaysApply: true
---

**Never** use connection strings, embedded keys, or hardcoded credentials.
- Use **Azure Managed Identity** for all service authentication.
  - System-assigned for resources within the same resource group.
  - User-assigned for external Azure resources.
- Store all secrets in **Azure Key Vault**.
  - Reference format: `@Microsoft.KeyVault(VaultName=<name>;SecretName=<name>)`
  - Local development references tst resources unless otherwise stated.
- Use `DefaultAzureCredential` exclusively.
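
For concreteness, the difference in generated code amounts to something like this. A Python sketch; the storage account and table name are illustrative:

```python
from azure.data.tables import TableServiceClient
from azure.identity import DefaultAzureCredential

# What the model defaulted to: a connection string with an embedded key,
# read from a ConnectionString entry in local.settings.json.
# service = TableServiceClient.from_connection_string(os.environ["TableConnectionString"])

# What the rule (and a consistent codebase) steers it toward:
service = TableServiceClient(
    endpoint="https://<storage-account>.table.core.windows.net",  # illustrative
    credential=DefaultAzureCredential(),
)
orders = service.get_table_client("orders")  # illustrative table name
```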

It solved the issue, but it also exposes the elephant in the room. Beyond debating whether this is a well-designed rule, you can debate whether a rule should have been needed in the first place: an inconsistent codebase was patched with a rule to close the gap.

In practice, consistent best practices applied across codebases yield dramatically better AI results. We’ve seen this repeatedly ourselves: the cleaner and more consistent the code the model builds upon, the less it falls back to generic training-data defaults or wrongly inferred patterns. And something else has changed: refactoring has become considerably easier. AI assistance makes the work of bringing codebases up to consistent standards cheaper and faster than it has ever been. This creates a virtuous cycle: cleaner code leads to better AI inference, which makes further refactoring easier, which produces cleaner code.

This is context engineering, even if it isn’t being framed that way today. Your codebase is the largest, most persistent, and most influential context the model will ever read. Maintaining it is the highest-ROI investment. It reduces the surface area where rules, skills, and commands are needed in the first place.

The above assumes someone is looking at the code. Vibecoding, developers describing intent without examining what gets generated, creates a very different dynamic, and one that deserves its own exploration.

The Investment Framework

| Feature | Where to invest | Offloaded to the IDE | Offloaded to the LLM |
| --- | --- | --- | --- |
| Rules | Project rules, team/company standards | Semantic search and indexing | Longer reasoning chains |
| Commands | Named workflows | | |
| Skills | Domain-specific executable capabilities | Best-practice tooling | Internalized tool capabilities |
| Subagents | | Built-in agents | Longer reasoning chains |

Context Engineering Investment Framework: invest in human-side context engineering.

The pattern is as follows: Tooling-side context engineering is not your problem. IDE vendors are solving it. Compensatory human-side engineering (elaborate rule libraries, manual token management, starting fresh conversations) is a temporary tax. Intentional human-side engineering (project rules, named workflows, team standards, domain-specific standards) is the durable layer.

The features that feel most magical (“set it and forget it, the LLM just knows”) are built on the shakiest foundation. They work around limitations that are actively being addressed. The features that feel most manual (“I have to explicitly invoke this”) are grounded in needs that don’t disappear when models get smarter. And that includes the hard work of grinding through your codebase, collaborating with AI on refactors, reviewing and shaping what it proposes. Because that effort compounds into the context every future request builds on.

Workarounds have expiration dates. When deciding where to invest your time, ask: Am I working around a limitation, or am I building something that will endure?


AI was used for formulation. The ideas, framework, and editorial decisions are my own.