A Context Engineering Investment Framework
AI coding assistants now offer multiple ways to customize how the LLM behaves. Cursor calls them Rules, Commands, Skills, Subagents, and Hooks. Similar concepts show up in other tools, sometimes with different names: Claude Code, for instance, has CLAUDE.md and slash commands, and GitHub Copilot has custom instructions.
There’s no shortage of articles explaining what each feature does and when to use which. You might even be better off just asking your AI chat agent. There is, however, a more important question few are answering: should you invest time in any of them?
In The Future of Context Engineering, we mapped LLM limitations to the three dimensions that drive context engineering: context window constraints, reasoning gaps, and memory absence. We traced how each is being addressed through scaling and architectural innovation.
This article applies that framework to current practices. The core question is not what these features are, but to what extent each one is worth maintaining in your environment as models and IDEs improve. The article uses Cursor’s terminology (it’s our go-to IDE), but the analysis applies to any AI coding tool built on the same LLM foundations.
The Landscape of Context Engineering
“Context engineering” is an umbrella term. It includes lightweight habits (starting a fresh chat when context gets noisy), platform capabilities (MCP integrations, RAG, code indexing), and established engineering controls (linting, CI). Most of that value is obvious and increasingly automated. This article focuses on the parts you still have to design yourself: Rules, Commands, Skills, Subagents, and Hooks. These are authored artifacts, and each asks for ongoing investment.
Briefly: Rules are system-level instructions that provide persistent context for Agent; Commands are reusable slash commands for explicit workflows; Skills are portable, version-controlled capability packages that can include scripts and load resources progressively; Subagents are specialized assistants delegated by the parent agent, each with an isolated context window; Hooks are custom scripts that observe, control, and extend the agent loop deterministically before or after defined stages.
As argued in The Future of Context Engineering, context engineering exists to compensate for concrete LLM limitations. The paper groups them into three dimensions: context window (finite space, imperfect attention), reasoning (essential vs. accidental complexity, confirmation bias), and memory (no continuous incremental learning across sessions). We can use this to evaluate how each practice offsets these limitations:
Not all context engineering is the same activity. Two fundamentally different kinds of work hide under the same label.
Tooling-side context engineering is what the IDE does for you: context placement, compression, retrieval, and parallelization. In Cursor, this includes indexing your repository and surfacing relevant code with semantic search. This work is increasingly automated release by release.
Human-side context engineering is what you do: writing rules, defining commands, packaging skills, structuring delegation, and crafting hooks. This is where the investment decision matters. It further splits into compensatory practices (workarounds for current limitations) and intentional practices (capturing workflow requirements that remain valuable regardless of model capability).
The sections that follow map each practice to where it’s heading: what stays with the engineers, what the IDE absorbs, and what the LLM will handle as models improve.
The Five Features: Where to Invest
Use rules to convey intent. Define tailored design principles, your coding style, or your company standards. Be concise, and skip elaborate rule sets, workflows, and propagation of industry standards.
Why this recommendation?
Rules are most useful when they convey intent the model cannot always infer from your codebase: personal or organizational coding style, team priorities, and standards before patterns are consistently established.
As models improve, better attention mechanisms make rules more reliable. At the same time better reasoning makes them less necessary. What remains hardest for the model to infer is intent behind the codebase.
Elaborate rule sets also consume context window space and are irrelevant to most prompts. Keep rule content concise and focused on durable intent.
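As an illustration, a concise intent-carrying rule might look like the sketch below. It assumes Cursor’s .mdc rule format with frontmatter; the paths, globs, and conventions shown are invented placeholders, not recommendations.

```
---
description: API error handling conventions
globs: src/api/**/*.ts
alwaysApply: false
---

- Return structured error bodies; never leak stack traces to clients.
- Prefer our Result type over throwing across service boundaries.
- New endpoints follow the naming scheme documented in src/api/README.md.
```

Note what the rule does not contain: nothing the model could infer from consistent code, only the intent behind it.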
Invest in commands for recurring, context-sensitive workflows. Commands are the most targeted tool to steer reasoning, and they remain useful no matter how capable models become.
Why this recommendation?
Commands reduce prompt-writing overhead, enforce structure, and lower omission risk. When you type a command, you’re expressing explicit intent to trigger a specific workflow.
Shell aliases and IDE macros already provide named, repeatable workflows. A macro is ideal for deterministic replay. A command is stronger when output must adapt to repository conventions, surrounding code, and task-specific constraints.
At team level, commands improve consistency and auditability without requiring everyone to remember the same prompt recipe.
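For a sense of scale, a command can be as small as a named markdown prompt. The sketch below assumes Cursor’s convention of markdown files under .cursor/commands/; the filename and workflow steps are illustrative only.

```
# .cursor/commands/pre-merge-review.md (illustrative)

Review the staged changes before merge:
1. Check for breaking API changes against existing callers.
2. Verify new code follows the conventions of the surrounding module.
3. List any missing tests, ordered by risk.
```

Unlike a macro, each step adapts to the repository and the change at hand, which is exactly where commands beat deterministic replay.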
Build skills for genuinely reusable capabilities involving scripts or complex procedures that are unique to your domain. If your “skill” would just be text, use a command or a rule instead.
Why this recommendation?
Skills solve two problems: token efficiency (content enters context only when relevant) and capability packaging (bundling instructions with executable scripts).
As LLMs improve, token efficiency will be less of a concern, but capability packaging is different. A skill that gives the model something to execute will remain valuable even as models improve.
Skills that package capabilities (executable scripts, multi-step procedures, reusable scaffolding) are most likely to endure. If a skill is only text, a rule or command is usually a better fit.
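A capability-packaging skill typically bundles instructions with executable resources. The layout below is an illustrative sketch (file names invented); the key property is that SKILL.md carries a short description the agent matches on, while scripts and reference material load only when the skill is triggered.

```
release-notes-skill/
├── SKILL.md          # name + description frontmatter; loaded when relevant
├── scripts/
│   └── generate.py   # executable the agent runs instead of re-deriving steps
└── reference.md      # detailed guidance, loaded progressively when needed
```

If everything in this tree would collapse into SKILL.md alone, that is the signal to reach for a rule or command instead.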
Subagents are powerful, but don’t invest actively in crafting them yourself. Express your domain knowledge through rules, commands, and skills. Let the IDE decide how to distribute that knowledge across agents.
Why this recommendation?
Subagents reduce context pollution. Spawning a subagent creates a fresh context window within the agent session and limits what returns to the orchestrator. They also enable parallel execution and specialization, such as a focused “explore agent” analyzing codebases. These benefits map to the analysis of cognitive offloading.
When building agentic products, subagent architecture is core design work. This article is however about AI-assisted software engineering, and in that context, subagent design is mostly tooling-side context engineering.
For software engineering, most subagent use cases are cross-cutting concerns (analyzing codebases, running shell commands, browser testing). Covering those belongs to the IDE.
Invest in hooks for governance and security guardrails. Hooks are the only feature whose value is entirely independent of model capability. Better models don’t reduce the need; you want enforcement to stay deterministic.
Why this recommendation?
Rules, Commands, Skills, and Subagents all shape the model’s context probabilistically. They influence what the model sees and reasons about, but the model still interprets them. Hooks operate fundamentally differently. They control what enters and exits the model’s context boundary, and they are able to do so deterministically.
What makes command-based hooks deterministic is their architecture. Hooks are spawned processes that communicate over stdio using JSON. A hook script receives structured input, evaluates it programmatically, and returns a permission field (allow, deny, or ask). The agent loop reads this field and acts on it. The model does not evaluate the decision, because the evaluation is a separate OS process.
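To make this concrete, here is a minimal sketch of such a hook script in Python. The field names (command, permission, userMessage) are assumptions modeled on the description above, not a definitive schema; check your tool’s hook documentation for the real contract.

```python
"""Minimal pre-execution hook sketch: deny shell commands that touch secrets.

Field names ("command", "permission", "userMessage") are illustrative
assumptions; consult your tool's hook schema for the actual contract.
"""
import json
import re
import sys

# Patterns that indicate access to secrets; extend for your organization.
BLOCKED = [r"\.env\b", r"id_rsa", r"secrets?\.(json|ya?ml)"]


def evaluate(event: dict) -> dict:
    """Return a deterministic permission decision for a tool-call event."""
    command = event.get("command", "")
    if any(re.search(pattern, command) for pattern in BLOCKED):
        return {"permission": "deny", "userMessage": "Blocked: touches secrets."}
    return {"permission": "allow"}


def main() -> None:
    # The agent loop pipes the event as JSON on stdin and reads the
    # decision from stdout; the model never sees this evaluation.
    print(json.dumps(evaluate(json.load(sys.stdin))))
```

Because the decision is computed in a separate process with ordinary code, the same input always yields the same verdict, which is the whole point.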
This is especially crucial for governance and security policies, where results must be consistent. Security is a specialist craft with cross-cutting concerns for every organization, so it’s no surprise that third-party vendors are emerging to fill the gap.

Use hooks for hard constraints where failure has organizational consequences: secrets prevention, compliance enforcement, audit logging. Use rules when the model needs to understand something to produce better output: coding standards, architectural patterns, domain conventions. The two complement each other for the same requirement: the rule increases the chance of getting it right the first time, and the hook guards against violations. Do keep in mind that hooks run on every invocation and add a small latency that compounds across operations.
Note that Cursor also supports prompt-based hooks, which are best viewed as deterministically invoked subagents; from a context engineering perspective, they are simply another way to implement subagents.
The Code As Context
Rules, commands, skills, subagents, and hooks are all mechanisms to manage context on top of your code. But it is also important to emphasize that the code itself is the primary context. The model reads your codebase before it reads your rules. Consistent, up-to-date patterns give the model clean signals to build on. Inconsistent code gives it contradictory signals that no amount of rules can fully compensate for.
In practice, consistent best practices applied across a codebase yield dramatically better AI results. We’ve seen this repeatedly: the cleaner and more consistent the code the model builds upon, the less it falls back on generic training-data defaults or wrongly inferred patterns. Something else has changed as well: refactoring has become considerably easier. AI assistance makes bringing codebases up to consistent standards cheaper and faster than it has ever been. This creates a virtuous cycle: cleaner code leads to better AI inference, which makes further refactoring easier, which produces cleaner code.
This is context engineering, even if it isn’t being framed that way today. Your codebase is the largest, most persistent, and most influential context the model will ever read. Maintaining it is the highest-ROI investment. It reduces the surface area where rules, skills and commands are needed in the first place.
The above assumes someone is looking at the code. Vibecoding, where developers describe intent without examining what gets generated, creates a very different dynamic, one that deserves its own exploration.
The Investment Framework
The pattern is as follows: Tooling-side context engineering is not your problem. IDE vendors are solving it. Compensatory human-side engineering (e.g. elaborate rule libraries and token management) is a temporary tax. Intentional human-side engineering (project rules, named workflows, team standards, domain-specific capabilities, deterministic guardrails) is the durable layer.
The features that require you to actually think about the needs of your domain or your company are grounded in requirements that don’t disappear when models get smarter. That includes the hard work of grinding through your codebase, collaborating with AI on refactors, and reviewing and shaping what it proposes, because that effort compounds into the context every future request builds on.
When deciding where to invest your time, ask yourself: Am I working around a limitation, or am I building something that will endure?