What Amazon's Agent SOPs Teach Us About Sharing Reusable AI Workflows
In late 2025, just before re:Invent, Amazon quietly open-sourced something that deserves more attention than it got: Agent SOPs. The name is deliberately unglamorous. Standard Operating Procedures. The kind of document you associate with factory floors and compliance binders, not cutting-edge AI.
That is exactly what makes it interesting.
Agent SOPs are a standardized markdown format for defining AI agent workflows. Not another framework. Not another SDK. Just structured markdown files that tell agents what to do, step by step, with enough rigor to produce consistent results and enough flexibility to let the model think.
The idea emerged from Amazon's internal builder community, where teams deploying agents with the Strands SDK kept running into the same problem: model-driven reasoning is powerful, but it is also unpredictable. An agent that works perfectly in a demo can behave erratically under production workloads. The breakthrough, as the AWS team put it, was finding a "determin-ish-tic" sweet spot — structured enough for reliability, flexible enough for intelligence.
Amazon got several things right with this release. But the project also reveals a gap that the broader AI community has not yet closed. Let us walk through both.
What Amazon got right
1. Markdown as the universal format
The single best decision in the Agent SOPs project is the choice of plain markdown as the authoring format. Not YAML. Not JSON. Not a proprietary DSL. Markdown.
This matters more than it seems. Markdown is human-readable, version-controllable, diffable in pull requests, and understood by every LLM on the market. A developer can open an SOP in any text editor, understand what it does, modify it, and commit it alongside the code it supports.
The format also means SOPs are inherently portable. They work with Claude Code, Cursor, Cline, Kiro, Amazon Q Developer, and any MCP-compatible tool. They run directly in LLMs like Claude and GPT-4 without any translation layer. No vendor lock-in, no runtime dependency.
This is the right call. The AI tooling ecosystem is fragmenting rapidly — new frameworks and platforms appear weekly. Betting on the lowest common denominator (plain text files that models can read natively) is the only format choice that ages well.
2. RFC 2119 constraint keywords
Agent SOPs borrow the MUST / SHOULD / MAY keyword hierarchy from RFC 2119, the same standard used to write internet protocols. This is a small detail with large consequences.
Consider the difference between these two instructions:
"Validate the output format before returning results."
versus:
"You MUST validate the output format before returning results. You SHOULD include line numbers in error messages. You MAY add suggested fixes if the validation fails."
The first is ambiguous. An agent might skip validation if it seems unnecessary. The second gives the model a clear hierarchy: one thing is mandatory, one is recommended, one is optional. The agent can reason about trade-offs on the SHOULD and MAY items while never skipping the MUST items.
Here is what this looks like in practice, based on the structure from the Agent SOPs repository:
# Code Assist SOP
## Overview
Guide code implementation using test-driven development principles,
following a structured Explore-Plan-Code-Commit workflow.
## Parameters
- `task_description` (required): Description of the task to implement
- `mode` (optional, default: "interactive"): "interactive" or "fsc"
## Step 1: Setup
Initialize the project environment.
### Constraints
- You MUST validate and create the documentation directory structure
- You MUST discover existing instruction files
- You SHOULD read all discovered instruction files before proceeding
- You MAY create additional documentation directories if needed
## Step 2: Explore
Analyze the existing codebase relevant to the task.
### Constraints
- You MUST identify all files affected by the task
- You SHOULD map dependencies between affected files
- You MAY suggest scope reductions if the task is too broad
## Step 3: Plan
...
This structure gives agents clear boundaries without micromanaging every decision. It is the difference between a recipe ("do exactly this") and a professional brief ("here are the requirements and constraints, use your judgment on everything else").
3. Parameterized inputs
Instead of hardcoding specific values, SOPs accept parameters that customize behavior for different contexts. A `task_description` parameter means the same SOP can handle "add authentication to the API" and "refactor the database layer" without modification.
This is what transforms a one-off prompt into a reusable template. The SOP becomes a function, not a script.
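That function-like quality can be sketched as a simple template-rendering step. Note that `render_sop` and the inline SOP text below are hypothetical illustrations of the parameter model, not part of the actual Strands tooling:

```python
# Hypothetical sketch: a parameterized SOP treated as a function of its
# inputs. render_sop and this inline SOP are illustrative, not the real
# Agent SOPs implementation.

CODE_ASSIST_SOP = """\
# Code Assist SOP
## Parameters
- `task_description` (required): {task_description}
- `mode` (optional, default: "interactive"): {mode}
"""

def render_sop(template: str, **params: str) -> str:
    """Fill {placeholder} slots in an SOP template with concrete values."""
    return template.format(**params)

# The same SOP handles two very different tasks without modification.
auth_prompt = render_sop(CODE_ASSIST_SOP,
                         task_description="add authentication to the API",
                         mode="interactive")
refactor_prompt = render_sop(CODE_ASSIST_SOP,
                             task_description="refactor the database layer",
                             mode="interactive")
```

The point of the sketch is that the SOP text stays fixed while the parameters vary — exactly the property that makes it reusable.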
4. Composability through chaining
SOPs can be chained together to execute complex, multi-phase workflows. A high-level SOP for "ship a feature" might invoke a code-task-generator SOP, then a code-assist SOP for each generated task, then a code-review SOP for the final output.
This composability is essential for scaling agent workflows beyond simple single-turn tasks.
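A hypothetical top-level SOP following this pattern might look like the following. The sub-SOP names come from Amazon's published examples; the step wording and constraints here are illustrative, not taken from the repository:

```markdown
# Ship Feature SOP

## Parameters
- `feature_description` (required): The feature to implement

## Step 1: Generate tasks
Invoke the `code-task-generator` SOP with `feature_description` as input.

### Constraints
- You MUST produce a numbered list of implementation tasks

## Step 2: Implement each task
For each generated task, invoke the `code-assist` SOP with the task as
its `task_description` parameter.

### Constraints
- You MUST complete tasks in dependency order

## Step 3: Review
Invoke the `code-review` SOP on the combined changes.

### Constraints
- You MUST NOT conclude until all MUST-level review findings are resolved
```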
The format in context: Agent SOPs vs. AGENTS.md vs. SOUL.md
Agent SOPs did not emerge in a vacuum. The AI development community has been converging on markdown-based configuration files from multiple directions simultaneously. Understanding where SOPs fit requires looking at the broader landscape.
AGENTS.md: how to work on this codebase
AGENTS.md, now backed by the Linux Foundation and adopted by over 20,000 repositories, is essentially a README for AI coding agents. It tells agents how to build your project, how to run tests, what code style to follow, and which patterns to use or avoid. It is project context — the equivalent of onboarding documentation for a new team member.
AGENTS.md answers the question: "What do you need to know about this codebase?"
SOUL.md: who the agent is
SOUL.md defines an agent's identity, personality, values, and boundaries. It is the configuration file for who an agent is, not what it does. A well-written SOUL.md means the agent's tone and judgment stay consistent across every interaction — important when the agent represents a brand or handles customer-facing communication.
SOUL.md answers the question: "How should you behave?"
Agent SOPs: how to execute this process
Agent SOPs fill a different niche entirely. They are procedural. They define multi-step workflows with explicit inputs, constraints, validation checkpoints, and output specifications. An SOP does not care about the codebase structure (that is AGENTS.md territory) or the agent's personality (that is SOUL.md territory). It cares about process.
Agent SOPs answer the question: "What steps should you follow to accomplish this specific task?"
Here is a side-by-side comparison:
| Aspect | AGENTS.md | SOUL.md | Agent SOPs |
|---|---|---|---|
| Purpose | Project coding instructions | Agent identity and personality | Procedural workflows |
| Scope | Per-repository | Per-agent | Per-task or per-process |
| Contains | Build steps, test commands, code conventions | Worldview, tone, boundaries, anti-patterns | Steps, parameters, constraints, validation |
| Audience | Any AI coding agent | Any conversational agent | Any agent executing a defined process |
| Backed by | Linux Foundation | Community standard | AWS / Strands Agents |
| Constraint model | Informal (prose) | Informal (prose) | Formal (RFC 2119 keywords) |
These formats are complementary, not competing. A well-configured agent workspace might use all three: AGENTS.md for codebase context, SOUL.md (or a memory/personality file) for behavioral tuning, and Agent SOPs for repeatable workflows.
The interesting observation is that all three converged independently on the same medium: plain markdown files, stored in the project directory, version-controlled with git. While the industry spent millions building vector databases and complex RAG pipelines, the most practical agent configuration turned out to be text files.
What is missing: the discovery and distribution problem
Here is where the analysis gets more interesting. Amazon built a solid format and open-sourced four example SOPs: codebase-summary, prompt-driven development, code-task-generator, and code-assist. The repository includes an MCP server implementation and CLI tooling. The authoring experience is thoughtful — you can generate SOPs by chatting with an AI agent that understands the format specification.
But what happens after you create an SOP?
Today, the answer is: you commit it to your repository, maybe share the link on social media, and hope someone finds it. The strands-agents/agent-sop repository on GitHub serves as both the specification and the de facto distribution channel. If you want to find Agent SOPs other people have created, you search GitHub. There is no registry, no catalog, no categorization, no search by use case.
This is the npm-before-npm problem. Node.js modules existed for years before npmjs.com gave developers a way to publish, discover, and install them. The format was fine. The distribution was the bottleneck.
Agent SOPs have the same gap:
No centralized discovery. You cannot browse available SOPs by category (DevOps, content creation, data analysis, code review). You cannot filter by compatibility (Claude, GPT-4, Cursor). You cannot search by problem ("I need an SOP for incident response").
No community signals. There are no download counts, ratings, reviews, or usage statistics. If someone publishes a brilliant SOP for database migration, you have no way to know it exists or that other teams have validated it.
No versioning across a shared catalog. Individual SOPs can be versioned in their own repositories, but there is no ecosystem-level version management. No way to say "I want the latest stable version of the code-review SOP that works with Claude 4."
No bundling with context. An SOP defines a workflow, but workflows do not exist in isolation. A code-assist SOP works differently depending on the agent's memory, available skills, tool configurations, and personality settings. The SOP format has no standard way to bundle these dependencies.
This is not a criticism of Amazon's release. Open-sourcing the format was the right first step. But format standardization without distribution infrastructure is like designing a shipping container without building ports.
From individual SOPs to complete workspace templates
The distribution gap points toward a broader insight: the unit of sharing for AI agent workflows is not the individual procedure. It is the complete workspace.
Think about what makes an agent effective at a task. It is rarely a single SOP in isolation. It is the combination of:
- Procedures (SOPs, workflows, step-by-step instructions)
- Memory (context about the user, the project, past interactions)
- Skills (tool integrations, API connections, specialized capabilities)
- Personality (tone, judgment, communication style)
- Configuration (model preferences, environment variables, runtime settings)
When someone shares a "code review agent," they are not sharing one SOP. They are sharing an entire workspace configuration that includes review procedures, coding standards memory, repository analysis skills, a communication style tuned for constructive feedback, and configuration for the right model and tools.
This is the workspace template model. A template packages everything an agent needs for a specific use case into a single, shareable, deployable unit. It is the Docker image to the SOP's Dockerfile — not just the build steps, but the complete, runnable artifact.
Community-driven template marketplaces take this further by adding the distribution layer that raw file sharing lacks. Contributors upload complete workspace configurations. Users browse by category, read community reviews, and deploy with minimal setup. Version management happens at the platform level. Discovery is built in.
On platforms like ClawAgora, this model is already in use. Contributors package their agent configurations — including procedural workflows, memory files, skill definitions, and personality settings — into templates that anyone can download and use. The marketplace handles discovery, categorization, and community feedback. The contributor focuses on building a great agent configuration; the platform handles distribution.
Lessons for template creators from the Agent SOPs format
Whether you are writing Agent SOPs, building workspace templates, or creating any kind of shareable agent configuration, Amazon's work offers several practical lessons worth adopting.
Use RFC 2119 keywords in your instructions
This is the single most transferable idea from the Agent SOPs project. If your template includes any procedural instructions — and most do — adopt the MUST / SHOULD / MAY hierarchy. It costs nothing, works with every model, and measurably improves consistency.
Instead of:
When reviewing code, check for security vulnerabilities and suggest
performance improvements. Include line numbers in your feedback.
Write:
When reviewing code:
- You MUST check for security vulnerabilities (SQL injection,
XSS, authentication bypasses)
- You MUST include file paths and line numbers in all feedback
- You SHOULD suggest performance improvements where applicable
- You MAY recommend refactoring opportunities if they reduce
complexity without changing behavior
The model now knows which instructions are non-negotiable and which allow discretion. This is especially valuable for templates shared with other people who may not understand the implicit priorities the original author had in mind.
Parameterize everything that varies
If your workflow includes anything specific to your project, your team, or your environment, make it a parameter. Hardcoded paths, repository names, team conventions, and language preferences should all be configurable.
A template that says "run pytest in the /tests directory" is useful to one team. A template that says "run {test_command} in the {test_directory} directory" is useful to every team.
Define explicit validation steps
Every SOP in Amazon's repository includes validation checkpoints — things the agent MUST verify before considering a step complete. This is good practice for any shared workflow.
Without validation steps, agents tend to optimistically declare success. "I reviewed the code" might mean "I skimmed the first file and moved on." Adding "You MUST confirm that every file modified in the PR has been reviewed and that all MUST-level findings have been documented" forces the agent to actually verify its work.
Structure for composability
Design your workflows as small, chainable units rather than monolithic mega-procedures. A "deploy to production" template should invoke a "run tests" procedure, a "build artifact" procedure, and a "deploy artifact" procedure — not inline all three.
This makes individual components reusable across different workflows and easier to test in isolation.
Document the context your workflow assumes
Agent SOPs are self-contained procedure definitions. But when you share a complete workspace template, document what else the agent needs. Which skills should be installed? What memory files does the workflow reference? What tools does it expect to be available?
The gap between "this SOP works on my machine" and "this template works for anyone" is usually undocumented dependencies.
The bigger picture: from prompt engineering to process engineering
Agent SOPs represent a meaningful shift in how we think about AI agent development. The early era of agent building was dominated by prompt engineering — crafting the perfect system prompt, tweaking temperature settings, finding the right words to make the model behave.
SOPs push the discipline toward process engineering. The focus moves from "how do I phrase this instruction" to "how do I design this workflow." The questions change from "what words make the model respond correctly" to "what steps, constraints, and validation checkpoints produce reliable outcomes."
This is the same maturation arc that every software discipline goes through. Web development moved from inline scripts to component architectures. DevOps moved from manual server configuration to infrastructure as code. AI agent development is now moving from artisanal prompts to structured, reusable, version-controlled workflows.
The tools and formats are converging: Agent SOPs for procedures, AGENTS.md for project context, SOUL.md for personality, workspace templates for complete configurations, and community marketplaces for distribution. None of these existed two years ago. All of them are becoming standard practice.
What makes this moment interesting is not any single format or tool. It is that the ecosystem is reaching the point where agent configurations are genuinely portable and shareable. You can build an effective agent workflow, package it as a template, share it with the community, and someone on the other side of the world can deploy it and get the same results.
Amazon's Agent SOPs are one piece of that puzzle — and an important one. The standardized format, the RFC 2119 constraints, the parameterized inputs, the composability model: all of these are good ideas that the community should adopt broadly.
The next piece is distribution. Formats define how workflows are written. Registries and marketplaces define how they are found, shared, and trusted. The AI agent ecosystem needs both.
Frequently asked questions
What are Amazon Agent SOPs?
Amazon Agent SOPs (Standard Operating Procedures) are a standardized markdown format for defining AI agent workflows. Open-sourced through the Strands Agents project, they use natural language instructions with RFC 2119 constraint keywords (MUST, SHOULD, MAY) and parameterized inputs to create reusable workflows. They work across platforms including Claude Code, Cursor, Kiro, and Amazon Q Developer, and can run directly in LLMs like Claude and GPT-4 without any translation layer.
How do Agent SOPs differ from AGENTS.md and SOUL.md?
These three formats serve different purposes and are complementary. AGENTS.md, backed by the Linux Foundation, provides project-level coding instructions — build steps, test commands, code style. SOUL.md defines an agent's personality, identity, and values. Agent SOPs define multi-step procedural workflows with explicit parameters, constraints, and validation checkpoints. You might use all three in a single project: AGENTS.md for codebase context, SOUL.md for behavioral configuration, and Agent SOPs for repeatable task execution.
Can I use Amazon Agent SOPs with tools other than AWS services?
Yes. Agent SOPs are plain markdown files with no AWS dependency. They work with any tool that accepts natural language instructions: Claude Code (copy to your project directory), Cursor (save as .cursor/rules/agent-sop-format.mdc), Cline (save to .clinerules/), any MCP-compatible tool, and directly with LLMs. The format is deliberately model-agnostic and platform-agnostic.
What is missing from the Agent SOPs ecosystem?
The primary gap is discovery and distribution. Agent SOPs are shared as files in a GitHub repository with no centralized registry, search functionality, categorization, ratings, or community feedback. There is no way to browse SOPs by use case, filter by platform compatibility, or see which SOPs other teams have validated in production. Individual SOPs also lack a standard for bundling the broader context (memory, skills, environment settings) they depend on.
How do workspace templates extend the Agent SOPs concept?
Workspace templates bundle procedural workflows together with everything else an agent needs: memory files, skill definitions, tool configurations, personality settings, and environment configuration. A template marketplace adds discovery (search and categories), trust signals (ratings, reviews, download counts), version management, and simplified deployment. This addresses the distribution gap in the current Agent SOPs ecosystem by turning isolated workflow files into complete, shareable, deployable agent configurations.