
Synthetic tests: end-to-end agentic workflows.

Armature runs synthetic user goals through the same agent-facing interfaces your users depend on.

Targets: MCP, CLI, OpenAPI
Harnesses: Codex, Claude Code, OpenClaw
Output: Runs, alerts, repair proposals

01 What runs

A synthetic test is a user goal run by a real agent against your agent-facing interface. The target can be an MCP server, CLI, OpenAPI-backed surface, or hosted Armature interface.

  • Goal: "find the latest failed deployment and explain why."
  • Agent: chooses tools, passes arguments, handles errors, and returns an answer.
  • Evidence: outcome, tool path, timing, errors, and trace quality.
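The goal, agent behavior, and evidence listed above can be pictured as a single record per run. The following is a minimal sketch only: the class names, field names, and tool names (`ToolCall`, `SyntheticRun`, `list_deployments`, `get_logs`) are hypothetical and are not Armature's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    """One step in the agent's tool path (hypothetical shape)."""
    tool: str
    arguments: Dict[str, Any]
    duration_ms: float
    error: Optional[str] = None


@dataclass
class SyntheticRun:
    """Evidence for one goal run by one agent (hypothetical shape)."""
    goal: str
    answer: str
    passed: bool
    tool_path: List[ToolCall] = field(default_factory=list)

    def total_ms(self) -> float:
        # Timing: sum of the durations of every tool call in the path.
        return sum(call.duration_ms for call in self.tool_path)

    def errors(self) -> List[str]:
        # Errors: only the calls that actually failed.
        return [call.error for call in self.tool_path if call.error is not None]


# Illustrative data only; the tool names and values are made up.
example_run = SyntheticRun(
    goal="find the latest failed deployment and explain why.",
    answer="(the agent's answer would go here)",
    passed=False,
    tool_path=[
        ToolCall("list_deployments", {"status": "failed", "limit": 1}, 120.0),
        ToolCall("get_logs", {"deployment_id": "example-id"}, 340.0, error="timeout"),
    ],
)
```

Keeping the tool path as an ordered list means timing and errors can be derived from it rather than stored separately, which is one plausible way to keep the evidence consistent.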

02 How it runs

  1. Connect. MCP, CLI, OpenAPI, or hosted Armature surface.
  2. Run. One workflow across the harness x model matrix.
  3. Review. Passes stay quiet; failures carry the repair trace.
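The review step above ("passes stay quiet; failures carry the repair trace") amounts to a filter over run results. The sketch below assumes each result is a plain dictionary with `harness`, `model`, `passed`, and `trace` keys; those keys and the `review` function are illustrative assumptions, not Armature's output format.

```python
from typing import Any, Dict, List


def review(runs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return only failed runs with their traces; passing runs yield nothing."""
    return [
        {"harness": run["harness"], "model": run["model"], "trace": run["trace"]}
        for run in runs
        if not run["passed"]
    ]


# Illustrative results only; model names and trace lines are placeholders.
results = [
    {"harness": "Codex", "model": "model-a", "passed": True, "trace": []},
    {
        "harness": "Claude Code",
        "model": "model-a",
        "passed": False,
        "trace": ["list_deployments: ok", "get_logs: error: timeout"],
    },
]

failures = review(results)
for failure in failures:
    print(failure["harness"], failure["model"], failure["trace"])
```

With these placeholder results, only the failing Claude Code run is reported; the passing Codex run produces no output.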

03 Harness coverage

Verify the same workflow across every harness x model pair because one agent can pass while another fails.

Example workflow: Investigate a failed deployment
Harnesses: Claude Code, Codex, Cursor, OpenClaw, Gemini CLI, OpenCode, ChatGPT
Models: Sonnet 4.6, GPT-5.5, Kimi K2.5, Qwen3.5 Coder, Gemini 3 Pro
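One way to picture the harness x model matrix is as a cross product of the two lists, with one run per pair. This is a rough sketch under two assumptions: that the names listed above split into seven harnesses and five models as shown, and that every pair is runnable (in practice a harness may support only some models). The `build_matrix` function is hypothetical.

```python
from itertools import product
from typing import Dict, List

HARNESSES = [
    "Claude Code", "Codex", "Cursor", "OpenClaw",
    "Gemini CLI", "OpenCode", "ChatGPT",
]
MODELS = ["Sonnet 4.6", "GPT-5.5", "Kimi K2.5", "Qwen3.5 Coder", "Gemini 3 Pro"]


def build_matrix(goal: str, harnesses: List[str], models: List[str]) -> List[Dict[str, str]]:
    """Expand one workflow goal into one planned run per harness/model pair."""
    return [
        {"goal": goal, "harness": harness, "model": model}
        for harness, model in product(harnesses, models)
    ]


matrix = build_matrix("Investigate a failed deployment", HARNESSES, MODELS)
print(len(matrix))  # 7 harnesses x 5 models = 35 planned runs
```

Each entry in `matrix` describes a single planned run, so a pass in one cell says nothing about the others until they have been run too.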

Want this against your MCP or CLI?

Bring a target and one workflow goal. We can help you get the first workflow live.
