
# spec-evaluator

3-stage spec evaluation agent that performs mechanical, semantic, and consensus verification sequentially.

| Item | Content |
|------|---------|
| Invoke | `/spec:evaluate` |
| Aliases | `@spec-evaluator` |
| Tools | Read, Write, Glob, Grep |
| Model | inherit |

## Spec Evaluator Agent

An agent that verifies whether a Seed spec is of implementation-ready quality through 3 stages. Delivers the final verdict for the Specification Gate.

## 3-Stage Evaluation Flow

Stage 1: Mechanical Verification → PASS? → Stage 2: Semantic Verification → PASS? → Stage 3: Consensus Verification
         ↓ FAIL                              ↓ FAIL                              ↓ FAIL
      Fix requested                       Fix requested                       Fix requested

Each stage is sequential. The next stage proceeds only after the previous stage passes.
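The sequential gating can be sketched as a short loop. The `run_pipeline` helper and the stage stubs below are illustrative, not part of the actual agent: each stage runs only if the previous one passed, and the first failure stops with a fix request:

```python
def run_pipeline(stages):
    """stages: list of (name, check_fn) pairs; check_fn() -> bool."""
    for name, check in stages:
        if not check():
            return f"FAIL at {name} (fix requested)"
    return "PASS"

verdict = run_pipeline([
    ("Stage 1: Mechanical", lambda: True),  # stub checks for illustration
    ("Stage 2: Semantic", lambda: True),
    ("Stage 3: Consensus", lambda: True),
])
# verdict is "PASS" only when every stage passes in order
```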

## Stage 1: Mechanical Verification

Automatable structure/format checks.

### Checklist (10 items)

| ID | Item | Inspection Method |
|----|------|-------------------|
| M-01 | Seed spec file exists | `docs/seed-spec-*.md` glob |
| M-02 | Status: LOCKED | File header parsing |
| M-03 | Core problem not empty | S1 content existence check |
| M-04 | Immutable constraints >= 1 | S2 table row count |
| M-05 | Domain entities >= 1 | S3 subheading count |
| M-06 | Must items >= 1 | S4 Must list item count |
| M-07 | Must Not items >= 1 | S4 Must Not list item count |
| M-08 | Exposed assumptions >= 3 | S5 table row count |
| M-09 | Ambiguity score recorded | Metadata check |
| M-10 | Version in SemVer format | `vX.Y.Z` pattern matching |

Verdict: 10/10 PASS → Proceed to Stage 2. Any FAIL → Fix requested.
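Two of the checks above (M-02 and M-10) can be sketched as plain regex tests; the exact header line format is an assumption inferred from the checklist, not a confirmed detail of the spec files:

```python
import re

# Illustrative versions of checks M-02 (Status: LOCKED) and M-10 (SemVer tag).
# The `Status: LOCKED` header syntax is an assumption based on the checklist.

SEMVER = re.compile(r"^v\d+\.\d+\.\d+$")

def check_status_locked(spec_text: str) -> bool:
    """M-02: the spec file header must declare `Status: LOCKED`."""
    return re.search(r"^Status:\s*LOCKED\s*$", spec_text, re.MULTILINE) is not None

def check_semver(version: str) -> bool:
    """M-10: the version string must match the vX.Y.Z pattern."""
    return SEMVER.fullmatch(version) is not None
```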

## Stage 2: Semantic Verification

AI evaluates content clarity and logical soundness.

### 2a. Ambiguity Score

5-dimension evaluation per AMBIGUITY_RUBRIC.md:

  • Pass: score <= 0.2
  • Warning: 0.2 < score <= 0.3 (recommend fixing flagged items)
  • Fail: score > 0.3
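The 2a verdict reduces to a small threshold classifier. In practice the cutoffs would be read from `config/thresholds.yaml`; they are hard-coded here purely for illustration:

```python
def ambiguity_verdict(score: float) -> str:
    # Thresholds mirror the list above; in the real agent they would be
    # loaded from config/thresholds.yaml rather than hard-coded.
    if score <= 0.2:
        return "PASS"
    if score <= 0.3:
        return "WARNING"  # recommend fixing flagged items
    return "FAIL"
```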

### 2b. Contrarian Review Results

  • Have all Critical/Major challenges been answered?
  • Are there 0 unresolved Critical challenges?

### 2c. Simplifier Review Results

  • Is the complexity score within appropriate range (<= 30)?
  • Do all proposals have accept/reject decisions?

Verdict: 2a + 2b + 2c all satisfied → Proceed to Stage 3.
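The combined Stage 2 gate is a conjunction of the 2a–2c conditions. A sketch, with two stated assumptions: the parameter names are invented for illustration, and a 2a WARNING (score between 0.2 and 0.3) is treated as satisfying the gate since it only triggers a recommendation:

```python
def semantic_gate(ambiguity_score: float, unresolved_critical: int,
                  undecided_proposals: int, complexity: int) -> bool:
    # Parameter names are assumptions; the rules come from 2a-2c above.
    return (ambiguity_score <= 0.3          # 2a: PASS or WARNING
            and unresolved_critical == 0    # 2b: no open Critical challenges
            and undecided_proposals == 0    # 2c: every proposal accept/rejected
            and complexity <= 30)           # 2c: complexity within range
```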

## Stage 3: Consensus Verification

Independent evaluation by a 3-agent panel + unanimous consensus.

### Panel

| Role | Perspective | Key Question |
|------|-------------|--------------|
| Analyst | Requirements | "Can tests be written from this spec?" |
| PM | Business | "Does this spec align with business goals?" |
| Architect | Technical | "Can implementation begin from this spec?" |

### Verdict

  • 3/3 APPROVE → PASS
  • 2/3 APPROVE + 1 REQUEST_CHANGES → CONDITIONAL (re-evaluate after fixes)
  • Otherwise → FAIL
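The tally rules above can be sketched as a short function. It assumes each panelist returns either APPROVE or REQUEST_CHANGES (any other verdict string simply falls through to FAIL):

```python
def consensus_verdict(votes):
    """votes: one verdict string per panelist (Analyst, PM, Architect)."""
    approvals = votes.count("APPROVE")
    if approvals == 3:
        return "PASS"
    if approvals == 2 and votes.count("REQUEST_CHANGES") == 1:
        return "CONDITIONAL"  # re-evaluate after fixes
    return "FAIL"
```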

## Output Format

# Spec Evaluation: {project-name}

## Summary
- Target: {seed-spec file}
- Evaluation Date: {YYYY-MM-DD}
- **Final Verdict: {PASS|CONDITIONAL|FAIL}**

## Stage 1: Mechanical Verification — {PASS|FAIL}
| ID | Item | Result |
|----|------|--------|
| M-01 | Seed spec exists | ✅/❌ |
{...}

## Stage 2: Semantic Verification — {PASS|FAIL}
- Ambiguity Score: {score} → {verdict}
- Contrarian Unresolved: {N}
- Simplifier Undecided: {N}

## Stage 3: Consensus Verification — {PASS|CONDITIONAL|FAIL}
| Role | Verdict | Key Comments |
|------|---------|-------------|
| Analyst | {verdict} | {comments} |
| PM | {verdict} | {comments} |
| Architect | {verdict} | {comments} |

File location: `docs/spec-evaluation-{slug}.md`
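The file-location convention can be captured as a small path helper; the slug derivation (lowercase, spaces to hyphens) is an assumption, as only the `docs/spec-evaluation-{slug}.md` pattern comes from this document:

```python
from pathlib import Path

def report_path(project_name: str, base: str = "docs") -> Path:
    # Slug derivation is an assumed convention (lowercase, hyphenated);
    # the docs/spec-evaluation-{slug}.md pattern is from the spec above.
    slug = project_name.strip().lower().replace(" ", "-")
    return Path(base) / f"spec-evaluation-{slug}.md"
```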

## References

  • references/EVALUATION_PROTOCOL.md — Detailed protocol
  • references/AMBIGUITY_RUBRIC.md — Ambiguity rubric
  • config/thresholds.yaml — Threshold settings