3-Stage Spec Evaluation Protocol — Mechanical, semantic, and consensus verification

This protocol verifies, in three stages, whether a spec document is ready for implementation. All three stages must pass to clear the specification gate.


Stage 1: Mechanical Verification

Checks structural and format completeness. Every check in this stage can be automated.

Checklist

| ID | Inspection Item | Pass Criteria |
|------|------|------|
| M-01 | Seed spec exists | `docs/seed-spec-*.md` file exists |
| M-02 | Status LOCKED | `Status: LOCKED` confirmed |
| M-03 | Core problem section | S1 is not empty |
| M-04 | Immutable constraints | At least 1 constraint in S2 |
| M-05 | Domain entities | At least 1 entity in S3 (attributes + relationships + invariants) |
| M-06 | Acceptance boundary Must | At least 1 item in S4 Must |
| M-07 | Acceptance boundary Must Not | At least 1 item in S4 Must Not |
| M-08 | Exposed assumptions | At least 3 assumptions in S5 |
| M-09 | Ambiguity score | Score recorded in metadata |
| M-10 | Version format | SemVer (vX.Y.Z) format |

Verdict

  • PASS: All items pass
  • FAIL: 1 or more failures → Provide failed item list and remediation guidance
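Because Stage 1 is automatable, a few of the checklist items can be sketched as a small script. The example below covers only M-01, M-02, and M-10 and is a minimal sketch under stated assumptions: the protocol does not say how the status and version are written inside the seed spec, so the `Status:` line pattern, the `vX.Y.Z` search, and all function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed field layouts; the protocol does not define how these appear in the file.
STATUS_RE = re.compile(r"^Status:\s*LOCKED\s*$", re.MULTILINE)
SEMVER_RE = re.compile(r"\bv\d+\.\d+\.\d+\b")


def find_seed_specs(root: Path) -> list[Path]:
    """M-01: list files matching docs/seed-spec-*.md under the given root."""
    return sorted((root / "docs").glob("seed-spec-*.md"))


def run_mechanical_checks(spec_text: str) -> dict[str, bool]:
    """Run a subset of the Stage 1 checklist against the text of one seed spec."""
    return {
        "M-02": bool(STATUS_RE.search(spec_text)),  # Status: LOCKED confirmed
        "M-10": bool(SEMVER_RE.search(spec_text)),  # version written as vX.Y.Z
    }


def stage1_verdict(results: dict[str, bool]) -> tuple[str, list[str]]:
    """PASS only if every item passes; otherwise FAIL plus the failed item IDs."""
    failed = sorted(item for item, ok in results.items() if not ok)
    return ("PASS" if not failed else "FAIL", failed)
```

The list of failed IDs returned by `stage1_verdict` corresponds to the "failed item list" that a FAIL verdict must provide; remediation guidance would still need to be attached per item.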

Stage 2: Semantic Verification

Checks content clarity and logical soundness. AI agents perform this stage.

2a. Ambiguity Score Evaluation

The score is calculated using the 5-dimension rubric from AMBIGUITY_RUBRIC.md.

  • Pass criteria: ambiguity_score ≤ 0.2
  • Warning: 0.2 < score ≤ 0.3 → Recommend fixing flagged items
  • Fail: score > 0.3 → Fixes required
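The three score bands above can be expressed as a single threshold function. This is a minimal sketch; the function name and the returned labels are illustrative, and the score itself is assumed to have been computed already from the rubric.

```python
def classify_ambiguity(score: float) -> str:
    """Map an ambiguity score to the band defined in section 2a."""
    if score <= 0.2:
        return "PASS"      # meets the pass criteria
    if score <= 0.3:
        return "WARNING"   # fixing flagged items is recommended
    return "FAIL"          # fixes are required
```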

2b. Contrarian Review

The Contrarian agent performs the following:

  1. Assumption attack: Ask "What if this is wrong?" for each immutable constraint
  2. Scale challenge: Ask "What if users grow 10x?"
  3. Removal test: Ask "If we remove this feature, does the core value hold?"

Output: Challenge report (responses required for each challenge)

2c. Simplifier Review

The Simplifier agent performs the following:

  1. Complexity measurement: Entity count, relationship count, constraint count
  2. Removal candidate identification: Items not contributing to core value
  3. Simplification proposals: Before/after complexity comparison

Output: Simplification report (accept/reject decision required)

Verdict

  • PASS: Ambiguity ≤ 0.2 AND all Contrarian challenges answered AND Simplifier report reviewed
  • FAIL: Above conditions not met
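The Stage 2 verdict is a conjunction of three conditions, which the sketch below spells out. The `Stage2Inputs` type and its field names are hypothetical. The code reads the PASS rule literally, so a score in the 2a warning band (above 0.2 and up to 0.3) does not satisfy it; whether that is the intended behavior is an assumption the protocol text does not settle.

```python
from dataclasses import dataclass


@dataclass
class Stage2Inputs:
    ambiguity_score: float
    challenges_total: int             # challenges raised by the Contrarian agent
    challenges_answered: int          # challenges that have received a response
    simplifier_report_reviewed: bool  # accept/reject decisions have been made


def stage2_verdict(inputs: Stage2Inputs) -> str:
    """PASS only when all three Stage 2 conditions hold; otherwise FAIL."""
    ambiguity_ok = inputs.ambiguity_score <= 0.2
    challenges_ok = inputs.challenges_answered >= inputs.challenges_total
    if ambiguity_ok and challenges_ok and inputs.simplifier_report_reviewed:
        return "PASS"
    return "FAIL"
```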

Stage 3: Consensus Verification

Three panel agents each evaluate the spec independently, and their verdicts are then combined into a consensus result.

Panel Composition

| Role | Perspective | Primary Review Areas |
|------|------|------|
| Analyst | Requirements | Completeness, testability, AC quality |
| PM | Business | Value alignment, prioritization, ROI |
| Architect | Technical | Feasibility, technical constraints, scalability |

Evaluation Process

  1. Each panel member independently reviews the Seed spec
  2. Each submits their evaluation:
    • APPROVE: Implementation-ready
    • REQUEST_CHANGES: Changes needed (specific items listed)
    • REJECT: Fundamental issues (reasons stated)
  3. Results aggregated

Verdict

  • PASS: All 3 APPROVE
  • CONDITIONAL: 2 APPROVE + 1 REQUEST_CHANGES → Re-evaluate after fixes
  • FAIL: 2+ REQUEST_CHANGES or 1+ REJECT
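The aggregation rules above can be sketched as a function over the three panel votes. The function name and the dictionary shape (role mapped to verdict string) are assumptions made for illustration; only the verdict values and the counting rules come from the protocol.

```python
from collections import Counter

VALID_VOTES = {"APPROVE", "REQUEST_CHANGES", "REJECT"}


def consensus_verdict(votes: dict[str, str]) -> str:
    """Aggregate exactly three panel votes into PASS, CONDITIONAL, or FAIL."""
    if len(votes) != 3 or not set(votes.values()) <= VALID_VOTES:
        raise ValueError("expected three votes, each APPROVE, REQUEST_CHANGES, or REJECT")
    counts = Counter(votes.values())
    # FAIL: 2+ REQUEST_CHANGES or 1+ REJECT
    if counts["REJECT"] >= 1 or counts["REQUEST_CHANGES"] >= 2:
        return "FAIL"
    # PASS: all 3 APPROVE
    if counts["APPROVE"] == 3:
        return "PASS"
    # Only remaining combination: 2 APPROVE + 1 REQUEST_CHANGES
    return "CONDITIONAL"
```

With three votes the rules are exhaustive: any combination that is neither FAIL nor a unanimous APPROVE must be two APPROVE votes plus one REQUEST_CHANGES.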

Integrated Verdict

Stage 1 PASS → Stage 2 PASS → Stage 3 PASS → ✅ Specification Gate Passed

If any stage fails, fix the issues found in that stage and re-evaluate it. Stages run sequentially: each stage starts only after the previous one has passed.
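The sequential rule can be sketched as a loop that stops at the first stage that does not return PASS. The `run_gate` name and the callable-per-stage shape are hypothetical. Treating a Stage 3 CONDITIONAL as "gate not yet cleared" is also an assumption, based on the rule that CONDITIONAL requires re-evaluation after fixes.

```python
from typing import Callable

Stage = tuple[str, Callable[[], str]]


def run_gate(stages: list[Stage]) -> tuple[bool, list[tuple[str, str]]]:
    """Run stages in order; stop at the first verdict that is not PASS."""
    history: list[tuple[str, str]] = []
    for name, evaluate in stages:
        verdict = evaluate()
        history.append((name, verdict))
        if verdict != "PASS":
            # Later stages are not executed until this one passes.
            return False, history
    return True, history
```

For example, if the second stage returns FAIL, the third stage's evaluator is never called, and the returned history shows where the gate stopped.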

Result Report Format

## Spec Evaluation Results

### Summary
- Seed Spec: {file path}
- Evaluation Date: {YYYY-MM-DD}
- **Final Verdict: {PASS|CONDITIONAL|FAIL}**

### Stage 1: Mechanical Verification — {PASS|FAIL}
{Checklist results table}

### Stage 2: Semantic Verification — {PASS|FAIL}
- Ambiguity Score: {score} ({verdict})
- Contrarian Challenges: {N} of {M} answered
- Simplifier Proposals: {N} of {M} reviewed

### Stage 3: Consensus Verification — {PASS|CONDITIONAL|FAIL}
| Role | Verdict | Key Comments |
|------|---------|-------------|
| Analyst | {APPROVE/REQUEST_CHANGES/REJECT} | {comments} |
| PM | {verdict} | {comments} |
| Architect | {verdict} | {comments} |