feature-qa

Feature-level QA Health Score evaluation and baseline regression tracking

| Item | Details |
|------|---------|
| Invoke | /feature-qa |
| Aliases | /qa, /quality-check |
| Category | petmedi-development |
| Complexity | moderate |

/feature-qa#

Evaluates a 5-axis QA Health Score for each feature and tracks regressions against saved baselines. It follows the gstack /qa pattern: a quantitative quality score combined with regression comparison against a baseline.

Triggers#

  • After feature implementation for quality verification
  • As a quality gate before PR creation
  • To check quality regression of existing features

Usage#

# Basic usage: QA by feature path
/feature-qa feature/auth/

# Regression comparison (against previous baseline)
/feature-qa feature/auth/ --regression

# Quick check (functional completeness only)
/feature-qa feature/auth/ --quick

# Specific axis only
/feature-qa feature/auth/ --focus ui,a11y

# Parallel evaluation (Agent Teams)
/feature-qa feature/auth/ --parallel

Parameters#

| Parameter | Required | Description | Example |
|-----------|----------|-------------|---------|
| feature_path | ✅ | Feature module path | feature/auth/ |
| --regression | ❌ | Compare against previous baseline | |
| --quick | ❌ | Quick check (functional completeness only) | |
| --focus | ❌ | Focus on specific axes | ui, a11y, perf, i18n, func |
| --parallel | ❌ | 5-axis parallel evaluation (requires Agent Teams) | |
| --save-baseline | ❌ | Save baseline (default: auto-save) | |
| --no-save | ❌ | Do not save baseline | |
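
The parameters above can be modeled as an ordinary argument parser. The sketch below is only an illustration in Python: the skill itself is a slash command, and `build_parser`, `parse_axes`, `VALID_AXES`, and the mutual exclusion of `--save-baseline` and `--no-save` are assumptions, not the skill's actual implementation.

```python
import argparse

# Axis names accepted by --focus, taken from the Example column above.
VALID_AXES = ("ui", "a11y", "perf", "i18n", "func")


def parse_axes(value: str) -> list:
    """Split a comma-separated axis list and reject unknown names."""
    axes = [a.strip() for a in value.split(",") if a.strip()]
    unknown = [a for a in axes if a not in VALID_AXES]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown axes: {', '.join(unknown)}")
    return axes


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="/feature-qa")
    parser.add_argument("feature_path", help="Feature module path, e.g. feature/auth/")
    parser.add_argument("--regression", action="store_true")
    parser.add_argument("--quick", action="store_true")
    parser.add_argument("--focus", type=parse_axes, default=None)
    parser.add_argument("--parallel", action="store_true")
    # Assumption: the two baseline flags cannot be combined.
    save = parser.add_mutually_exclusive_group()
    save.add_argument("--save-baseline", action="store_true")
    save.add_argument("--no-save", action="store_true")
    return parser


args = build_parser().parse_args(["feature/auth/", "--focus", "ui,a11y"])
print(args.feature_path, args.focus)
```

Passing the axes as a single comma-separated token (`ui,a11y`) keeps the value in one argument, matching the usage example earlier on this page.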

5-Axis Evaluation System#

Functional Completeness - Weight 30%#

| Item | Points | Inspection Method |
|------|--------|-------------------|
| Acceptance Criteria met | 30 | Map issue AC to implementation code |
| Edge case handling | 25 | Check null/empty/boundary handling |
| Error handling | 25 | try-catch, ErrorState, user messages |
| Test coverage | 20 | UseCase 100%, BLoC 80%+ target |

Inspection Items:

  • Are all ACs mapped to implementation code
  • Is null safety handled properly
  • Is there user feedback on network errors
  • Is empty list state (EmptyState) handled
  • Is loading state displayed

UI/UX (Visual) - Weight 20%#

| Item | Points | Inspection Method |
|------|--------|-------------------|
| Correct CoUI component usage | 30 | Verify component API compliance |
| Layout consistency | 25 | Gap/Insets constants used, alignment |
| Color system compliance | 25 | appColors/colorScheme used |
| Typography compliance | 20 | context.textStyles used |

Inspection Items:

  • Are CoUI component APIs used correctly (features parameter, etc.)
  • Are Gap/Insets constants used (no hardcoded values)
  • Are colors accessed via context.appColors / context.colorScheme
  • Are text styles accessed via context.textStyles
  • Are non-existent APIs such as ButtonSize.medium avoided

Accessibility - Weight 20%#

| Item | Points | Inspection Method |
|------|--------|-------------------|
| Semantic Labels | 35 | Labels present on images, icons, buttons |
| Touch target size | 30 | Minimum 48x48 verified |
| Color contrast | 20 | WCAG AA criteria |
| Screen reader order | 15 | Logical tab order |

Inspection Items:

  • Do Image/Icon have semanticLabel
  • Are touchable elements at least 48x48
  • Is text-background color contrast 4.5:1 or above
  • Are widgets arranged in a meaningful order
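
The 4.5:1 threshold in the checklist comes from the WCAG 2.x contrast-ratio formula, which can be checked directly. The sketch below implements that formula in Python. The helper names (`relative_luminance`, `contrast_ratio`, `passes_aa`) are illustrative and not part of the skill, and how the skill itself measures contrast is not stated on this page.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """(lighter + 0.05) / (darker + 0.05), independent of argument order."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str) -> bool:
    """AA threshold for normal-size text, as used in the checklist above."""
    return contrast_ratio(foreground, background) >= 4.5


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))  # 21.0
print(passes_aa("#767676", "#FFFFFF"), passes_aa("#777777", "#FFFFFF"))
```

Black on white gives the maximum ratio of 21:1. The two greys in the last line sit just either side of the 4.5:1 boundary on a white background.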

Performance - Weight 15%#

| Item | Points | Inspection Method |
|------|--------|-------------------|
| const widget usage | 30 | Check const-eligible widgets |
| BlocBuilder optimization | 25 | buildWhen/listenWhen usage |
| Image optimization | 25 | cacheWidth/cacheHeight applied |
| Resource cleanup | 20 | dispose/cancel verified |

Inspection Items:

  • Are const-eligible widgets declared as const
  • Is buildWhen applied to BlocBuilder
  • Do network images have cacheWidth/cacheHeight
  • Are Stream subscriptions cancelled in dispose
  • Is BLoC isClosed checked in async handlers

Internationalization (i18n) - Weight 15%#

| Item | Points | Inspection Method |
|------|--------|-------------------|
| Translation key usage | 40 | context.t.* pattern used |
| No hardcoded strings | 30 | UI text hardcoding detection |
| Pluralization handling | 15 | plural/ordinal applied |
| Dynamic value parameterization | 15 | Parameters instead of string interpolation |

Inspection Items:

  • Do all UI texts use context.t.*
  • Are there no hardcoded Korean/English strings
  • Is pluralization handled for texts containing numbers
  • Are dynamic values passed as parameters
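
One way hardcoded UI text could be detected is a simple pattern scan over the Dart source. The following Python sketch is a heuristic, not the skill's actual detector: it only flags string literals passed directly to a `Text(...)` constructor. The function name `find_hardcoded_text` and the translation key in the sample are hypothetical.

```python
import re

# Matches Text('...') or Text("...") whose first argument is a string literal.
# Heuristic assumption: translated text goes through context.t.* and therefore
# never appears as a literal in this position.
HARDCODED_TEXT = re.compile(r"""\bText\(\s*(['"])(?P<text>.*?)\1""")


def find_hardcoded_text(dart_source: str) -> list:
    """Return (line_number, literal) pairs for suspicious Text() calls."""
    hits = []
    for number, line in enumerate(dart_source.splitlines(), start=1):
        for match in HARDCODED_TEXT.finditer(line):
            hits.append((number, match.group("text")))
    return hits


# Hypothetical Dart snippet; context.t.auth.login_title is an invented key.
sample = """
Text(context.t.auth.login_title),
Text('Log in'),
const Text("Welcome back"),
"""
for line_number, literal in find_hardcoded_text(sample):
    print(f"line {line_number}: hardcoded string {literal!r}")
```

A scan like this misses literals assigned to variables first and may flag intentional non-translatable text, so its hits are candidates for review rather than confirmed issues.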

Health Score Calculation#

Score Computation#

Total = Sum(axis score × axis weight) - (critical issue count × 10)

Axis score = Sum(item points earned) / 100 × 100    (the item maxima on each axis sum to 100)
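
The computation can be sketched in a few lines of Python. The weights and the 10-point penalty are taken from this page, while the function names and dictionary keys are illustrative assumptions. With the sample axis scores used elsewhere on this page (90, 80, 75, 85, 90) the formula yields 84.25. The page does not state how the tool rounds this value for display.

```python
# Axis weights from the 5-axis evaluation system above.
WEIGHTS = {
    "functional": 0.30,
    "visual": 0.20,
    "accessibility": 0.20,
    "performance": 0.15,
    "i18n": 0.15,
}
CRITICAL_PENALTY = 10      # points deducted per critical issue
MAX_POINTS_PER_AXIS = 100  # item maxima on every axis sum to 100


def axis_score(earned_points: list) -> float:
    """Scale the earned item points of one axis to a 0-100 score."""
    return 100 * sum(earned_points) / MAX_POINTS_PER_AXIS


def total_score(axis_scores: dict, critical_issues: int = 0) -> float:
    """Weighted sum of axis scores minus the critical-issue penalty."""
    weighted = sum(axis_scores[axis] * weight for axis, weight in WEIGHTS.items())
    return weighted - critical_issues * CRITICAL_PENALTY


scores = {"functional": 90, "visual": 80, "accessibility": 75, "performance": 85, "i18n": 90}
print(round(total_score(scores), 2))                     # 84.25
print(round(total_score(scores, critical_issues=1), 2))  # 74.25
print(axis_score([30, 20, 25, 15]))                      # 90.0
```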

Grade Criteria#

| Grade | Score Range | Meaning |
|-------|-------------|---------|
| A | 90-100 | Excellent - production ready |
| B | 80-89 | Good - minor improvements needed |
| C | 70-79 | Average - improvements recommended |
| D | 60-69 | Insufficient - improvements required |
| F | 0-59 | Poor - must fix before PR creation |
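
The grade table translates into a simple threshold lookup. This Python sketch assumes that fractional totals fall into the band of their lower bound (for example, 89.5 is still a B) and that totals pushed below zero by penalties are graded F. Neither case is specified by the table, and the function names are illustrative.

```python
def grade_for(total: float) -> str:
    """Map a total Health Score to a letter grade using the table above."""
    # Assumption: only the lower bound of each band matters, so 89.5 -> "B".
    if total >= 90:
        return "A"
    if total >= 80:
        return "B"
    if total >= 70:
        return "C"
    if total >= 60:
        return "D"
    return "F"


def blocks_pr(total: float) -> bool:
    """Grade F must be fixed before a PR is created."""
    return grade_for(total) == "F"


for value in (92, 84.25, 59):
    print(value, grade_for(value), blocks_pr(value))
```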

Baseline Storage and Regression Comparison#

Baseline Storage#

Each QA run automatically saves its result to .qa-baseline/{feature-name}.json:

{
  "feature": "auth",
  "date": "2026-03-13T10:30:00Z",
  "grade": "B",
  "totalScore": 85,
  "scores": {
    "functional": { "score": 90, "weight": 0.30, "weighted": 27.0 },
    "visual": { "score": 80, "weight": 0.20, "weighted": 16.0 },
    "accessibility": { "score": 75, "weight": 0.20, "weighted": 15.0 },
    "performance": { "score": 85, "weight": 0.15, "weighted": 12.75 },
    "i18n": { "score": 90, "weight": 0.15, "weighted": 13.5 }
  },
  "criticalIssues": 0,
  "penalty": 0,
  "issues": [
    {
      "axis": "accessibility",
      "severity": "warning",
      "message": "semantic label missing",
      "file": "lib/src/presentation/widget/author_card.dart",
      "line": 42
    }
  ]
}
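
Reading and writing such a baseline file needs little more than the standard library. The Python sketch below is an assumption about how storage could work, not the skill's code: `baseline_path`, `save_baseline`, and `load_baseline` are hypothetical helpers, and a temporary directory stands in for the project root.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def baseline_path(feature: str, root: Path) -> Path:
    """Location of a feature's baseline under the .qa-baseline directory."""
    return root / ".qa-baseline" / f"{feature}.json"


def save_baseline(record: dict, root: Path) -> Path:
    """Write the record as JSON, creating .qa-baseline if needed."""
    path = baseline_path(record["feature"], root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def load_baseline(feature: str, root: Path) -> Optional[dict]:
    """Return the stored baseline, or None if the feature has none yet."""
    path = baseline_path(feature, root)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


record = {
    "feature": "auth",
    "date": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    "scores": {"accessibility": {"score": 75, "weight": 0.20, "weighted": 15.0}},
    "criticalIssues": 0,
    "penalty": 0,
    "issues": [],
}
with tempfile.TemporaryDirectory() as tmp:  # stand-in for the project root
    saved = save_baseline(record, Path(tmp))
    loaded = load_baseline("auth", Path(tmp))
    print(saved.name, loaded["scores"]["accessibility"]["score"])
```

Returning `None` for a missing file lets a first run with `--regression` be distinguished from a run that has a baseline to compare against.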

Regression Comparison#

With the --regression option, the current scores are compared against the previously saved baseline:

## Regression Report: auth

| Axis | Previous | Current | Change |
|------|----------|---------|--------|
| Functional Completeness | 90 | 92 | ✅ +2 |
| UI/UX | 80 | 80 | ➡️ 0 |
| Accessibility | 75 | 70 | ⚠️ -5 |
| Performance | 85 | 88 | ✅ +3 |
| i18n | 90 | 90 | ➡️ 0 |
| **Total** | **85 (B)** | **84 (B)** | ⚠️ -1 |

### Regression Items (score decrease)
- ♿ Accessibility -5 points: semanticLabel missing on newly added ProfileImage
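
The per-axis comparison in the report amounts to subtracting the baseline score from the current score and flagging negative deltas. A minimal Python sketch, using the axis scores from the sample report and hypothetical function names:

```python
def compare_axes(previous: dict, current: dict) -> list:
    """Return (axis, previous, current, delta) rows in the order of `current`."""
    return [
        (axis, previous[axis], score, score - previous[axis])
        for axis, score in current.items()
    ]


def regressions(rows: list) -> list:
    """Rows whose score dropped compared with the baseline."""
    return [row for row in rows if row[3] < 0]


previous = {"functional": 90, "visual": 80, "accessibility": 75, "performance": 85, "i18n": 90}
current = {"functional": 92, "visual": 80, "accessibility": 70, "performance": 88, "i18n": 90}

rows = compare_axes(previous, current)
for axis, before, after, delta in rows:
    print(f"| {axis} | {before} | {after} | {delta:+} |")
print([row[0] for row in regressions(rows)])  # ['accessibility']
```

This sketch assumes every current axis also exists in the baseline. A run restricted with `--focus` would need a rule for axes missing on one side, which this page does not define.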

Output Format#

╔════════════════════════════════════════════════════════════════╗
║  Feature QA Health Score: auth                                 ║
╠════════════════════════════════════════════════════════════════╣
║                                                                ║
║  Grade: B (85/100)                                             ║
║                                                                ║
║  🧪 Functional     ████████████████████░░  90/100 (×0.30 = 27.0) ║
║  🎨 UI/UX          ████████████████░░░░░░  80/100 (×0.20 = 16.0) ║
║  ♿ Accessibility   ███████████████░░░░░░░  75/100 (×0.20 = 15.0) ║
║  ⚡ Performance     ████████████████░░░░░░  85/100 (×0.15 = 12.8) ║
║  🌐 i18n           ████████████████████░░  90/100 (×0.15 = 13.5) ║
║                                                                ║
║  Critical: 0 items | Penalty: 0 points                         ║
║  Baseline: Saved (.qa-baseline/auth.json)                      ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝

### Issues Found (5 items)

| # | Axis | Severity | Issue | File |
|---|------|----------|-------|------|
| 1 | ♿ Accessibility | ⚠️ | semantic label missing | author_card.dart:42 |
| 2 | ⚡ Performance | ⚠️ | cacheWidth not specified | author_image.dart:15 |
| 3 | 🌐 i18n | 💡 | hardcoded string | author_list_page.dart:28 |
| 4 | 🎨 UI/UX | 💡 | SizedBox used instead of Gap | author_form.dart:55 |
| 5 | 🧪 Functional | 💡 | EmptyState not handled | author_list_bloc.dart:30 |

Parallel Evaluation (--parallel, Agent Teams)#

Prerequisites#

  • CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable set
  • If Agent Teams is unavailable, the run automatically falls back to sequential evaluation

Parallel Execution Flow#

When the --parallel option is used:

1. Check Agent Teams availability
   helpers.md#Check-Agent-Teams-Available
   → Falls back to sequential evaluation if unavailable

2. 3-team parallel evaluation (3 Teammates)

   ├─ Teammate 1: Functional Completeness (30%)
   │   ├─ Acceptance Criteria met
   │   ├─ Edge case handling
   │   ├─ Error handling
   │   └─ Test coverage
   │
   ├─ Teammate 2: UI/UX (20%) + Accessibility (20%)
   │   ├─ Correct CoUI component usage
   │   ├─ Layout/color/typography consistency
   │   ├─ Semantic Labels
   │   ├─ Touch target size
   │   └─ Color contrast
   │
   └─ Teammate 3: Performance (15%) + i18n (15%)
       ├─ const widgets / BlocBuilder optimization
       ├─ Image optimization / resource cleanup
       ├─ Translation key usage
       └─ Hardcoded string detection

3. Lead merges results
   ├─ helpers.md#Collect-Team-Results
   ├─ Sum each axis score and apply weights
   ├─ Critical issue count × 10 point penalty
   └─ Final grade calculation (A-F)

Fallback (Sequential Evaluation)#

If Agent Teams is unavailable, the five axes are evaluated sequentially, exactly as in a run without --parallel.
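
The dispatch logic (check availability, fan out to three teammates, merge, otherwise run sequentially) can be sketched as follows. This is a Python analogy only: threads stand in for Agent Teams teammates, `evaluate_axis` is a hypothetical placeholder that returns a dummy score, and only the environment variable name and the axis grouping come from this page.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Axis grouping per teammate, following the parallel execution flow above.
TEAMS = [
    ["functional"],
    ["visual", "accessibility"],
    ["performance", "i18n"],
]


def agent_teams_available() -> bool:
    """Prerequisite check: the experimental flag must be set to 1."""
    return os.environ.get("CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS") == "1"


def evaluate_axis(axis: str) -> int:
    """Hypothetical placeholder: a real evaluator would inspect the feature code."""
    return 0


def evaluate_team(axes: list) -> dict:
    """Evaluate every axis assigned to one teammate."""
    return {axis: evaluate_axis(axis) for axis in axes}


def run_evaluation(parallel: bool) -> dict:
    results = {}
    if parallel and agent_teams_available():
        # One worker per teammate; the lead merges the partial results.
        with ThreadPoolExecutor(max_workers=len(TEAMS)) as pool:
            for partial in pool.map(evaluate_team, TEAMS):
                results.update(partial)
    else:
        # Fallback: evaluate all five axes one after another.
        for axes in TEAMS:
            results.update(evaluate_team(axes))
    return results


print(sorted(run_evaluation(parallel=True)))
```

Both branches return the same set of five axis scores, so the weighting, penalty, and grading steps that follow do not need to know which mode was used.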


Automation#

# Run tests (for coverage verification)
melos run test:with-html-coverage -- --scope="*auth*"

# Static analysis
melos run analyze -- --scope="*auth*"

# Code review (detailed)
/code-review feature/auth/ --gate-mode

  • /code-review - 8-category code review
  • /checklist:feature-complete - Feature complete checklist
  • /dev:run - Full development cycle (can be auto-invoked at Step 8.7)