I tried out Spec Kit, which GitHub has just released, and summarized my impressions. This writeup assumes it is used together with Claude Code.
Spec Kit is a specification-driven development tool published by GitHub. It is comparable to AWS's Kiro, but instead of a dedicated editor it combines a CLI with a coding agent to go from spec generation all the way to code generation.
It works in three phases: spec generation, implementation planning, and task breakdown.
Installation
Install and run it with uv, the Python package manager:
uvx --from git+https://github.com/github/spec-kit.git specify init
An interactive setup starts, letting you choose among Claude Code, GitHub Copilot, and Gemini CLI.
To skip the interactive setup, specify the agent directly:
uvx --from git+https://github.com/github/spec-kit.git specify init --ai claude
Once initialization completes, a directory named after the project is created with the following files inside:
hoge
├── memory
│   ├── constitution_update_checklist.md
│   └── constitution.md
├── scripts
│   ├── check-task-prerequisites.sh
│   ├── common.sh
│   ├── create-new-feature.sh
│   ├── get-feature-paths.sh
│   ├── setup-plan.sh
│   └── update-agent-context.sh
└── templates
    ├── agent-file-template.md
    ├── plan-template.md
    ├── spec-template.md
    └── tasks-template.md
Generating the spec
The workflow from spec to task list uses the following three commands.
1. /specify
The first command, /specify, converts a vague natural-language request ("I want to build an app like this") into a structured specification. Even a terse request (e.g. "I want a simple ToDo app") yields functional requirements, user stories, edge cases, and so on, and ambiguous points are flagged with [NEEDS CLARIFICATION] markers. Requirements the tool cannot decide on its own, such as "Is task editing needed?" or "Should it work offline?", are made explicit so you can pin them down later. The more detailed your request, the clearer and more implementable the resulting spec, so writing as concretely as possible is recommended.
Running

/specify "簡単なToDoアプリを作りたい"

executes scripts/create-new-feature.sh --json "簡単なToDoアプリを作りたい", which generates the spec in the specs directory.
├── memory
│   ├── constitution_update_checklist.md
│   └── constitution.md
├── scripts
│   ├── check-task-prerequisites.sh
│   ├── common.sh
│   ├── create-new-feature.sh
│   ├── get-feature-paths.sh
│   ├── setup-plan.sh
│   └── update-agent-context.sh
├── specs
│   └── 001-todo
│       └── spec.md
A git branch such as 001-todo is also created at the same time.
The spec includes an execution flow, guidelines, user scenarios, and more. Below is an example of the generated spec.md.
spec.md
Feature Branch: 001-todo
Created: 2025-09-03
Status: Draft
Input: User description: “簡単なToDoアプリを作りたい”
Execution Flow (main)
1. Parse user description from Input
→ If empty: ERROR "No feature description provided"
2. Extract key concepts from description
→ Identify: actors, actions, data, constraints
3. For each unclear aspect:
→ Mark with [NEEDS CLARIFICATION: specific question]
4. Fill User Scenarios & Testing section
→ If no clear user flow: ERROR "Cannot determine user scenarios"
5. Generate Functional Requirements
→ Each requirement must be testable
→ Mark ambiguous requirements
6. Identify Key Entities (if data involved)
7. Run Review Checklist
→ If any [NEEDS CLARIFICATION]: WARN "Spec has uncertainties"
→ If implementation details found: ERROR "Remove tech details"
8. Return: SUCCESS (spec ready for planning)
⚡ Quick Guidelines
- ✅ Focus on WHAT users need and WHY
- ❌ Avoid HOW to implement (no tech stack, APIs, code structure)
- 👥 Written for business stakeholders, not developers
Section Requirements
- Mandatory sections: Must be completed for every feature
- Optional sections: Include only when relevant to the feature
- When a section doesn’t apply, remove it entirely (don’t leave as “N/A”)
For AI Generation
When creating this spec from a user prompt:
- Mark all ambiguities: Use [NEEDS CLARIFICATION: specific question] for any assumption you’d need to make
- Don’t guess: If the prompt doesn’t specify something (e.g., “login system” without auth method), mark it
- Think like a tester: Every vague requirement should fail the “testable and unambiguous” checklist item
Common underspecified areas:
- User types and permissions
- Data retention/deletion policies
- Performance targets and scale
- Error handling behaviors
- Integration requirements
- Security/compliance needs
User Scenarios & Testing (mandatory)
Primary User Story
A user wants to manage their personal tasks by creating a list where they can add new tasks, mark completed tasks as done, and remove tasks they no longer need. The application should provide a simple interface for basic task management without requiring complex features like sharing, categories, or due dates.
Acceptance Scenarios
- Given no existing tasks, When user adds a new task with text “Buy groceries”, Then the task appears in the task list as incomplete
- Given a task exists in the list, When user marks it as complete, Then the task status changes to completed and is visually distinguished from incomplete tasks
- Given a completed task exists, When user clicks to delete it, Then the task is removed from the list entirely
- Given an incomplete task exists, When user clicks to delete it, Then the task is removed from the list entirely
- Given multiple tasks exist, When user views the list, Then all tasks are displayed with their current status
Edge Cases
- What happens when user tries to add a task with empty text?
- How does the system behave when there are no tasks to display?
- What happens when user tries to mark an already completed task as complete again?
Requirements (mandatory)
Functional Requirements
- FR-001: System MUST allow users to add new tasks with descriptive text
- FR-002: System MUST display all tasks in a list format showing task text and completion status
- FR-003: System MUST allow users to mark incomplete tasks as completed
- FR-004: System MUST allow users to delete any task (completed or incomplete)
- FR-005: System MUST visually distinguish between completed and incomplete tasks
- FR-006: System MUST prevent adding tasks with empty or whitespace-only text
- FR-007: System MUST persist tasks so they remain available when user returns to the application
- FR-008: System MUST provide immediate visual feedback when tasks are added, completed, or deleted
[NEEDS CLARIFICATION: Should completed tasks be automatically hidden after a certain time period?]
[NEEDS CLARIFICATION: Is there a maximum number of tasks that should be supported?]
[NEEDS CLARIFICATION: Should tasks be editable after creation?]
[NEEDS CLARIFICATION: Should the application work offline or require internet connection?]
Key Entities (include if feature involves data)
- Task: Represents a single item to be completed, containing descriptive text and completion status (completed/incomplete)
- Task List: Collection of all tasks, maintaining order and providing operations for adding, updating, and removing tasks
Review & Acceptance Checklist
GATE: Automated checks run during main() execution
Content Quality
Requirement Completeness
Execution Status
Updated by main() during processing
If you want to tweak what was generated, just describe the change in a prompt and it will be incorporated. Editing the files by hand works just as well.
2. /plan
The next command, /plan, turns the "what to build" defined in the spec into "how to build it". A Constitution Check is run against the project's "constitution" (constitution.md), verifying that principles such as simplicity, test-driven development (TDD), and observability are respected; any violations must be justified, and the justification is recorded.
Running /plan executes scripts/setup-plan.sh --json and writes the implementation plan to plan.md. /plan also generates research.md, which records how the technical options for implementing the specified features were surveyed and compared, and how the tech stack and implementation approach were decided.
When /plan completes, the following new files exist: plan.md, research.md, data-model.md, quickstart.md, and the contracts directory under specs/001-todo, plus CLAUDE.md at the repository root.
.
├── CLAUDE.md
├── memory
│   ├── constitution_update_checklist.md
│   └── constitution.md
├── scripts
│   ├── check-task-prerequisites.sh
│   ├── common.sh
│   ├── create-new-feature.sh
│   ├── get-feature-paths.sh
│   ├── setup-plan.sh
│   └── update-agent-context.sh
├── specs
│   └── 001-todo
│       ├── contracts
│       │   ├── dom-interface.md
│       │   └── task-api.md
│       ├── data-model.md
│       ├── plan.md
│       ├── quickstart.md
│       ├── research.md
│       └── spec.md
Below is an example of the generated plan.md.
plan.md
Branch: 001-todo | Date: 2025-09-03 | Spec: spec.md
Input: Feature specification from /specs/001-todo/spec.md
Execution Flow (/plan command scope)
1. Load feature spec from Input path
→ If not found: ERROR "No feature spec at {path}"
2. Fill Technical Context (scan for NEEDS CLARIFICATION)
→ Detect Project Type from context (web=frontend+backend, mobile=app+api)
→ Set Structure Decision based on project type
3. Evaluate Constitution Check section below
→ If violations exist: Document in Complexity Tracking
→ If no justification possible: ERROR "Simplify approach first"
→ Update Progress Tracking: Initial Constitution Check
4. Execute Phase 0 → research.md
→ If NEEDS CLARIFICATION remain: ERROR "Resolve unknowns"
5. Execute Phase 1 → contracts, data-model.md, quickstart.md, agent-specific template file (e.g., `CLAUDE.md` for Claude Code, `.github/copilot-instructions.md` for GitHub Copilot, or `GEMINI.md` for Gemini CLI).
6. Re-evaluate Constitution Check section
→ If new violations: Refactor design, return to Phase 1
→ Update Progress Tracking: Post-Design Constitution Check
7. Plan Phase 2 → Describe task generation approach (DO NOT create tasks.md)
8. STOP - Ready for /tasks command
IMPORTANT: The /plan command STOPS at step 7. Phases 2-4 are executed by other commands:
- Phase 2: /tasks command creates tasks.md
- Phase 3-4: Implementation execution (manual or via tools)
Summary
Primary requirement: Create a simple ToDo application where users can add, complete, delete, and view tasks with persistent storage. Technical approach: Single-page web application with local storage persistence, focusing on core CRUD operations and immediate user feedback.
Technical Context
Language/Version: JavaScript ES6+ with HTML5/CSS3
Primary Dependencies: No external frameworks (vanilla JS for simplicity)
Storage: localStorage (browser local storage for persistence)
Testing: Browser-based testing with simple test framework or manual testing
Target Platform: Modern web browsers (Chrome, Firefox, Safari, Edge)
Project Type: single (simple web page, no backend required)
Performance Goals: Instant response for all user interactions
Constraints: Work offline, no server dependency, minimal resource usage
Scale/Scope: Support for hundreds of tasks per user, single user focus
Constitution Check
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
Simplicity:
- Projects: 1 (single web page)
- Using framework directly? Yes (vanilla HTML/JS/CSS, no wrappers)
- Single data model? Yes (Task entity only)
- Avoiding patterns? Yes (no unnecessary abstractions)
Architecture:
- EVERY feature as library? N/A (single HTML page application)
- Libraries listed: N/A (vanilla implementation)
- CLI per library: N/A (web application)
- Library docs: N/A (single file application)
Testing (NON-NEGOTIABLE):
- RED-GREEN-Refactor cycle enforced? Yes (will write tests first)
- Git commits show tests before implementation? Yes (will be enforced)
- Order: Contract→Integration→E2E→Unit strictly followed? Modified for frontend (E2E→Integration→Unit)
- Real dependencies used? Yes (actual browser localStorage)
- Integration tests for: new libraries, contract changes, shared schemas? Yes (localStorage integration)
- FORBIDDEN: Implementation before test, skipping RED phase
Observability:
- Structured logging included? Basic console logging for errors
- Frontend logs → backend? N/A (no backend)
- Error context sufficient? Yes (user-facing error messages)
Versioning:
- Version number assigned? 1.0.0
- BUILD increments on every change? Yes
- Breaking changes handled? Yes (localStorage schema versioning if needed)
Project Structure
Documentation (this feature)
specs/001-todo/
├── plan.md # This file (/plan command output)
├── research.md # Phase 0 output (/plan command)
├── data-model.md # Phase 1 output (/plan command)
├── quickstart.md # Phase 1 output (/plan command)
├── contracts/ # Phase 1 output (/plan command)
└── tasks.md # Phase 2 output (/tasks command - NOT created by /plan)
Source Code (repository root)
# Option 1: Single project (DEFAULT)
src/
├── index.html
├── style.css
├── script.js
└── lib/
└── todo.js
tests/
├── integration/
│ └── todo-storage.test.js
└── e2e/
└── todo-app.test.js
Structure Decision: Option 1 (single project) – simple web page with no backend required
Phase 0: Outline & Research
1. Extract unknowns from Technical Context above:
   - No NEEDS CLARIFICATION remain – all technical decisions made based on simple requirements
2. Generate and dispatch research agents:
   - Task: "Research localStorage best practices for persistence"
   - Task: "Research vanilla JS patterns for DOM manipulation"
   - Task: "Research testing approaches for vanilla JS applications"
3. Consolidate findings in research.md using format:
   - Decision: [what was chosen]
   - Rationale: [why chosen]
   - Alternatives considered: [what else evaluated]
Output: research.md with all technical decisions documented
Phase 1: Design & Contracts
Prerequisites: research.md complete
1. Extract entities from feature spec → data-model.md:
   - Task entity with id, text, completed status
   - Task collection with CRUD operations
   - localStorage schema definition
2. Generate API contracts from functional requirements:
   - Task management interface (add, complete, delete, list)
   - localStorage contract specification
   - DOM interaction contracts
3. Generate contract tests from contracts:
   - localStorage persistence tests
   - DOM manipulation tests
   - Task state management tests
4. Extract test scenarios from user stories:
   - Each acceptance scenario → test case
   - Edge cases → error handling tests
5. Update agent file incrementally:
   - Create CLAUDE.md for Claude Code context
   - Include ToDo app technical context
   - Document key patterns and conventions
Output: data-model.md, /contracts/*, failing tests, quickstart.md, CLAUDE.md
Phase 2: Task Planning Approach
This section describes what the /tasks command will do – DO NOT execute during /plan
Task Generation Strategy:
- Load /templates/tasks-template.md as base
- Generate tasks from Phase 1 design docs (contracts, data model, quickstart)
- Each contract → contract test task [P]
- Each entity → model creation task [P]
- Each user story → integration test task
- Implementation tasks to make tests pass
Ordering Strategy:
- TDD order: Tests before implementation
- Dependency order: Core logic before DOM manipulation before styling
- Mark [P] for parallel execution (independent files)
Estimated Output: 15-20 numbered, ordered tasks in tasks.md
IMPORTANT: This phase is executed by the /tasks command, NOT by /plan
Phase 3+: Future Implementation
These phases are beyond the scope of the /plan command
Phase 3: Task execution (/tasks command creates tasks.md)
Phase 4: Implementation (execute tasks.md following constitutional principles)
Phase 5: Validation (run tests, execute quickstart.md, performance validation)
Complexity Tracking
Fill ONLY if Constitution Check has violations that must be justified
No violations – simple single-page application meets all constitutional requirements.
Progress Tracking
This checklist is updated during execution flow
Phase Status:
Gate Status:
Based on Constitution v2.1.1 – See /memory/constitution.md
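To make the plan concrete, here is a minimal vanilla-JS sketch of the Task entity and collection that plan.md's Phase 1 calls for: a Task with id, text, and completed status, plus CRUD operations persisted on every change. All names here are illustrative assumptions, not the generated code; the injected storage object mimics the localStorage getItem/setItem API so the sketch also runs under Node:

```javascript
// Sketch of the Task model and collection from plan.md's Phase 1 design
// (illustrative names only). The storage object is anything exposing
// getItem/setItem, so window.localStorage works in a browser.
class TaskList {
  constructor(storage, key = "todo.tasks.v1") {
    this.storage = storage;
    this.key = key;
    this.tasks = JSON.parse(storage.getItem(key) ?? "[]");
  }
  save() {
    // FR-007: persist tasks so they survive page reloads
    this.storage.setItem(this.key, JSON.stringify(this.tasks));
  }
  add(text) {
    if (!text || !text.trim()) throw new Error("empty task"); // FR-006
    const task = { id: Date.now() + Math.random(), text: text.trim(), completed: false };
    this.tasks.push(task);
    this.save();
    return task;
  }
  complete(id) {
    const task = this.tasks.find((t) => t.id === id); // FR-003
    if (task) { task.completed = true; this.save(); }
    return task;
  }
  remove(id) {
    this.tasks = this.tasks.filter((t) => t.id !== id); // FR-004
    this.save();
  }
  list() {
    return [...this.tasks]; // FR-002: expose tasks for display
  }
}

// In-memory stand-in for window.localStorage so the sketch runs under Node.
const memoryStorage = {
  data: new Map(),
  getItem(k) { return this.data.has(k) ? this.data.get(k) : null; },
  setItem(k, v) { this.data.set(k, String(v)); },
};

const list = new TaskList(memoryStorage);
const a = list.add("Buy groceries");
list.complete(a.id);
console.log(list.list().length, list.list()[0].completed); // 1 true
```

Injecting the storage object keeps the class testable outside a browser while still satisfying the persistence requirement when backed by real localStorage.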
3. /tasks
The final command, /tasks, breaks the implementation plan down into concrete units of work that a developer can start on immediately. The generated task list includes the individual tasks, effort estimates, dependencies between tasks, acceptance criteria (a definition of done), and the required test cases.
Running /tasks executes scripts/check-task-prerequisites.sh --json and generates the task list in tasks.md.
.
├── CLAUDE.md
├── memory
│   ├── constitution_update_checklist.md
│   └── constitution.md
├── scripts
│   ├── check-task-prerequisites.sh
│   ├── common.sh
│   ├── create-new-feature.sh
│   ├── get-feature-paths.sh
│   ├── setup-plan.sh
│   └── update-agent-context.sh
├── specs
│   └── 001-todo
│       ├── contracts
│       │   ├── dom-interface.md
│       │   └── task-api.md
│       ├── data-model.md
│       ├── plan.md
│       ├── quickstart.md
│       ├── research.md
│       ├── spec.md
│       └── tasks.md
Here is the completed task list.
tasks.md
Input: Design documents from /specs/001-todo/
Prerequisites: plan.md (required), research.md, data-model.md, contracts/
Execution Flow (main)
1. Load plan.md from feature directory
→ If not found: ERROR "No implementation plan found"
→ Extract: tech stack, libraries, structure
2. Load optional design documents:
→ data-model.md: Extract entities → model tasks
→ contracts/: Each file → contract test task
→ research.md: Extract decisions → setup tasks
3. Generate tasks by category:
→ Setup: project init, dependencies, linting
→ Tests: contract tests, integration tests
→ Core: models, services, CLI commands
→ Integration: DB, middleware, logging
→ Polish: unit tests, performance, docs
4. Apply task rules:
→ Different files = mark [P] for parallel
→ Same file = sequential (no [P])
→ Tests before implementation (TDD)
5. Number tasks sequentially (T001, T002...)
6. Generate dependency graph
7. Create parallel execution examples
8. Validate task completeness:
→ All contracts have tests?
→ All entities have models?
→ All endpoints implemented?
9. Return: SUCCESS (tasks ready for execution)
Format: [ID] [P?] Description
- [P]: Can run in parallel (different files, no dependencies)
- Include exact file paths in descriptions
Path Conventions
- Single project: src/, tests/ at repository root
- Paths shown below assume single project per plan.md structure
Phase 3.1: Setup
Phase 3.2: Tests First (TDD) ⚠️ MUST COMPLETE BEFORE 3.3
CRITICAL: These tests MUST be written and MUST FAIL before ANY implementation
Phase 3.3: Core Implementation (ONLY after tests are failing)
Phase 3.4: Integration
Phase 3.5: Polish
Dependencies
- Setup (T001-T003) before all other tasks
- Tests (T004-T008) before implementation (T009-T020)
- T009 (Task model) blocks T010-T011 (storage and API)
- T012 (HTML) blocks T014 (DOM manipulation)
- T013 (CSS) can run parallel with T014
- T011 (Task API) and T014 (DOM) must complete before T017 (integration)
- Implementation before polish (T021-T026)
Parallel Example
Phase 3.1 Setup (Run in sequence)
mkdir -p src/lib tests/integration tests/e2e
touch src/index.html src/style.css src/script.js src/lib/todo.js
Phase 3.2 Tests (Run in parallel – different files)
Phase 3.3 Core Implementation (Mixed parallel/sequential)
Notes
- [P] tasks = different files, no dependencies
- Verify tests fail before implementing
- Commit after each task
- Test in browser after each implementation task
- Follow TDD red-green-refactor cycle strictly
Task Generation Rules
Applied during main() execution
- From Contracts:
  - task-api.md → T004 (Task API contract test)
  - dom-interface.md → T005 (DOM interface contract test)
- From Data Model:
  - Task entity → T009 (Task model implementation)
  - TaskCollection → T010 (Storage layer)
- From User Stories (quickstart.md):
  - Add task scenario → Part of T007 (E2E test)
  - Complete task scenario → Part of T007 (E2E test)
  - Delete task scenario → Part of T007 (E2E test)
  - Input validation → T008 (Input validation test)
- Ordering:
  - Setup → Tests → Models → Services → DOM → Integration → Polish
  - localStorage operations depend on Task model
  - DOM operations depend on HTML structure
Validation Checklist
GATE: Checked by main() before returning
File Mapping
Tests (Phase 3.2):
- tests/integration/task-api.test.js – T004
- tests/integration/dom-interface.test.js – T005
- tests/integration/todo-storage.test.js – T006
- tests/e2e/todo-app.test.js – T007
- tests/integration/input-validation.test.js – T008
Implementation (Phase 3.3-3.4):
- src/lib/todo.js – T009, T010, T011, T018, T019
- src/index.html – T012, T022
- src/style.css – T013, T021
- src/script.js – T014, T015, T016, T017, T020, T023, T026
Quality (Phase 3.5):
- Cross-cutting testing and optimization tasks – T024, T025
Ready for Execution
All 26 tasks generated with clear dependencies, file paths, and parallel execution guidance. TDD methodology enforced with comprehensive test coverage before implementation.
Implementation
From here on, Spec Kit itself is no longer involved: you proceed with the implementation using your coding agent's own capabilities. For example, with a prompt like:
@specs/001-todo/plan.md に従って実装を進めて
("Proceed with the implementation following @specs/001-todo/plan.md")
As a tool that provides guardrails for vibe coding, I found it genuinely useful; it looks well suited to organizing specs and tasks that otherwise tend to scatter.
On the other hand, it still feels like a tool in an early stage of development. The README says it can be applied to existing projects as well as greenfield ones, but given that it automatically creates git branches and automatically generates and places CLAUDE.md and related files, applying it to an existing project currently looks difficult. For existing projects, how to ingest the existing specs and code would be the crux of the tool, and it is not yet clear how that would work.
The stated research goals include meeting enterprise-level requirements and supporting iterative processes, so I look forward to seeing how it evolves.