commit 0033751900799b0bb550706d5348d585adb2be4a Author: Michel Nehme Date: Fri Mar 20 20:45:23 2026 +0000 Initial commit: design spec for session retrospective tool Claude Code session efficiency analysis tool — self-assessment of action impact + token/time retrospectives via JSONL parsing. diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..e5e261e --- /dev/null +++ b/.gitignore @@ -0,0 +1,14 @@ +# Build artifacts +.stack-work/ +*.hi +*.o +*.dyn_hi +*.dyn_o + +# Reports (generated, local-only) +reports/ + +# Editor +*~ +*.swp +.vscode/ diff --git a/docs/design.md b/docs/design.md new file mode 100644 index 0000000..b829189 --- /dev/null +++ b/docs/design.md @@ -0,0 +1,325 @@ +# Session Retrospective Tool — Design Spec + +## Problem + +After a Claude Code session, there is no structured way to understand which actions were impactful vs. wasteful, how tokens were spent across tasks, or what could be improved. The built-in `/cost` command shows aggregate USD but no breakdown by task, phase, or impact. + +## Objective + +Enable Claude Code to self-assess its own efficiency during a session, then produce a retrospective report that cross-references qualitative impact judgments with quantitative token and time data. This creates a feedback loop for identifying what was high-impact and cheap vs. expensive and wasteful. + +## Architecture Overview + +``` +During Session After Session +───────────── ───────────── +/mn:start-tracking /mn:session-retrospective + │ │ + ▼ ▼ +analytics-db MCP Haskell executable + │ │ + ▼ ├─→ Reads JSONL files +claude_analytics DB │ (tokens, timestamps, tools) + ┌──────────────┐ │ + │ cc_sessions │◄─────────────────────├─→ Queries cc_session_phases + │ cc_phases │ │ (verdicts, tasks, notes) + └──────────────┘ │ + ├─→ Reads .meta.json + │ (subagent descriptions) + │ + ▼ + Markdown report + ai_files/session-reports/ +``` + +## Components + +### 1. Database: `claude_analytics` + +A standalone PostgreSQL database, separate from any project database. Accessed via a dedicated `analytics-db` MCP instance in `~/.claude/.mcp.json`. + +#### `cc_sessions` + +| Column | Type | Notes | +|--------|------|-------| +| `id` | `SERIAL PRIMARY KEY` | | +| `session_uuid` | `TEXT NOT NULL UNIQUE` | Claude Code session UUID | +| `project_slug` | `TEXT NOT NULL` | e.g. `-var-www-assets-cedrusconsult-com-ser-dev` | +| `started_at` | `TIMESTAMPTZ NOT NULL` | | +| `ended_at` | `TIMESTAMPTZ` | Filled by retrospective | +| `description` | `TEXT` | What the session was about | +| `total_output_tokens` | `INT` | Filled by retrospective | +| `total_input_tokens` | `INT` | Filled by retrospective | +| `total_cache_read_tokens` | `INT` | Filled by retrospective | +| `total_cache_create_tokens` | `INT` | Filled by retrospective | +| `claude_time_seconds` | `REAL` | Filled by retrospective | +| `tool_time_seconds` | `REAL` | Filled by retrospective | +| `human_time_seconds` | `REAL` | Filled by retrospective | + +#### `cc_session_phases` + +| Column | Type | Notes | +|--------|------|-------| +| `id` | `SERIAL PRIMARY KEY` | | +| `session_id` | `INT REFERENCES cc_sessions` | | +| `task` | `TEXT NOT NULL` | e.g. "fix-login-bug" | +| `phase` | `TEXT NOT NULL` | e.g. "diagnose-redirect" | +| `started_at` | `TIMESTAMPTZ NOT NULL` | | +| `ended_at` | `TIMESTAMPTZ` | Filled when phase ends | +| `actions_summary` | `TEXT` | e.g. "Grep, Read Server.hs, Read Auth.hs" | +| `verdict` | `TEXT NOT NULL` | One of the verdict categories | +| `subagents` | `TEXT` | Comma-separated descriptions of dispatched subagents | +| `useful_info` | `TEXT` | Facts/code locations discovered | +| `lessons_learned` | `TEXT` | Meta-observations for self-learning | +| `notes` | `TEXT` | Freeform | + +### 2. Verdict Categories + +| Verdict | Meaning | +|---------|---------| +| `high_impact` | Pivotal — unlocked the solution or saved significant downstream work | +| `moderate_impact` | Directly contributed to the outcome | +| `small_impact` | Minor direct contribution | +| `exploratory_useful` | Exploration that paid off | +| `exploratory_waste` | Exploration that led nowhere but was reasonable to try | +| `avoidable_waste` | Should have known better | + +### 3. Haskell Executable: `session-retrospective` + +**Location:** `~/.claude/tools/session-retrospective/` (standalone project, own `stack.yaml`) + +**CLI:** +```bash +stack exec session-retrospective -- +``` + +**Modules:** + +| Module | Responsibility | +|--------|---------------| +| `SessionRetrospective.Main` | Entry point — parse args, orchestrate | +| `SessionRetrospective.Jsonl` | Parse JSONL files into typed records (Aeson) | +| `SessionRetrospective.Phases` | Query `cc_session_phases` from PG, match to JSONL messages by timestamp | +| `SessionRetrospective.Subagents` | Parse subagent JSONLs + `.meta.json`, compute per-subagent stats | +| `SessionRetrospective.TimeAnalysis` | Classify timestamp gaps into claude/tool/human time | +| `SessionRetrospective.Report` | Generate markdown report from computed stats | + +**Key types:** + +```haskell +data TokenCounts = TokenCounts + { tcOutput :: !Int + , tcInput :: !Int + , tcCacheRead :: !Int + , tcCacheCreate :: !Int + } + +data Verdict + = HighImpact + | ModerateImpact + | SmallImpact + | ExploratoryUseful + | ExploratoryWaste + | AvoidableWaste + +data Phase = Phase + { phTask :: !Text + , phName :: !Text + , phStartedAt :: !UTCTime + , phEndedAt :: !(Maybe UTCTime) -- Nothing if phase still open + , phVerdict :: !Verdict + , phActionsSummary :: !Text + , phSubagents :: !(Maybe Text) + , phUsefulInfo :: !(Maybe Text) + , phLessonsLearned :: !(Maybe Text) + , phNotes :: !(Maybe Text) + -- Computed by joining with JSONL: + , phTokens :: !TokenCounts + , phClaudeTime :: !NominalDiffTime + , phToolTime :: !NominalDiffTime + , phToolTurns :: !Int + , phTextTurns :: !Int + , phToolResultSize :: !Int -- input bloat from tool results + } + +data SubagentStats = SubagentStats + { saDescription :: !Text + , saAgentType :: !Text + , saTokens :: !TokenCounts + , saTime :: !NominalDiffTime + , saPhase :: !Text + , saVerdict :: !Verdict + } +``` + +**Dependencies:** `aeson`, `postgresql-simple`, `time`, `text`, `bytestring`, `filepath`, `directory` + +**DB access:** Uses `postgresql-simple` (not Squeal). This is a standalone tool, not part of the ser platform `src/`. + +### 4. JSONL Data Available + +Located at `~/.claude/projects//.jsonl` with subagents at `/subagents/agent-.jsonl`. + +#### Streaming Deduplication (CRITICAL) + +The JSONL logs **streaming events**, not final messages. Multiple entries share the same `.requestId` and represent incremental chunks from one API call. Token counts within a `requestId` are cumulative — only the final chunk has the correct totals. + +**Deduplication rule:** Group assistant messages by `requestId`. For each group, take the token counts from the entry with the highest `output_tokens` value (the final streaming chunk). Sum across groups for session totals. + +#### Per assistant message: +- `.message.usage.output_tokens` — Claude generation tokens (includes thinking tokens) +- `.message.usage.input_tokens` — fresh input tokens +- `.message.usage.cache_read_input_tokens` — cached context +- `.message.usage.cache_creation_input_tokens` — new cache entries +- `.message.content[]` — tool calls (`type: "tool_use"`, `name`, `input`), text (`type: "text"`), and thinking (`type: "thinking"`, text present but no separate token count — cost is included in `output_tokens`) +- `.timestamp` — ISO 8601 +- `.requestId` — groups streaming chunks from a single API call + +#### Per user message: +- `.message.content` — either a string (human text) or an array containing `{"type": "tool_result", ...}` and/or `{"type": "text", ...}` entries +- `.timestamp` + +**Content format note:** Human text can appear as either `"content": "text"` (string) or `"content": [{"type": "text", "text": "..."}]` (array). The parser must handle both. + +#### Subagent matching + +`agent-.meta.json` contains `agentType` and `description`. The `description` matches the Agent tool_use call's `description` in the main session JSONL. + +**Fallback for missing `.meta.json`:** The `.meta.json` files are a recent Claude Code feature. Older sessions do not have them. When absent, the Haskell tool should: +1. Match subagent JSONL files to Agent tool_use calls in the main session by timestamp overlap +2. Extract the description from the Agent tool_use `input.description` field +3. If no match is found, report the subagent with its agent ID and mark description as "unknown" + +#### Time classification from timestamp gaps: + +| Category | Detection | +|----------|-----------| +| Claude processing | Gap before `assistant` message | +| Tool execution | Gap between `assistant(tool_use)` → `user(tool_result)` | +| Human wait | Gap before `user` message with human text content | + +Messages of type `progress`, `queue-operation`, and `file-history-snapshot` are ignored for time classification. + +#### Cross-matching phases to JSONL + +The phase log records `started_at`/`ended_at` timestamps. The retrospective selects all JSONL messages whose timestamps fall within each phase's window. Phase-level token counts are computed at report time from the matched JSONL messages — they are NOT stored in the database. + +#### Tool result size estimation + +`phToolResultSize` is computed as the total byte length of all `tool_result.content` strings within the phase's time window, divided by 4 (rough token estimate). This is a structural proxy, not exact. + +### 5. Skills (MN Plugin) + +Both skills live in `~/.claude/local-plugins/mn/skills/`. + +#### `/mn:start-tracking` + +1. Asks what the session is about (or takes description argument) +2. Determines the Claude Code session UUID by finding the most recently modified `.jsonl` file in `~/.claude/projects//`. Note: the file may not exist yet at the very start of a session — the skill should verify the file exists, and if not, record the UUID after the first assistant turn +3. Inserts a row into `cc_sessions` via `analytics-db` MCP +4. Records `started_at` +5. Reminds Claude of the phase logging protocol and verdict categories + +#### `/mn:session-retrospective` + +1. Closes any open phases (sets `ended_at`) +2. Runs `stack exec session-retrospective -- ` +3. Report saved to `ai_files/session-reports/-.md` +4. Claude reviews and presents the report + +### 6. MCP Configuration + +A dedicated MCP instance `analytics-db` in `~/.claude/.mcp.json`: + +```json +{ + "mcpServers": { + "analytics-db": { + "command": "node", + "args": ["/home/mnehme/.claude/mcp-servers/database-testing/index.js"], + "env": { + "DB_HOST": "localhost", + "DB_USER": "assets_servant", + "DB_PASSWORD": "...", + "DB_NAME": "claude_analytics" + } + } + } +} +``` + +Uses the same MCP server binary as `database-testing`, just with different database credentials. + +Reserved exclusively for session tracking — never reconfigured for other databases. + +### 6b. Schema Creation + +The Haskell executable runs `CREATE TABLE IF NOT EXISTS` on startup for both tables (same pattern as the ser platform's `sqInitSchema`). No manual schema setup required — just create the `claude_analytics` database and the tool self-initializes. + +### 6c. Report Output Location + +Reports are saved to `~/session-retrospective/reports/-.md` (inside the project directory, not inside any specific project's working tree), since this tool is project-agnostic. + +### 7. Report Format + +```markdown +# Session Retrospective: + +## Summary +| Metric | Value | +|--------|-------| +| Wall clock | ... | +| Claude time | ... | +| Tool time | ... | +| Human time | ... | +| Output tokens | ... | +| Input tokens (fresh) | ... | +| Cache read tokens | ... | +| Cache create tokens | ... | +| Phases | ... | + +## Verdict Breakdown +| Verdict | Phases | Out Tkn | In Tkn | Cache Read | Cache Create | Claude Time | % of Out | +|---------|--------|---------|--------|------------|-------------|-------------|----------| +| High impact | ... | ... | ... | ... | ... | ... | ... | +| Moderate impact | ... | ... | ... | ... | ... | ... | ... | +| Small impact | ... | ... | ... | ... | ... | ... | ... | +| Exploratory (useful) | ... | ... | ... | ... | ... | ... | ... | +| Exploratory (waste) | ... | ... | ... | ... | ... | ... | ... | +| Avoidable waste | ... | ... | ... | ... | ... | ... | ... | + +## Subagents +| Description | Type | Out Tkn | In Tkn | Cache Read | Cache Create | Time | Phase | Verdict | +|-------------|------|---------|--------|------------|-------------|------|-------|---------| +| ... | ... | ... | ... | ... | ... | ... | ... | ... | + +## By Task +| Task | Phases | Out Tkn | In Tkn | Cache Read | Cache Create | Claude Time | Verdict Mix | +|------|--------|---------|--------|------------|-------------|-------------|-------------| +| ... | ... | ... | ... | ... | ... | ... | ... | + +## Phase Detail +### 1. — "" () +- Actions: ... +- Output tokens: ... | Input tokens: ... | Cache read: ... | Cache create: ... +- Tool-heavy turns: ... | Text turns: ... +- Tool result input bloat: ~...K tokens +- Claude time: ... | Tool time: ... +- **Useful info:** ... +- **Lessons learned:** ... + +## Lessons Summary +1. ... +2. ... +``` + +## Explicitly Deferred + +| Feature | Why | +|---------|-----| +| Cross-session index / accumulation | Start with standalone reports first | +| `/session-trends` meta-analysis | Depends on accumulation | +| Live mid-session cost dashboard | Not needed for core learning loop | +| Automatic phase detection from JSONL | Manual tagging is more accurate and is the point | +| Dollar cost estimation | Token pricing changes; token counts are stable | +| Thinking token breakdown | `thinking` content blocks exist in JSONL with text, but no separate `thinking_tokens` field — cost is folded into `output_tokens` and cannot be isolated |