Claude Code session efficiency analysis tool — self-assessment of action impact + token/time retrospectives via JSONL parsing.
Session Retrospective Tool — Design Spec
Problem
After a Claude Code session, there is no structured way to understand which actions were impactful vs. wasteful, how tokens were spent across tasks, or what could be improved. The built-in /cost command shows aggregate USD but no breakdown by task, phase, or impact.
Objective
Enable Claude Code to self-assess its own efficiency during a session, then produce a retrospective report that cross-references qualitative impact judgments with quantitative token and time data. This creates a feedback loop for identifying what was high-impact and cheap vs. expensive and wasteful.
Architecture Overview
During Session                          After Session
──────────────                          ─────────────
/mn:start-tracking                      /mn:session-retrospective
        │                                       │
        ▼                                       ▼
analytics-db MCP                        Haskell executable
        │                                       │
        ▼                                       ├─→ Reads JSONL files
claude_analytics DB                             │   (tokens, timestamps, tools)
┌───────────────────┐                           │
│ cc_sessions       │◄──────────────────────────├─→ Queries cc_session_phases
│ cc_session_phases │                           │   (verdicts, tasks, notes)
└───────────────────┘                           │
                                                ├─→ Reads .meta.json
                                                │   (subagent descriptions)
                                                │
                                                ▼
                                        Markdown report
                                        ai_files/session-reports/
Components
1. Database: claude_analytics
A standalone PostgreSQL database, separate from any project database. Accessed via a dedicated analytics-db MCP instance in ~/.claude/.mcp.json.
cc_sessions
| Column | Type | Notes |
|---|---|---|
| id | SERIAL PRIMARY KEY | |
| session_uuid | TEXT NOT NULL UNIQUE | Claude Code session UUID |
| project_slug | TEXT NOT NULL | e.g. -var-www-assets-cedrusconsult-com-ser-dev |
| started_at | TIMESTAMPTZ NOT NULL | |
| ended_at | TIMESTAMPTZ | Filled by retrospective |
| description | TEXT | What the session was about |
| total_output_tokens | INT | Filled by retrospective |
| total_input_tokens | INT | Filled by retrospective |
| total_cache_read_tokens | INT | Filled by retrospective |
| total_cache_create_tokens | INT | Filled by retrospective |
| claude_time_seconds | REAL | Filled by retrospective |
| tool_time_seconds | REAL | Filled by retrospective |
| human_time_seconds | REAL | Filled by retrospective |
cc_session_phases
| Column | Type | Notes |
|---|---|---|
| id | SERIAL PRIMARY KEY | |
| session_id | INT REFERENCES cc_sessions | |
| task | TEXT NOT NULL | e.g. "fix-login-bug" |
| phase | TEXT NOT NULL | e.g. "diagnose-redirect" |
| started_at | TIMESTAMPTZ NOT NULL | |
| ended_at | TIMESTAMPTZ | Filled when phase ends |
| actions_summary | TEXT | e.g. "Grep, Read Server.hs, Read Auth.hs" |
| verdict | TEXT NOT NULL | One of the verdict categories |
| subagents | TEXT | Comma-separated descriptions of dispatched subagents |
| useful_info | TEXT | Facts/code locations discovered |
| lessons_learned | TEXT | Meta-observations for self-learning |
| notes | TEXT | Freeform |
2. Verdict Categories
| Verdict | Meaning |
|---|---|
| high_impact | Pivotal: unlocked the solution or saved significant downstream work |
| moderate_impact | Directly contributed to the outcome |
| small_impact | Minor direct contribution |
| exploratory_useful | Exploration that paid off |
| exploratory_waste | Exploration that led nowhere but was reasonable to try |
| avoidable_waste | Should have known better |
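Because the verdict column is plain TEXT while the executable works with a sum type, some mapping between the two is needed. The following is a minimal sketch of one way to do it; it repeats the Verdict type from the Key types section so that it stands alone, and the function names renderVerdict and parseVerdict are illustrative assumptions, not part of the spec.

```haskell
-- Sketch only: renderVerdict and parseVerdict are hypothetical names.
data Verdict
  = HighImpact
  | ModerateImpact
  | SmallImpact
  | ExploratoryUseful
  | ExploratoryWaste
  | AvoidableWaste
  deriving (Show, Eq, Ord, Enum, Bounded)

-- | The string stored in cc_session_phases.verdict.
renderVerdict :: Verdict -> String
renderVerdict v = case v of
  HighImpact        -> "high_impact"
  ModerateImpact    -> "moderate_impact"
  SmallImpact       -> "small_impact"
  ExploratoryUseful -> "exploratory_useful"
  ExploratoryWaste  -> "exploratory_waste"
  AvoidableWaste    -> "avoidable_waste"

-- | Inverse of renderVerdict; Nothing for any unrecognised string.
parseVerdict :: String -> Maybe Verdict
parseVerdict s = lookup s [ (renderVerdict v, v) | v <- [minBound .. maxBound] ]
```

Deriving the parser from the renderer keeps the two directions from drifting apart when a category is added.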
3. Haskell Executable: session-retrospective
Location: ~/.claude/tools/session-retrospective/ (standalone project, own stack.yaml)
CLI:
stack exec session-retrospective -- <session-uuid>
Modules:
| Module | Responsibility |
|---|---|
| SessionRetrospective.Main | Entry point: parse args, orchestrate |
| SessionRetrospective.Jsonl | Parse JSONL files into typed records (Aeson) |
| SessionRetrospective.Phases | Query cc_session_phases from PG, match to JSONL messages by timestamp |
| SessionRetrospective.Subagents | Parse subagent JSONLs + .meta.json, compute per-subagent stats |
| SessionRetrospective.TimeAnalysis | Classify timestamp gaps into claude/tool/human time |
| SessionRetrospective.Report | Generate markdown report from computed stats |
Key types:
data TokenCounts = TokenCounts
{ tcOutput :: !Int
, tcInput :: !Int
, tcCacheRead :: !Int
, tcCacheCreate :: !Int
}
data Verdict
= HighImpact
| ModerateImpact
| SmallImpact
| ExploratoryUseful
| ExploratoryWaste
| AvoidableWaste
data Phase = Phase
{ phTask :: !Text
, phName :: !Text
, phStartedAt :: !UTCTime
, phEndedAt :: !(Maybe UTCTime) -- Nothing if phase still open
, phVerdict :: !Verdict
, phActionsSummary :: !Text
, phSubagents :: !(Maybe Text)
, phUsefulInfo :: !(Maybe Text)
, phLessonsLearned :: !(Maybe Text)
, phNotes :: !(Maybe Text)
-- Computed by joining with JSONL:
, phTokens :: !TokenCounts
, phClaudeTime :: !NominalDiffTime
, phToolTime :: !NominalDiffTime
, phToolTurns :: !Int
, phTextTurns :: !Int
, phToolResultSize :: !Int -- input bloat from tool results
}
data SubagentStats = SubagentStats
{ saDescription :: !Text
, saAgentType :: !Text
, saTokens :: !TokenCounts
, saTime :: !NominalDiffTime
, saPhase :: !Text
, saVerdict :: !Verdict
}
Dependencies: aeson, postgresql-simple, time, text, bytestring, filepath, directory
DB access: Uses postgresql-simple (not Squeal). This is a standalone tool, not part of the ser platform src/.
4. JSONL Data Available
Located at ~/.claude/projects/<project-slug>/<session-uuid>.jsonl with subagents at <session-uuid>/subagents/agent-<id>.jsonl.
Streaming Deduplication (CRITICAL)
The JSONL logs streaming events, not final messages. Multiple entries share the same .requestId and represent incremental chunks from one API call. Token counts within a requestId are cumulative — only the final chunk has the correct totals.
Deduplication rule: Group assistant messages by requestId. For each group, take the token counts from the entry with the highest output_tokens value (the final streaming chunk). Sum across groups for session totals.
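The deduplication rule can be sketched as a fold into a map keyed by requestId. This is an illustration under assumptions: the Usage and Chunk types and the function names are hypothetical, and the chunks are assumed to have already been parsed out of the JSONL.

```haskell
import qualified Data.Map.Strict as Map

-- | Usage counters copied from one assistant JSONL entry (hypothetical type).
data Usage = Usage
  { uOutput      :: !Int
  , uInput       :: !Int
  , uCacheRead   :: !Int
  , uCacheCreate :: !Int
  } deriving (Show, Eq)

-- | One assistant entry: its requestId plus the usage it reported.
data Chunk = Chunk
  { cRequestId :: !String
  , cUsage     :: !Usage
  } deriving (Show, Eq)

-- | Per requestId, keep the usage of the chunk with the highest output_tokens.
finalUsages :: [Chunk] -> Map.Map String Usage
finalUsages = Map.fromListWith pick . map (\c -> (cRequestId c, cUsage c))
  where
    -- fromListWith passes the newer value first; ties keep the earlier chunk.
    pick new old = if uOutput new > uOutput old then new else old

-- | Session totals: sum the final usage of every request.
sessionTotals :: [Chunk] -> Usage
sessionTotals = foldr add (Usage 0 0 0 0) . Map.elems . finalUsages
  where
    add (Usage a b c d) (Usage a' b' c' d') =
      Usage (a + a') (b + b') (c + c') (d + d')
```

Summing the raw entries instead would count every intermediate streaming chunk and overstate the totals.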
Per assistant message:
- .message.usage.output_tokens: Claude generation tokens (includes thinking tokens)
- .message.usage.input_tokens: fresh input tokens
- .message.usage.cache_read_input_tokens: cached context
- .message.usage.cache_creation_input_tokens: new cache entries
- .message.content[]: tool calls (type: "tool_use", name, input), text (type: "text"), and thinking (type: "thinking"; text present but no separate token count, since its cost is included in output_tokens)
- .timestamp: ISO 8601
- .requestId: groups streaming chunks from a single API call
Per user message:
- .message.content: either a string (human text) or an array containing {"type": "tool_result", ...} and/or {"type": "text", ...} entries
- .timestamp
Content format note: Human text can appear as either "content": "text" (string) or "content": [{"type": "text", "text": "..."}] (array). The parser must handle both.
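One way to handle both content shapes with Aeson is to normalise the string form into a single text block. The sketch below assumes the aeson package listed under Dependencies; the type names ContentBlock and UserContent and the helper functions are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (FromJSON (..), Value (..), decode, withObject, (.:))
import Data.Aeson.Types (Parser)
import qualified Data.ByteString.Lazy as BL
import Data.Text (Text)

-- | One entry of a content array (hypothetical type).
data ContentBlock
  = TextBlock !Text        -- {"type": "text", "text": ...}
  | ToolResultBlock !Value -- {"type": "tool_result", ...}, kept as raw JSON
  | OtherBlock !Text       -- any other type tag, kept by name only
  deriving (Show, Eq)

instance FromJSON ContentBlock where
  parseJSON = withObject "ContentBlock" $ \o -> do
    ty <- o .: "type" :: Parser Text
    case ty of
      "text"        -> TextBlock <$> o .: "text"
      "tool_result" -> pure (ToolResultBlock (Object o))
      other         -> pure (OtherBlock other)

-- | User message content, normalised to a list of blocks.
newtype UserContent = UserContent [ContentBlock]
  deriving (Show, Eq)

instance FromJSON UserContent where
  -- A bare string is treated as a single text block.
  parseJSON v@(String _) = UserContent . pure . TextBlock <$> parseJSON v
  -- Anything else must be an array of blocks.
  parseJSON v            = UserContent <$> parseJSON v

-- | Decode the JSON value found under .message.content.
decodeUserContent :: BL.ByteString -> Maybe UserContent
decodeUserContent = decode

-- | The human-written text parts, if any.
humanTexts :: UserContent -> [Text]
humanTexts (UserContent blocks) = [ t | TextBlock t <- blocks ]
```

After normalisation, the rest of the tool only ever sees the array form, so a check such as "does this user message contain human text" becomes a single list comprehension.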
Subagent matching
agent-<id>.meta.json contains agentType and description. The description matches the Agent tool_use call's description in the main session JSONL.
Fallback for missing .meta.json: The .meta.json files are a recent Claude Code feature. Older sessions do not have them. When absent, the Haskell tool should:
- Match subagent JSONL files to Agent tool_use calls in the main session by timestamp overlap
- Extract the description from the Agent tool_use input.description field
- If no match is found, report the subagent with its agent ID and mark description as "unknown"
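The fallback steps above could be sketched as an interval-overlap search. The record shapes and names here are assumptions for illustration. In particular, taking the end of an Agent call to be the timestamp of its tool_result is a guess, and the `at` helper exists only to build example times.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)
import Data.Time (UTCTime (..), diffUTCTime, fromGregorian, secondsToDiffTime)

-- | An Agent tool_use call seen in the main session JSONL (hypothetical shape).
data AgentCall = AgentCall
  { acDescription :: String   -- input.description of the tool_use
  , acStart       :: UTCTime  -- timestamp of the assistant tool_use entry
  , acEnd         :: UTCTime  -- assumed: timestamp of the matching tool_result
  }

-- | First and last timestamps found in one agent-<id>.jsonl file.
data SubagentLog = SubagentLog
  { slAgentId :: String
  , slFirst   :: UTCTime
  , slLast    :: UTCTime
  }

-- | Pair an agent ID with the description of the best-overlapping call,
--   or with "unknown" when no call overlaps at all.
describeSubagent :: [AgentCall] -> SubagentLog -> (String, String)
describeSubagent calls sl =
  case [ (ov, acDescription c) | c <- calls, let ov = overlap c, ov >= 0 ] of
    []    -> (slAgentId sl, "unknown")
    cands -> (slAgentId sl, snd (maximumBy (comparing fst) cands))
  where
    -- Negative when the two intervals do not intersect.
    overlap c =
      min (acEnd c) (slLast sl) `diffUTCTime` max (acStart c) (slFirst sl)

-- | Example helper: a time N seconds after an arbitrary midnight.
at :: Integer -> UTCTime
at s = UTCTime (fromGregorian 2024 1 1) (secondsToDiffTime s)
```

Picking the largest overlap rather than the first intersecting call is a design choice that helps when several subagents run back to back.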
Time classification from timestamp gaps:
| Category | Detection |
|---|---|
| Claude processing | Gap before assistant message |
| Tool execution | Gap between assistant(tool_use) → user(tool_result) |
| Human wait | Gap before user message with human text content |
Messages of type progress, queue-operation, and file-history-snapshot are ignored for time classification.
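The classification table can be read as a rule applied to each pair of consecutive events once the ignored message types are dropped. The sketch below is one possible reading, with hypothetical type names. Gaps that match no row of the table (for example, a tool_result that does not directly follow a tool_use) are left uncounted here, which is an assumption rather than something the spec states.

```haskell
import Data.Time (NominalDiffTime, UTCTime (..), diffUTCTime, fromGregorian, secondsToDiffTime)

-- | Simplified event kinds; Ignored covers progress, queue-operation
--   and file-history-snapshot entries.
data Kind = AssistantText | AssistantToolUse | UserToolResult | UserHuman | Ignored
  deriving (Show, Eq)

data Event = Event { evKind :: Kind, evTime :: UTCTime }

data TimeSplit = TimeSplit
  { tsClaude :: NominalDiffTime
  , tsTool   :: NominalDiffTime
  , tsHuman  :: NominalDiffTime
  } deriving (Show, Eq)

-- | Attribute each gap between consecutive kept events to one category.
classifyGaps :: [Event] -> TimeSplit
classifyGaps evs = foldl step (TimeSplit 0 0 0) (zip kept (drop 1 kept))
  where
    kept = filter ((/= Ignored) . evKind) evs
    step acc (prev, cur) =
      let gap = evTime cur `diffUTCTime` evTime prev
      in case (evKind prev, evKind cur) of
           (AssistantToolUse, UserToolResult) -> acc { tsTool = tsTool acc + gap }
           (_, UserHuman)                     -> acc { tsHuman = tsHuman acc + gap }
           (_, AssistantText)                 -> acc { tsClaude = tsClaude acc + gap }
           (_, AssistantToolUse)              -> acc { tsClaude = tsClaude acc + gap }
           _                                  -> acc -- unclassified gap

-- | Example helper: a time N seconds after an arbitrary midnight.
at :: Integer -> UTCTime
at s = UTCTime (fromGregorian 2024 1 1) (secondsToDiffTime s)

-- | A tiny timeline: prompt, tool call, progress noise, result, reply, next prompt.
exampleTimeline :: [Event]
exampleTimeline =
  [ Event UserHuman (at 0)
  , Event AssistantToolUse (at 5)
  , Event Ignored (at 7)
  , Event UserToolResult (at 9)
  , Event AssistantText (at 12)
  , Event UserHuman (at 42)
  ]
```

For exampleTimeline, the sketch attributes 8 seconds to Claude, 4 to tools and 30 to the human.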
Cross-matching phases to JSONL
The phase log records started_at/ended_at timestamps. The retrospective selects all JSONL messages whose timestamps fall within each phase's window. Phase-level token counts are computed at report time from the matched JSONL messages — they are NOT stored in the database.
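A minimal sketch of the window selection follows, assuming messages have already been deduplicated by requestId. The half-open interval (start inclusive, end exclusive) and the treatment of a still-open phase as unbounded are assumptions made here to avoid counting a boundary message twice. The type and function names are hypothetical.

```haskell
import Data.Time (UTCTime (..), fromGregorian, secondsToDiffTime)

-- | A deduplicated assistant message with its output token count.
data Msg = Msg { mTime :: UTCTime, mOutputTokens :: Int }

-- | The timestamps recorded for one phase in cc_session_phases.
data PhaseWindow = PhaseWindow
  { pwName  :: String
  , pwStart :: UTCTime
  , pwEnd   :: Maybe UTCTime -- Nothing if the phase is still open
  }

-- | True when the message falls in [start, end); open phases have no upper bound.
inWindow :: PhaseWindow -> Msg -> Bool
inWindow pw m = mTime m >= pwStart pw && maybe True (mTime m <) (pwEnd pw)

-- | Output tokens attributed to a phase, computed at report time.
phaseOutputTokens :: [Msg] -> PhaseWindow -> Int
phaseOutputTokens msgs pw = sum [ mOutputTokens m | m <- msgs, inWindow pw m ]

-- | Example helper: a time N seconds after an arbitrary midnight.
at :: Integer -> UTCTime
at s = UTCTime (fromGregorian 2024 1 1) (secondsToDiffTime s)
```

The same filter can feed the other per-phase counters (input, cache read, cache create) by swapping the summed field.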
Tool result size estimation
phToolResultSize is computed as the total byte length of all tool_result.content strings within the phase's time window, divided by 4 (rough token estimate). This is a structural proxy, not exact.
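As a small illustration of the estimate, the function below measures UTF-8 byte length rather than character count and then divides by 4. The function name is hypothetical, and the tool_result contents are assumed to have been extracted as Text already.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

-- | Rough token estimate: total UTF-8 bytes of all tool_result contents,
--   divided by 4 and rounded down.
estimateToolResultTokens :: [T.Text] -> Int
estimateToolResultTokens contents = totalBytes `div` 4
  where
    totalBytes = sum (map (BS.length . encodeUtf8) contents)
```

Dividing once after summing, rather than per result, avoids losing up to three bytes of remainder on every individual tool result.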
5. Skills (MN Plugin)
Both skills live in ~/.claude/local-plugins/mn/skills/.
/mn:start-tracking
- Asks what the session is about (or takes description argument)
- Determines the Claude Code session UUID by finding the most recently modified .jsonl file in ~/.claude/projects/<project-slug>/. Note: the file may not exist yet at the very start of a session, so the skill should verify that the file exists and, if not, record the UUID after the first assistant turn
- Inserts a row into cc_sessions via analytics-db MCP
- Records started_at
- Reminds Claude of the phase logging protocol and verdict categories
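The UUID lookup in the second step could look roughly like the sketch below, written in Haskell for consistency with the rest of the tool even though the skill itself may do this differently. The function names are hypothetical. A missing directory or an absence of .jsonl files yields Nothing, which corresponds to the "record the UUID after the first assistant turn" case.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))
import System.Directory (doesDirectoryExist, getModificationTime, listDirectory)
import System.FilePath (takeBaseName, takeExtension, (</>))

-- | Pure core: among (file name, modification stamp) pairs, return the base
--   name of the most recently modified .jsonl file.
newestUuid :: Ord t => [(FilePath, t)] -> Maybe String
newestUuid stamped =
  case sortOn (Down . snd) [ e | e@(n, _) <- stamped, takeExtension n == ".jsonl" ] of
    []         -> Nothing
    (n, _) : _ -> Just (takeBaseName n)

-- | Look up the newest session UUID in a project directory, e.g.
--   ~/.claude/projects/<project-slug>/ after the caller has expanded "~".
latestSessionUuid :: FilePath -> IO (Maybe String)
latestSessionUuid dir = do
  exists <- doesDirectoryExist dir
  if not exists
    then pure Nothing
    else do
      names   <- listDirectory dir
      stamped <- mapM (\n -> (,) n <$> getModificationTime (dir </> n)) names
      pure (newestUuid stamped)
```

Note that "most recently modified" is a heuristic: if two sessions are active in the same project at once, it may pick the wrong file.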
/mn:session-retrospective
- Closes any open phases (sets ended_at)
- Runs stack exec session-retrospective -- <session-uuid>
- Report saved to ai_files/session-reports/<date>-<description>.md
- Claude reviews and presents the report
6. MCP Configuration
A dedicated MCP instance analytics-db in ~/.claude/.mcp.json:
{
"mcpServers": {
"analytics-db": {
"command": "node",
"args": ["/home/mnehme/.claude/mcp-servers/database-testing/index.js"],
"env": {
"DB_HOST": "localhost",
"DB_USER": "assets_servant",
"DB_PASSWORD": "...",
"DB_NAME": "claude_analytics"
}
}
}
}
Uses the same MCP server binary as database-testing, just with different database credentials.
Reserved exclusively for session tracking — never reconfigured for other databases.
6b. Schema Creation
The Haskell executable runs CREATE TABLE IF NOT EXISTS on startup for both tables (same pattern as the ser platform's sqInitSchema). No manual schema setup required — just create the claude_analytics database and the tool self-initializes.
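For illustration, the startup DDL implied by the two table definitions might look like the statements below, held as plain strings. This is a sketch derived from the column tables in this document, not the actual schema code. With postgresql-simple, each statement could be run with execute_ (converting the string with fromString, since Query is an IsString instance), but that wiring is omitted here so the block has no database dependency.

```haskell
-- | Idempotent DDL, in dependency order (cc_sessions first).
schemaStatements :: [String]
schemaStatements =
  [ unlines
      [ "CREATE TABLE IF NOT EXISTS cc_sessions ("
      , "  id SERIAL PRIMARY KEY,"
      , "  session_uuid TEXT NOT NULL UNIQUE,"
      , "  project_slug TEXT NOT NULL,"
      , "  started_at TIMESTAMPTZ NOT NULL,"
      , "  ended_at TIMESTAMPTZ,"
      , "  description TEXT,"
      , "  total_output_tokens INT,"
      , "  total_input_tokens INT,"
      , "  total_cache_read_tokens INT,"
      , "  total_cache_create_tokens INT,"
      , "  claude_time_seconds REAL,"
      , "  tool_time_seconds REAL,"
      , "  human_time_seconds REAL"
      , ")"
      ]
  , unlines
      [ "CREATE TABLE IF NOT EXISTS cc_session_phases ("
      , "  id SERIAL PRIMARY KEY,"
      , "  session_id INT REFERENCES cc_sessions,"
      , "  task TEXT NOT NULL,"
      , "  phase TEXT NOT NULL,"
      , "  started_at TIMESTAMPTZ NOT NULL,"
      , "  ended_at TIMESTAMPTZ,"
      , "  actions_summary TEXT,"
      , "  verdict TEXT NOT NULL,"
      , "  subagents TEXT,"
      , "  useful_info TEXT,"
      , "  lessons_learned TEXT,"
      , "  notes TEXT"
      , ")"
      ]
  ]
```

Because both statements use IF NOT EXISTS, running them on every startup is harmless once the tables exist.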
6c. Report Output Location
Reports are saved to ~/session-retrospective/reports/<date>-<description>.md, a location that belongs to the tool itself rather than to any specific project's working tree, since this tool is project-agnostic.
7. Report Format
# Session Retrospective: <date> — <description>
## Summary
| Metric | Value |
|--------|-------|
| Wall clock | ... |
| Claude time | ... |
| Tool time | ... |
| Human time | ... |
| Output tokens | ... |
| Input tokens (fresh) | ... |
| Cache read tokens | ... |
| Cache create tokens | ... |
| Phases | ... |
## Verdict Breakdown
| Verdict | Phases | Out Tkn | In Tkn | Cache Read | Cache Create | Claude Time | % of Out |
|---------|--------|---------|--------|------------|-------------|-------------|----------|
| High impact | ... | ... | ... | ... | ... | ... | ... |
| Moderate impact | ... | ... | ... | ... | ... | ... | ... |
| Small impact | ... | ... | ... | ... | ... | ... | ... |
| Exploratory (useful) | ... | ... | ... | ... | ... | ... | ... |
| Exploratory (waste) | ... | ... | ... | ... | ... | ... | ... |
| Avoidable waste | ... | ... | ... | ... | ... | ... | ... |
## Subagents
| Description | Type | Out Tkn | In Tkn | Cache Read | Cache Create | Time | Phase | Verdict |
|-------------|------|---------|--------|------------|-------------|------|-------|---------|
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
## By Task
| Task | Phases | Out Tkn | In Tkn | Cache Read | Cache Create | Claude Time | Verdict Mix |
|------|--------|---------|--------|------------|-------------|-------------|-------------|
| ... | ... | ... | ... | ... | ... | ... | ... |
## Phase Detail
### 1. <task> — "<phase>" (<verdict>)
- Actions: ...
- Output tokens: ... | Input tokens: ... | Cache read: ... | Cache create: ...
- Tool-heavy turns: ... | Text turns: ...
- Tool result input bloat: ~...K tokens
- Claude time: ... | Tool time: ...
- **Useful info:** ...
- **Lessons learned:** ...
## Lessons Summary
1. ...
2. ...
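The Verdict Breakdown table in the report format is a group-by over phases. A minimal sketch of that aggregation, limited to the phase count, output tokens and the "% of Out" column, is shown below; the function name and the tuple representation are assumptions for illustration.

```haskell
import qualified Data.Map.Strict as Map

-- | Input: (verdict label, output tokens) for every phase.
--   Output: (verdict label, phase count, output tokens, percent of all output),
--   ordered by label.
verdictBreakdown :: [(String, Int)] -> [(String, Int, Int, Double)]
verdictBreakdown phases =
  [ (v, n, out, pct out) | (v, (n, out)) <- Map.toList grouped ]
  where
    grouped = Map.fromListWith add [ (v, (1, out)) | (v, out) <- phases ]
    add (n1, o1) (n2, o2) = (n1 + n2, o1 + o2)
    total = sum (map snd phases)
    -- Guard against a session with no output tokens at all.
    pct out
      | total == 0 = 0
      | otherwise  = 100 * fromIntegral out / fromIntegral total
```

For example, two high_impact phases with 300 and 100 output tokens and one avoidable_waste phase with 100 give 80% and 20% respectively.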
Explicitly Deferred
| Feature | Why |
|---|---|
| Cross-session index / accumulation | Start with standalone reports first |
| /session-trends meta-analysis | Depends on accumulation |
| Live mid-session cost dashboard | Not needed for core learning loop |
| Automatic phase detection from JSONL | Manual tagging is more accurate and is the point |
| Dollar cost estimation | Token pricing changes; token counts are stable |
| Thinking token breakdown | thinking content blocks exist in JSONL with text, but no separate thinking_tokens field — cost is folded into output_tokens and cannot be isolated |