From 3606088c06093cfa79f9621f4346d6c8012d824e Mon Sep 17 00:00:00 2001
From: Michel Nehme
Date: Fri, 20 Mar 2026 20:58:22 +0000
Subject: [PATCH] Use symlink for token data doc (source of truth in ser-dev)

---
 docs/claude-code-token-data.md | 150 +--------------------------------
 1 file changed, 1 insertion(+), 149 deletions(-)
 mode change 100644 => 120000 docs/claude-code-token-data.md

diff --git a/docs/claude-code-token-data.md b/docs/claude-code-token-data.md
deleted file mode 100644
index 69e5499..0000000
--- a/docs/claude-code-token-data.md
+++ /dev/null
@@ -1,149 +0,0 @@
-# Claude Code Token Usage Data
-
-## Conversation Storage
-
-- Path: `~/.claude/projects//.jsonl`
-- Subagents: `~/.claude/projects///subagents/agent-.jsonl`
-- Project slug for ser-dev: `-var-www-assets-cedrusconsult-com-ser-dev`
-- Format: one JSON object per line (JSONL)
-
-## Streaming Deduplication (CRITICAL)
-
-The JSONL logs **streaming events**, not final messages. Multiple entries share the same `.requestId` and represent incremental chunks from one API call. Token counts within a `requestId` are cumulative — only the final chunk has the correct totals.
-
-**Deduplication rule:** Group entries by `requestId`. For each group, take the token counts from the entry with the highest `output_tokens` value (the final streaming chunk). Sum across groups for session totals. Naively summing all entries will massively overcount.
-
-## Per-Message Usage Fields
-
-Located at `.message.usage` on messages that involve API calls (not all lines have this):
-
-```json
-{
-  "input_tokens": 1,
-  "cache_creation_input_tokens": 205,
-  "cache_read_input_tokens": 76262,
-  "output_tokens": 308,
-  "server_tool_use": {
-    "web_search_requests": 0,
-    "web_fetch_requests": 0
-  },
-  "service_tier": "standard",
-  "cache_creation": {
-    "ephemeral_1h_input_tokens": 205,
-    "ephemeral_5m_input_tokens": 0
-  },
-  "inference_geo": "",
-  "iterations": [],
-  "speed": "standard"
-}
-```
-
-## Other Useful Fields Per Line
-
-- `.timestamp` — ISO 8601
-- `.uuid` — unique message ID
-- `.sessionId` — conversation session
-- `.type` — `"progress"`, `"assistant"`, `"user"`, `"file-history-snapshot"`
-- `.parentUuid` — message threading
-- `.version` — Claude Code version (e.g. `"2.1.79"`)
-- `.requestId` — groups streaming chunks from a single API call (critical for deduplication)
-
-## Message Types and Content Structure
-
-### Assistant messages (`.type == "assistant"`)
-
-`.message.content` is an array containing:
-- `{"type": "text", "text": "..."}` — text output shown to user
-- `{"type": "tool_use", "name": "Bash", "input": {...}}` — tool invocations
-- `{"type": "thinking", "thinking": "..."}` — extended thinking (text IS present in JSONL, contrary to earlier belief)
-
-Each assistant message carries `.message.usage` with token counts. Thinking token cost is folded into `output_tokens` — there is no separate `thinking_tokens` field.
-
-### User messages (`.type == "user"`)
-
-`.message.content` is either a string or an array:
-- String format: `"content": "the user typed this"` — human-typed input
-- Array format: `"content": [{"type": "text", "text": "..."}, ...]`
-  - `{"type": "text", "text": "..."}` — human-typed input
-  - `{"type": "tool_result", "content": "...", "is_error": false}` — tool execution results
-
-**Distinguishing human input from tool results:** A user message with `tool_result` entries is an automatic tool response. A user message with `text` content (and no `tool_result`) is human input. Some messages contain both. Parsers must handle both string and array content formats.
-
-### Progress messages (`.type == "progress"`)
-
-System-level messages (skill loading, etc.). No usage data.
-
-### File history snapshots (`.type == "file-history-snapshot"`)
-
-Periodic snapshots of tracked file state. No usage data.
-
-## Time Analysis
-
-Timestamps on every message allow decomposing wall clock into three categories:
-
-| Category | How to detect | Typical range |
-|----------|--------------|---------------|
-| **Claude processing** | Gap before `assistant` message | 1–17s |
-| **Tool execution** | Gap between `assistant(tool_use)` → `user(tool_result)` | 0.1s (reads) to 1400s+ (builds) |
-| **Human wait** | Gap before `user(human_input)` | 10s–60min+ |
-
-Example from real session: a `Bash` tool call running `stack build` showed 1458s tool execution time; human gaps between conversation turns ranged from 50s to 54min.
-
-## Correlating Tool Calls with Token Costs
-
-Each assistant message contains BOTH the tool calls AND the usage for that API call. To get per-tool-call cost:
-- If the message has a single tool call → usage is directly attributable
-- If the message has multiple parallel tool calls → usage is the combined cost (cannot split per-tool)
-- The output_tokens field covers Claude's generation (tool call arguments + any text)
-- The input_tokens / cache_read fields cover the full context sent to Claude for that turn
-
-## Counts
-
-In a multi-hour session with 3 large subagent dispatches: ~583 lines with `.message.usage`.
-
-As of 2026-03-20: 77 sessions exist for ser-dev, 64 of which have subagent directories.
-
-## Built-in CLI Commands
-
-| Command | Shows | Does NOT Show |
-|---------|-------|---------------|
-| `/cost` | Aggregate USD, API duration, wall duration, lines changed | Per-type token breakdown |
-| `/stats` | Usage patterns dialog | Raw token counts |
-
-## Thinking Tokens
-
-`thinking` content blocks (type `"thinking"`) exist in the JSONL with the full reasoning text present. There is no separate `thinking_tokens` field in `.message.usage` — thinking token cost is folded into `output_tokens`. This means output_tokens cannot be split into "thinking" vs "visible output" from the usage data alone, though the text content of thinking blocks is available for length-based estimation.
-
-## Subagent Cross-Matching
-
-Subagent files are at `/subagents/agent-.jsonl` with companion `agent-.meta.json`.
-
-**Meta file format:**
-```json
-{
-  "agentType": "general-purpose",
-  "description": "Review Haskell backend changes"
-}
-```
-
-The `description` field matches the `description` argument from the `Agent` tool_use call in the main session JSONL. This is the join key. For duplicate descriptions, timestamps disambiguate.
-
-**`.meta.json` availability:** This is a recent Claude Code feature. Only the most recent sessions have `.meta.json` files — older sessions (the vast majority as of 2026-03-20) do not. When absent, fall back to matching subagent JSONL files to Agent tool_use calls in the main session by timestamp overlap and extracting the description from the tool call's `input.description` field.
-
-**Subagent JSONL** has the same message/usage structure as the main session — independent usage blocks that must be aggregated separately.
-
-## Tool-Specific Token Breakdown
-
-There is **no per-tool token count** in the usage data. The `output_tokens` field is the combined cost of Claude's thinking + text + tool call arguments for that API turn. The `server_tool_use` field only tracks `web_search_requests` and `web_fetch_requests` (counts, not tokens) — these are always zero in sessions that don't use web search.
-
-**Structural proxies for tool cost:**
-- Tool argument size: `len(json.dumps(tool_input))` — large Bash commands or Edit calls cost more output tokens
-- Tool result bloat: `len(tool_result.content)` — large Read results or verbose Bash output inflate the next turn's input tokens
-- Cache efficiency: `cache_read` vs `cache_create` ratio — high cache reads = cheap context reuse
-
-## Access Notes
-
-- Claude (the model) has no runtime access to its own token usage
-- Subagent JSONL files contain independent usage blocks — must be aggregated separately
-- `grep '"usage"' | wc -l` to count API-call lines
-- Deep extraction: `python3 -c "..." < file` with recursive key search (not all usage is at top level)
diff --git a/docs/claude-code-token-data.md b/docs/claude-code-token-data.md
new file mode 120000
index 0000000..2b668ed
--- /dev/null
+++ b/docs/claude-code-token-data.md
@@ -0,0 +1 @@
+/var/www/assets.cedrusconsult.com/ser-dev/doc/ai-infrastructure/claude-code-token-data.md
\ No newline at end of file
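
Note on the removed document's "Streaming Deduplication" section: the rule it states (group by `requestId`, keep the entry with the highest `output_tokens`, sum across groups) can be sketched in Python roughly as below. This is a minimal sketch under stated assumptions, not the author's implementation. The names `session_totals` and `TOKEN_FIELDS` are hypothetical, lines lacking a `requestId` or `.message.usage` are skipped (an assumption), and ties on `output_tokens` go to the later line (also an assumption).

```python
import json

# Usage counters to total; field names are taken from the documented
# `.message.usage` example. The tuple itself is a hypothetical helper.
TOKEN_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def session_totals(lines):
    """Sum usage over one session's JSONL lines, one entry per requestId."""
    final_by_request = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        entry = json.loads(raw)
        message = entry.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        request_id = entry.get("requestId")
        # Assumption: lines without usage or without a requestId are ignored.
        if not isinstance(usage, dict) or request_id is None:
            continue
        best = final_by_request.get(request_id)
        # Keep the chunk with the highest output_tokens; on a tie, prefer
        # the later line (assumption).
        if best is None or usage.get("output_tokens", 0) >= best.get("output_tokens", 0):
            final_by_request[request_id] = usage

    totals = dict.fromkeys(TOKEN_FIELDS, 0)
    for usage in final_by_request.values():
        for field in TOKEN_FIELDS:
            totals[field] += usage.get(field) or 0
    return totals
```

To total one session, the function could be given an open file handle for that session's `.jsonl` file, since iterating a text file yields its lines. Subagent files would be passed through separately, because the document says their usage blocks are independent.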