overseer 3ef4f3e707 chore: add CLAUDE.md, stop tracking egg-info build artifacts

- Add CLAUDE.md (Claude Code orientation for the repo).
- Remove app/src/ctxd.egg-info/* from version control and gitignore
  *.egg-info/ — it is regenerated by `pip install -e` and only dirties
  the working tree.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-29 21:11:54 +00:00

6.1 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

What CTXD is

A single-process daemon that stores per-project "context dossiers" (multiple .MD files: CONTEXT.MD, DECISIONS.MD, RUNBOOKS.MD, PROMPTS.MD, GLOSSARY.MD) and serves them to LLM harnesses (Claude, ChatGPT, Hermes, Codex, Cursor) over MCP via Streamable HTTP, plus a web UI and REST API. CONTEXT.MD can be synced to a repo as AGENTS.md with symlinks (CLAUDE.md, .cursorrules, CODEX.md). The full user/operator guide is README.md; this file is the orientation for editing the code.
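As a rough illustration of the AGENTS.md sync described above, the sketch below writes one canonical file and points the harness-specific names at it with relative symlinks. The helper name `sync_agents_file` and its overwrite behaviour are assumptions made for illustration, not the project's actual sync code; only the file names come from the description. It assumes a POSIX filesystem where symlinks can be created.

```python
import os
import tempfile
from pathlib import Path

# File names taken from the description above: AGENTS.md is the real file,
# the other three are symlinks that point at it.
CANONICAL = "AGENTS.md"
ALIASES = ("CLAUDE.md", ".cursorrules", "CODEX.md")


def sync_agents_file(repo_dir: Path, context_text: str) -> list[Path]:
    """Hypothetical helper: write AGENTS.md and (re)create the alias symlinks."""
    target = repo_dir / CANONICAL
    target.write_text(context_text, encoding="utf-8")
    links = []
    for name in ALIASES:
        link = repo_dir / name
        # Remove any stale file or (possibly broken) link so reruns are idempotent.
        if link.is_symlink() or link.exists():
            link.unlink()
        # A relative target keeps the link valid if the repo directory moves.
        os.symlink(CANONICAL, link)
        links.append(link)
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for link in sync_agents_file(Path(tmp), "# Project context\n"):
            print(link.name, "->", os.readlink(link))
```

Because every alias resolves to the same file, a later sync only has to rewrite AGENTS.md for all harnesses to see the change.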

All source lives in app/src/ctxd/. Run commands from app/.

Architecture (the parts that need multiple files to grasp)

One ASGI app multiplexes three protocols. server.py → CombinedApp (~line 1455) dispatches every request to: REST + Web UI, the OAuth 2.0 authorization server, and the MCP endpoints. There is no separate service per surface — "MCP is 502" and "web UI is 502" are the same process being down. serve_sync(cfg) boots it under uvicorn.
Two MCP transports. Streamable HTTP is the public one (paths in MCP_STREAMABLE_PATHS = /mcp, /readonly/mcp, /oauth/mcp, all served by CombinedApp). mcp_stdio.py is a separate stdin/stdout JSON-RPC server Hermes spawns directly — keep tool definitions in sync between the two when adding tools.
MCP tools are built in two functions, gated by scope. make_mcp_server(cfg, readonly, oauth_scoped) exposes the read tools, plus the full tool set for API-key/LAN callers; make_write_mcp_server(cfg) exposes the write set (update_file, set_project_tags, sync_to_project). OAuth ctxd.read vs ctxd.write scopes decide what a token sees. To add an MCP tool, edit the list_tools/call_tool handlers inside these builders.
db.py is plain functions over a conn, no ORM, dual-backend. PostgreSQL is primary; SQLite ($CTXD_HOME/ctxd.db) is the fallback when DATABASE_URL is empty (cfg.use_postgres). The same query strings run on both via _is_pg(conn) and the placeholder helper _ph(conn, n). schema.sql (PG) and schema_sqlite.sql must be kept in lockstep — a table added to one must be added to the other.
OAuth state and web sessions are file-based JSON, not in the DB. OAuthStore (oauth_state.json) and WebSessionStore (web_sessions.json) live in $CTXD_HOME (/data in the container). They survive DB swaps but are not covered by pg_dump — back them up separately.
Metadata headers are computed, never stored. build_metadata_header() prepends the header on read; strip_metadata_header() removes it before persisting. Don't store headers in the DB.
File paths are normalized to uppercase with a .MD extension (normalize_file_path). CONTEXT.MD is the minimum file and cannot be deleted; in the ctxd-docs project, CONTEXT.MD and LLM-CLIENT.MD are additionally locked against both update and delete.
Writes are version-checked. file_update/context_update take base_version; a mismatch is a 409 conflict. Every mutating op also writes an append-only audit_log row and may take a rotating snapshot (SNAPSHOT_MIN_KEEP/MAX_KEEP).
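The single-process dispatch described in the first item of this list can be pictured with a small path-based ASGI router. This is a minimal sketch, not the real CombinedApp: the class name `PathDispatcher`, the stub apps, the `/oauth/` prefix rule, and the `/oauth/token` demo path are assumptions; only the three MCP paths come from the text. Lifespan handling is left out.

```python
import asyncio

# MCP paths as listed above; everything else in this sketch is illustrative.
MCP_STREAMABLE_PATHS = ("/mcp", "/readonly/mcp", "/oauth/mcp")


def make_stub(label):
    """Build a trivial ASGI app that answers every request with its label."""
    async def app(scope, receive, send):
        await send({"type": "http.response.start", "status": 200,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": label.encode()})
    return app


class PathDispatcher:
    """Hypothetical stand-in for CombinedApp: one ASGI callable, three surfaces."""

    def __init__(self, rest_app, oauth_app, mcp_app):
        self.rest_app = rest_app
        self.oauth_app = oauth_app
        self.mcp_app = mcp_app

    def pick(self, path):
        # Exact MCP paths are checked first so that /oauth/mcp is not
        # swallowed by the (assumed) /oauth/ prefix rule below.
        if path in MCP_STREAMABLE_PATHS:
            return self.mcp_app
        if path.startswith("/oauth/"):
            return self.oauth_app
        return self.rest_app

    async def __call__(self, scope, receive, send):
        await self.pick(scope.get("path", ""))(scope, receive, send)


async def call(app, path):
    """Drive the app once without a server and return the response body."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent[-1]["body"].decode()


app = PathDispatcher(make_stub("rest"), make_stub("oauth"), make_stub("mcp"))

if __name__ == "__main__":
    for p in ("/status", "/oauth/token", "/oauth/mcp", "/mcp"):
        print(p, "->", asyncio.run(call(app, p)))
```

Since all three surfaces hang off one callable, a crash in the process takes every route down at once, which is why a 502 on MCP and a 502 on the web UI point to the same cause.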

Latent / not wired up

schema.sql and db.py define a full collaborative-review flow — user_workspaces → workspace_files → change_requests → reviews, with workspace_fork, workspace_submit, change_request_approve, etc. As of now this is DB-layer only: it is exposed through neither server.py (REST/MCP) nor cli.py. Treat it as scaffolding, not a live feature.

Entry points

pyproject.toml defines two console scripts (and python -m ctxd dispatches the same way via __main__.py, choosing CLI vs daemon by the first arg):

dossier <command> → CLI (cli.py, cli_entry). In production these run inside the container: docker exec ctxd dossier <command>.
ctxd → starts the daemon (daemon_entry → serve_sync).
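The first-argument dispatch mentioned above might look roughly like the following. The two entry functions here are stand-ins that only return a label, and the rule "any argument means CLI, no argument means daemon" is an assumption inferred from the `python -m ctxd init` and `python -m ctxd` usage; the real __main__.py may decide differently.

```python
import sys


def cli_entry(argv):
    """Stand-in for the CLI entry point; reports the command it was given."""
    return "cli:" + " ".join(argv)


def daemon_entry():
    """Stand-in for the daemon entry point."""
    return "daemon"


def main(argv):
    # Assumed rule: a first argument selects the CLI, none starts the daemon.
    if argv:
        return cli_entry(argv)
    return daemon_entry()


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```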

Build / run / deploy

Everything runs from app/. There is no test runner or linter configured — scripts/test_*.py are standalone MCP smoke scripts, not a suite.

# Local dev (SQLite, no Docker)
cd app
pip install -e ".[mcp]"
export CTXD_HOME=./dev-data
python -m ctxd init        # initialize DB
python -m ctxd             # serve → http://localhost:9091

# Production deploy (Docker + bundled PostgreSQL 16)
./scripts/deploy.sh        # builds ctxd, starts postgres, waits for healthy, recreates ctxd, smoke-tests /status

# MCP smoke tests against a running server
python scripts/test_unified_mcp.py
python scripts/test_write_mcp.py

# Health
curl http://localhost:9091/status   # → {"status":"ok", ...}

Deploy gotcha (the #1 source of public 502s): ctxd depends on the postgres container being up first. Always use ./scripts/deploy.sh (or docker compose up -d, which starts both). Never run docker compose up -d --no-deps ctxd, and never run docker restart ctxd after a code change: the former crash-loops without the DB, and the latter keeps running the old image. After editing code you must rebuild (docker compose build ctxd) before recreating the container.
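When checking a deploy by hand, a small poll against /status can stand in for the smoke test. The sketch below is an assumption about how such a check could be written, not what deploy.sh does; the function names, retry counts, and the injectable `fetch`/`sleep` parameters are illustrative. Only the URL and the `"status": "ok"` field come from the commands above.

```python
import json
import time
import urllib.request


def fetch_status(url):
    """Return the parsed JSON body of the status endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(url, attempts=10, delay=1.0,
                       fetch=fetch_status, sleep=time.sleep):
    """Poll until the body reports status == "ok"; return False if it never does."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Refused connections, timeouts, HTTP errors, and bad JSON
            # are all treated as "not ready yet".
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Offline demonstration: a fake fetcher that succeeds on the third try.
    # The "starting" value is made up for the demo.
    replies = iter([OSError("refused"), {"status": "starting"}, {"status": "ok"}])

    def fake_fetch(url):
        item = next(replies)
        if isinstance(item, Exception):
            raise item
        return item

    print(wait_until_healthy("http://localhost:9091/status",
                             fetch=fake_fetch, sleep=lambda s: None))
```

Against a live server the call would simply be `wait_until_healthy("http://localhost:9091/status")`.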

Config

All config is loaded through config.py (CtxConfig), with precedence env var > data/ctxd.yaml > default. Key switches: DATABASE_URL (empty ⇒ SQLite fallback), CTXD_HOME (/data in container), CTXD_AUTH_ENABLED + CTXD_API_KEY (a shared key for LAN/Hermes that unlocks the full MCP tool set), OAUTH_ENABLED + OAUTH_ISSUER (must be set together or the app won't start cleanly). The full table is in README.md.
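The precedence rule can be sketched as a three-step lookup. This is not the CtxConfig implementation: the `resolve` and `use_postgres` functions, the default values, and the idea that the YAML file uses the same key names as the environment are all assumptions for illustration. It also assumes that a variable set to an empty string still counts as set.

```python
import os

# Illustrative defaults only; the real defaults live in config.py.
DEFAULTS = {"DATABASE_URL": "", "CTXD_HOME": "./ctxd-home"}


def resolve(key, env, file_values, defaults=DEFAULTS):
    """Return the first value found: environment, then file, then default."""
    if key in env:
        return env[key]
    if key in file_values:
        return file_values[key]
    return defaults[key]


def use_postgres(env, file_values):
    # An empty DATABASE_URL selects the SQLite fallback, per the text above.
    return bool(resolve("DATABASE_URL", env, file_values))


if __name__ == "__main__":
    file_values = {"CTXD_HOME": "/from-yaml"}  # stands in for data/ctxd.yaml
    print(resolve("CTXD_HOME", os.environ, file_values))
    print("postgres" if use_postgres(os.environ, file_values) else "sqlite")
```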

Conventions when changing code

Adding a context-data operation: write the conn-taking function in db.py (handle both backends via _is_pg/_ph), then expose it in the relevant surface(s) — server.py for REST/MCP, cli.py for CLI, and mcp_stdio.py if Hermes needs it.
Adding a column/table: edit both schema.sql and schema_sqlite.sql; for live PG instances add a guarded migration (see _migrate_pg in db.py and the standalone migrate_*.py scripts).
Don't persist metadata headers; don't bypass base_version checking on writes; keep every mutation audit-logged.
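Taken together, the conventions above suggest a write path shaped roughly like the sketch below: a plain function over a conn, a backend-aware placeholder, a header strip before persisting, a compare-and-swap on the version, and an audit row in the same transaction. Everything specific here is hypothetical: the `files` table and its columns, the `audit_log` columns, the `HEADER_END` marker, the `ph` helper (a stand-in, not the real `_ph`), and the `VersionConflict` exception are invented for illustration and run against SQLite only.

```python
import sqlite3

# Hypothetical marker; the real metadata header format is not shown here.
HEADER_END = "<!-- /ctxd-meta -->\n"


class VersionConflict(Exception):
    """Stale base_version; a server would map this to an HTTP 409."""


def ph(conn):
    """Stand-in placeholder helper: '?' for sqlite3, '%s' for other drivers."""
    return "?" if isinstance(conn, sqlite3.Connection) else "%s"


def strip_header(text):
    """Drop everything up to and including the assumed header end marker."""
    _, sep, body = text.partition(HEADER_END)
    return body if sep else text


def file_update(conn, path, content, base_version, actor):
    """Update an existing file only if base_version still matches."""
    p = ph(conn)
    cur = conn.cursor()
    # Compare-and-swap: the WHERE clause makes the version check atomic.
    cur.execute(
        f"UPDATE files SET content = {p}, version = version + 1 "
        f"WHERE path = {p} AND version = {p}",
        (strip_header(content), path, base_version),
    )
    if cur.rowcount != 1:
        conn.rollback()
        raise VersionConflict(f"{path}: base_version {base_version} is stale")
    # The audit row is written in the same transaction as the change.
    cur.execute(
        f"INSERT INTO audit_log (actor, action, path, version) "
        f"VALUES ({p}, {p}, {p}, {p})",
        (actor, "file_update", path, base_version + 1),
    )
    conn.commit()
    return base_version + 1


def demo_conn():
    """In-memory database seeded with one file at version 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, content TEXT, version INTEGER)")
    conn.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, actor TEXT, "
                 "action TEXT, path TEXT, version INTEGER)")
    conn.execute("INSERT INTO files VALUES ('CONTEXT.MD', 'old', 1)")
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = demo_conn()
    print(file_update(conn, "CONTEXT.MD", HEADER_END + "new body", 1, "alice"))
    try:
        file_update(conn, "CONTEXT.MD", "stale write", 1, "bob")
    except VersionConflict as exc:
        print("conflict:", exc)
```

Putting the version test in the UPDATE's WHERE clause, rather than reading the version first and comparing in Python, avoids a window in which two writers both pass the check.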
