See docs: https://codeknowledge.dev/docs/Federation
On dead code detection: CKB has two modes:
1. Static analysis (findDeadCode tool, v7.6+) - requires zero instrumentation. Uses the SCIP index to find symbols with no inbound references in the codebase. Good for finding obviously dead exports, unused internal functions, etc. No telemetry needed. (Rough sketch after this list.)
2. Telemetry-enhanced (findDeadCodeCandidates, v6.4+) - ingests runtime call data to find code that exists but is never executed in production. This is where APM integration comes in.
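For intuition, here's roughly what the static mode boils down to - a minimal sketch, not CKB's actual implementation. The index shape is a simplification (real SCIP encodes occurrences with symbol roles rather than a boolean):

```typescript
// Sketch: static dead-code detection over a simplified SCIP-style index.
// Assumes occurrences distinguish definitions from references.

interface SymbolOccurrence {
  symbolId: string;      // a SCIP symbol ID string
  isDefinition: boolean; // true at the definition site, false at a reference
}

function findDeadSymbols(occurrences: SymbolOccurrence[]): string[] {
  const defined = new Set<string>();
  const referenced = new Set<string>();

  for (const occ of occurrences) {
    if (occ.isDefinition) defined.add(occ.symbolId);
    else referenced.add(occ.symbolId);
  }

  // A symbol defined somewhere but referenced nowhere is a dead-code candidate.
  return [...defined].filter((id) => !referenced.has(id));
}
```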
For the telemetry integration: it hooks into any OTEL-compatible collector. No custom instrumentation is required; it parses standard OTLP metrics:
- span.calls, http.server.request.count, rpc.server.duration_count, grpc.server.duration_count
- Extracts function/namespace/file from span attributes (configurable via telemetry.attributes.functionKeys, etc. - see the sketch after this list)
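Roughly what that attribute extraction looks like. The default key names below follow OTel semantic conventions (code.function, code.namespace, code.filepath), but treat them as assumptions - CKB makes them configurable via telemetry.attributes.functionKeys and friends:

```typescript
// Sketch: pulling code identity out of OTLP data point attributes.

interface OtlpAttribute {
  key: string;
  value: { stringValue?: string };
}

interface RuntimeCallSite {
  functionName: string;
  namespace?: string;
  filePath?: string;
}

function extractCallSite(
  attributes: OtlpAttribute[],
  functionKeys = ["code.function"],   // assumed defaults; configurable in CKB
  namespaceKeys = ["code.namespace"],
  fileKeys = ["code.filepath"],
): RuntimeCallSite | null {
  const lookup = (keys: string[]) =>
    attributes.find((a) => keys.includes(a.key))?.value.stringValue;

  const functionName = lookup(functionKeys);
  if (!functionName) return null; // can't correlate without a function name

  return {
    functionName,
    namespace: lookup(namespaceKeys),
    filePath: lookup(fileKeys),
  };
}
```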
You'd configure a pipeline from your APM (Datadog, Honeycomb, Jaeger, whatever) to forward aggregated call counts to CKB's ingest endpoint. The matcher then correlates runtime function names to SCIP symbol IDs with confidence scoring (exact: file+function+line, strong: file+function, weak: namespace+function only).
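A rough sketch of those matching tiers - the types and field names are illustrative, not CKB's internals:

```typescript
// Sketch: confidence tiers when correlating a runtime call site
// to an indexed SCIP symbol.

type MatchConfidence = "exact" | "strong" | "weak" | "none";

interface IndexedSymbol {
  symbolId: string;
  functionName: string;
  namespace?: string;
  filePath: string;
  line: number;
}

interface RuntimeSite {
  functionName: string;
  namespace?: string;
  filePath?: string;
  line?: number;
}

function matchConfidence(rt: RuntimeSite, sym: IndexedSymbol): MatchConfidence {
  if (rt.functionName !== sym.functionName) return "none";

  const sameFile = rt.filePath === sym.filePath;
  if (sameFile && rt.line === sym.line) return "exact";               // file + function + line
  if (sameFile) return "strong";                                      // file + function
  if (rt.namespace && rt.namespace === sym.namespace) return "weak";  // namespace + function only
  return "none";
}
```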
Full setup: https://codeknowledge.dev/docs/Telemetry
The static analysis mode is probably enough to start with. Telemetry integration is for when you want "this code hasn't been called in 90 days" confidence rather than "this code has no static references."
Presets control tool availability, not output truncation. The core preset exposes 19 tools (~12k tokens of definitions) versus 50+ in the full preset. This affects what the AI can ask for, not what it gets back. The AI can dynamically call expandToolset mid-session to unlock additional tools when needed.
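Conceptually a preset is just a named allowlist plus an escape hatch. The shape below is hypothetical - only the gating behavior is from the docs:

```typescript
// Hypothetical sketch: a preset limits which tool definitions the AI
// sees up front; expandToolset widens the set mid-session.

const PRESETS: Record<string, Set<string>> = {
  core: new Set(["explore", "understand", "prepareChange", "findReferences" /* ...19 total */]),
  full: new Set([/* 50+ tool names */]),
};

class ToolGate {
  private enabled: Set<string>;

  constructor(preset: "core" | "full") {
    this.enabled = new Set(PRESETS[preset]);
  }

  // Mid-session expansion: unlock extra tools without restarting.
  expandToolset(tools: string[]): void {
    for (const t of tools) this.enabled.add(t);
  }

  isAvailable(tool: string): boolean {
    return this.enabled.has(tool);
  }
}
```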
Depth parameters control which analyses run, not result pruning. For compound tools like explore:
- shallow: 5 key symbols, skips dependency/change/hotspot analysis entirely
- standard: 10 key symbols, includes deps + recent changes, parallel execution
- deep: 20 key symbols, full analysis including hotspots and coupling
This is additive query selection. The call graph depth (1-4 levels) is passed through unchanged to the underlying traversal—if you ask for depth 3, you get full depth 3, not a truncated version.
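As a sketch of that additive selection - the limits and analysis names mirror the list above, but the structure itself is an assumption:

```typescript
// Sketch: each depth level selects which analyses run; nothing prunes
// results after the fact.

type Depth = "shallow" | "standard" | "deep";

interface DepthPlan {
  keySymbolLimit: number;
  analyses: string[];
}

const DEPTH_PLANS: Record<Depth, DepthPlan> = {
  shallow:  { keySymbolLimit: 5,  analyses: ["symbols"] },
  standard: { keySymbolLimit: 10, analyses: ["symbols", "dependencies", "recentChanges"] },
  deep:     { keySymbolLimit: 20, analyses: ["symbols", "dependencies", "recentChanges", "hotspots", "coupling"] },
};

// Call-graph depth is orthogonal: it's handed to the traversal untouched.
function planExplore(depth: Depth, callGraphDepth: 1 | 2 | 3 | 4) {
  return { ...DEPTH_PLANS[depth], callGraphDepth };
}
```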
On token optimization specifically: CKB tracks token usage at the response level using WideResultMetrics (measures JSON size, estimates tokens at ~4 bytes/token for structured data). When truncation does occur (explicit limits like maxReferences), responses include transparent TruncationInfo metadata with reason, originalCount, returnedCount, and droppedCount. The AI sees exactly what was cut and why.
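The metadata field names come straight from the description above; the wrapper shape and the estimate helper are assumptions:

```typescript
// Sketch: truncation metadata attached to a capped response.

interface TruncationInfo {
  reason: string;        // e.g. "maxReferences limit"
  originalCount: number; // results before the cap
  returnedCount: number; // results actually returned
  droppedCount: number;  // originalCount - returnedCount
}

interface WideResult<T> {
  results: T[];
  truncation?: TruncationInfo; // present only when something was cut
  estimatedTokens: number;
}

function wrapResult<T>(all: T[], limit: number): WideResult<T> {
  const results = all.slice(0, limit);
  return {
    results,
    // ~4 bytes/token for structured JSON data
    estimatedTokens: Math.ceil(JSON.stringify(results).length / 4),
    truncation:
      all.length > limit
        ? {
            reason: "maxReferences limit",
            originalCount: all.length,
            returnedCount: results.length,
            droppedCount: all.length - results.length,
          }
        : undefined,
  };
}
```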
The compound tools (explore, understand, prepareChange) reduce tool calls by 60-70% by aggregating what would be sequential queries into parallel internal execution. This preserves reasoning depth while cutting round-trip overhead. The AI can always fall back to granular tools (getCallGraph, findReferences) when it needs explicit control over traversal parameters.
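The aggregation pattern, roughly - the query functions below are placeholders standing in for whatever explore actually calls internally, not CKB's real API:

```typescript
// Sketch: a compound tool fans out the queries an AI would otherwise
// issue one at a time, runs them in parallel, and returns one payload.

async function explore(symbolId: string) {
  // One round trip instead of three sequential tool calls.
  const [callGraph, references, recentChanges] = await Promise.all([
    getCallGraph(symbolId, /* depth */ 2),
    findReferences(symbolId),
    getRecentChanges(symbolId),
  ]);
  return { callGraph, references, recentChanges };
}

// Placeholder implementations so the sketch is self-contained.
async function getCallGraph(id: string, depth: number) { return { id, depth, edges: [] }; }
async function findReferences(id: string) { return { id, refs: [] }; }
async function getRecentChanges(id: string) { return { id, commits: [] }; }
```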