One of the main reasons code is hard to understand is that it often doesn't do what it appears to do, and you only find that out by reading it in depth. It looks like skim makes an assumption that would make this happen a lot: that the code doesn't rely on side effects (modifying a global, using exceptions for control flow, publishing an event on a bus for something else to pick up, etc.).
I imagine this is a really useful tool for a codebase built on pure functions, but it'll get very confused by legacy code that wasn't written with that goal.
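To make the concern concrete, here is a small Python sketch. Every name in it (`apply_discount`, `AUDIT_LOG`, `OutOfStock`) is made up for illustration, and the "skimmed" string is only roughly what a signature-plus-docstring view would keep, not actual skim output. The signature looks pure, but the body mutates a global and uses an exception for control flow.

```python
# Hypothetical example: all names are invented for illustration.
AUDIT_LOG: list[str] = []  # module-level state mutated as a side effect


class OutOfStock(Exception):
    """Raised for control flow rather than for a genuine error."""


def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    # Hidden side effect: appends to a global that other code may read.
    AUDIT_LOG.append(f"discount {percent}% on {price}")
    if price <= 0:
        # Hidden control flow: callers are expected to catch this.
        raise OutOfStock("nothing to discount")
    return price * (1 - percent / 100)


# Roughly what a signature-plus-docstring view keeps (an assumption,
# not real tool output). Neither the global nor the exception appears.
SKIMMED_VIEW = '''def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    ...
'''
```

A reader (or agent) working only from `SKIMMED_VIEW` has no way to know that `AUDIT_LOG` changes or that `OutOfStock` can escape.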
That's a great point, thank you. I use it mostly to get my agent oriented at the beginning of a task. I've been using it for the last couple of weeks and I seem to get better results: less code duplication and better-integrated features. I still need to reference the specific files I want the agent to work on.
I suspect, but will probably never find the time to try, that you could build an AST from the source, discard the bits that are just the internals of functions and methods, and then turn what remains back into something an LLM could use.
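For what it's worth, the AST idea can be sketched in a few lines with Python's standard `ast` module (Python 3.9+ for `ast.unparse`). This is only an illustration for Python source, under the assumption that "internals" means everything in a function body except its docstring. It is not a claim about how skim itself works, and `BodyStripper`, `skeleton`, and `SAMPLE` are hypothetical names.

```python
import ast


class BodyStripper(ast.NodeTransformer):
    """Replace function and method bodies with `...`, keeping docstrings."""

    def _strip(self, node):
        kept = []
        if ast.get_docstring(node) is not None:
            kept.append(node.body[0])  # a docstring is always the first statement
        kept.append(ast.Expr(value=ast.Constant(value=...)))  # placeholder body
        node.body = kept
        return node  # no recursion: nested functions count as internals

    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip


def skeleton(source: str) -> str:
    """Parse the source, strip function bodies, and turn the tree back into text."""
    tree = BodyStripper().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


SAMPLE = '''
import os

class Cache:
    """In-memory cache."""

    def get(self, key: str) -> int:
        """Look up a key."""
        value = self._data[key]
        return value

def helper(x, y=2):
    total = x + y
    return total
'''

if __name__ == "__main__":
    print(skeleton(SAMPLE))
```

Imports, class definitions, signatures, and docstrings survive; the statements inside `get` and `helper` do not. Comments are lost as well, since the `ast` module does not keep them.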
Just plug this into your CLAUDE.md or AGENTS.md ;)
## Codebase Analysis
**Before analyzing unfamiliar codebases**, use skim for efficient context:
```bash
# Get architectural overview (60% reduction)
skim src/ --mode structure

# Get API surface (88% reduction)
skim src/ --mode signatures

# Get type system (91% reduction)
skim src/ --mode types
```

When to use:
- First time exploring a repository
- Understanding service architecture
- Mapping API boundaries
- Analyzing type relationships

Install: `npm install -g rskim` or `cargo install rskim`

I built Skim to solve a specific problem: coding agents hitting context
limits when analyzing codebases.
The insight: humans don't read every line of code. We skim structure,
signatures, comments. Agents should do the same.
What it does:
- Walks your repository
- Removes function bodies and implementation details
- Keeps structure, signatures, docstrings
- Result: ~90% token reduction without semantic loss
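The steps in the list above can be approximated for Python files with the standard library. This is a rough sketch, not skim's implementation: `signatures` and `summarize_repo` are hypothetical names, only `.py` files are handled, and character count stands in as a crude proxy for tokens.

```python
import ast
from pathlib import Path


def signatures(source: str) -> list[str]:
    """Return one 'def name(args) -> ret' line per function, nested ones included."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            lines.append(f"{prefix} {node.name}({ast.unparse(node.args)}){returns}")
    return lines


def summarize_repo(root: str) -> tuple[str, float]:
    """Walk root, collect signatures per file, and report the size reduction."""
    original_chars = 0
    parts = []
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            found = signatures(source)
        except SyntaxError:
            continue  # skip files that do not parse
        original_chars += len(source)
        parts.append(f"# {path.relative_to(root).as_posix()}")
        parts.extend(found)
    summary = "\n".join(parts)
    # Fraction of characters removed; can be negative for very small files.
    reduction = 1 - len(summary) / original_chars if original_chars else 0.0
    return summary, reduction
```

Calling `summarize_repo("src")` returns the condensed text and the fraction of characters removed, which is one way to sanity-check a reduction figure on your own code.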
Built in Rust for performance on large repos.
Works with most major languages (also markdown). Designed for Claude Code, GitHub Copilot, Cursor,
or any LLM that analyzes code.
GitHub: https://github.com/dean0x/skim
MIT licensed. Looking for feedback on the approach and edge cases I might
have missed.