1 comment

  • prash2488 an hour ago

    I've worked on SourceSailor, a CLI tool that tries to tackle this exact problem, though I should note it's still in early stages. While it can't yet fully map complex codebases like the Linux kernel (that's a significant challenge), it does provide some useful capabilities for understanding smaller to medium-sized codebases. SourceSailor generates a structural understanding of your codebase and creates reports about dependencies and project architecture. It leverages LLMs (OpenAI, Anthropic, or Gemini) for analysis and allows you to ignore files you don't want to analyze (following how .gitignore is used and parsed) to focus on relevant parts of the codebase. However, I should be clear about its limitations:

    - It's not yet as interactive as Cursor or Aider, and I'm not planning to take it in that direction.

    - Large codebases (like Linux) would be challenging due to the token limits of current LLMs. Gemini may help there, though we all know its privacy policy shenanigans.

    - The analysis is high-level rather than focused on implementation specifics. It does help you understand the codebase, and it tries to explain the interesting parts, but YMMV.
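
    To make the token-limit point concrete, here's a rough sketch of the kind of budget check involved. This is not SourceSailor's actual code: the function names, the 4-characters-per-token rule of thumb, and the fnmatch-based ignore matching (much simpler than real .gitignore semantics) are all assumptions for illustration.

```python
import math
from fnmatch import fnmatch

# Assumption: roughly 4 characters per token. Real tokenizers vary by model and content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def is_ignored(path: str, ignore_patterns: list[str]) -> bool:
    """Simplified ignore check; real .gitignore rules (negation, anchoring) are richer."""
    return any(fnmatch(path, pattern) for pattern in ignore_patterns)


def codebase_token_estimate(files: dict[str, str], ignore_patterns: list[str]) -> int:
    """Sum the estimated tokens of every file that is not ignored."""
    return sum(
        estimate_tokens(content)
        for path, content in files.items()
        if not is_ignored(path, ignore_patterns)
    )


def fits_in_context(files: dict[str, str], ignore_patterns: list[str], context_window: int) -> bool:
    """True if the non-ignored files are estimated to fit in the given context window."""
    return codebase_token_estimate(files, ignore_patterns) <= context_window


# Hypothetical in-memory "codebase": path -> file contents.
files = {
    "src/main.py": "x" * 400,
    "node_modules/lib.js": "y" * 4000,
}

print(fits_in_context(files, ["node_modules/*"], context_window=1000))
print(fits_in_context(files, [], context_window=1000))
```

    Ignoring vendored directories is often the difference between fitting and not fitting, which is why the ignore list matters so much for anything beyond small projects.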

    If you're specifically looking to understand massive codebases like Linux, SourceSailor probably isn't a straightforward solution yet, and you would need workarounds. But if you're working with small to medium-sized projects and need help understanding their structure and dependencies, it might be worth trying. The project is open source if you want to check it out or contribute: https://github.com/PrashamTrivedi/SourceSailor-CLI