This is cool. It looks to me like you're integrating static analysis techniques with all the unstructured text written about dependency upgrades (changelogs, release notes, etc.). Very curious to see where it goes.
We've found the safety of dependency upgrades to be deceptively complex to evaluate. Often you need context that's difficult or impossible to determine statically in a dynamically typed language. An example I use for Ruby is the keyword-argument separation in the 2.7 -> 3.0 migration (https://www.ruby-lang.org/en/news/2019/12/12/separation-of-p...). It's trivial to profile for impacted call sites at runtime but basically impossible to do statically without adopting something like Sorbet. Do you have any benchmarks on how reliable your evaluations are on plain JS vs. TypeScript codebases?
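For anyone who hasn't hit that one, a minimal sketch of what changes (the method and variable names here are made up):

    def create_user(name, **opts)
      puts "#{name}: #{opts.inspect}"
    end

    attrs = { admin: true }

    # Ruby 2.7 warns; Ruby 3.0 raises ArgumentError, because a trailing
    # hash is no longer implicitly converted to keyword arguments.
    create_user("alice", attrs)

    # The fix is an explicit double splat at the call site:
    create_user("alice", **attrs)

To catch that statically you'd need to know attrs is a Hash flowing into a **kwargs parameter, which is exactly the type information plain Ruby doesn't give you.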
We ended up embracing runtime profiling for deprecation warnings and breaking changes as part of upgrading dependencies for our customers, and we've found that context unlocks more reliable code transformations. But you're stuck building an SDK for every language you want to support, and it's more friction than installing a GitHub app.
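As a rough sketch of what that profiling looks like in Ruby (illustrative only, not our actual SDK): Ruby 3.x routes warnings through Warning.warn, so prepending onto its singleton class lets you observe every deprecation and still fall through to the default behavior.

    module DeprecationProfiler
      HITS = Hash.new(0)

      def warn(message, category: nil)
        # Deprecation messages are prefixed with the impacted call site,
        # e.g. "app/models/user.rb:42: warning: Using the last argument
        # as keyword parameters is deprecated" -- exactly the context
        # that's hard to recover statically.
        HITS[message.strip] += 1 if message.include?("deprecat")
        super
      end
    end

    Warning.singleton_class.prepend(DeprecationProfiler)
    Warning[:deprecated] = true # Ruby 3.0+ silences these by default

    at_exit do
      DeprecationProfiler::HITS.sort_by { |_, n| -n }.each do |msg, n|
        puts "#{n}x #{msg}"
      end
    end

Run that under the test suite or a slice of production traffic and you get a ranked list of impacted call sites to drive the transformation.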
Always felt dependency updates are a perfect fit for AI agents:
(a) they’re broadly similar across companies,
(b) they aren’t time-sensitive, so the agent can take hours without anyone noticing, and
(c) customers are already accustomed to using bots here, just bad ones
One would imagine they are broadly similar, but that rests on the assumption that the codebases themselves are similar as well.
Migrations between versions can vary widely, largely as a function of the parent codebase rather than the dependency change itself. A simple example is a supported Node version bump: it's common for new dependency versions to drop support for older Node runtimes, but migrating the parent codebase may require large custom efforts like changing module systems (e.g. CommonJS to ESM).
Related: https://news.ycombinator.com/item?id=45436251
This is very interesting, looking forward to seeing more about it!
(I'm one of the maintainers on Renovate)
Why didn't GitHub come up with this? This seems like such an obvious use case.
It requires you to go deep on both the code analysis and the research, which is expensive at their scale.
And, as someone whose startup (EdgeBit, recently acquired by FOSSA) wrote a new JS/TS static analysis engine, it's just hard to get correct.
It's a niche for AI, which creates some great opportunities for context engineering :)
GitHub hasn't done anything interesting with Dependabot or code scanning in a while.