There are very few startups that I look at these days and don't think to myself, "I could just write a Claude skill for that". This one seems pretty cool. Congrats on the launch!
Thank you! Super happy that's how you feel about Superlog. Let us know if you want to try it out and/or have any feedback :)
I love the launch! Automated observability that feeds back into the product development process is the future of this category vs having to spend a lot of time configuring and managing the infrastructure yourself.
It's something we've thought a lot about at Amplitude. We'd love to talk.
Awesome, let's definitely have a chat! I'll shoot an email via BF :)
Interesting project, but you need to add some information on where the data goes. As far as I can tell, code goes to some upstream AI provider (for installing, for analyzing).
Does telemetry go to some provider or to a locally hosted solution? And then to your upstream AI provider for analysis?
Thanks for the feedback!
When you're installing Superlog, you can use any coding agent you'd like, including a local model.
Your telemetry then goes into our data stores, and right now we have one DC on the US west coast.
Whenever there's an error log or trace, Superlog can analyze it and prepare a resolution PR (or a note if something needs to be done manually).
This can be turned off and then the incident can be sent to your own models via a webhook.
We use one of the frontier models for that (it's an upstream AI provider). We're working on our own fine-tuned version of a SoTA model to minimize dependency on other AI providers.
To investigate an incident, we clone the repo in our worker, and pass the repository files to a coding agent in a sandbox. The agent has an MCP that gives it access to the telemetry (logs/metrics/traces) of the project.
The coding agent will then investigate the incident and prepare a patch. It hands over the patch via a tool. The worker then deterministically pushes the patch to a branch and opens the PR.
This way the agent doesn't have full Git access and can't do anything it's not supposed to do in the repository.
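The hand-off described above could be sketched roughly like this. It is a minimal illustration under stated assumptions, not Superlog's actual implementation: `PatchSubmission`, `submit_patch`, `plan_push`, the `patch.diff` file name, and the branch naming scheme are all hypothetical. The idea it shows is that the agent can only return data (a title and a diff), while a fixed, non-LLM code path decides which Git commands run.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class PatchSubmission:
    """What the sandboxed agent may hand back: data, not commands."""
    incident_id: str
    title: str
    diff: str  # unified diff text produced by the agent


def submit_patch(incident_id: str, title: str, diff: str) -> PatchSubmission:
    """Hypothetical 'tool' exposed to the agent. It validates input and returns data only."""
    if not re.fullmatch(r"[A-Za-z0-9_-]+", incident_id):
        raise ValueError("incident_id must be alphanumeric (plus '-' and '_')")
    if not diff.strip():
        raise ValueError("empty patch")
    return PatchSubmission(incident_id, title.strip(), diff)


def plan_push(sub: PatchSubmission, base: str = "main") -> list[list[str]]:
    """Worker side: a fixed command sequence. Only the validated incident id
    and the commit title come from the agent."""
    branch = f"superlog/fix-{sub.incident_id}"  # assumed naming scheme
    return [
        ["git", "checkout", "-b", branch, base],
        # The worker is assumed to have written sub.diff to patch.diff first.
        ["git", "apply", "--index", "patch.diff"],
        ["git", "commit", "-m", sub.title],
        ["git", "push", "origin", branch],
    ]
```

A worker following this sketch would run each command with `subprocess.run(cmd, check=True)` and then open the PR through the Git host's API, so the agent never holds Git credentials itself.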
Congrats on the launch, this looks very promising. I hadn't seen any installation that uses a URL to point to a skill; it seems like an evolution of wizard scripts.
That being said, for more complex setups, like on Kubernetes where you need a collector and an operator, I found OTel to be super painful to set up a couple of years ago. Has it gotten any easier now?
The "Confidence Gate" concept is the most interesting part here — auto-generated PRs are only useful if the fix is actually correct. What's the failure rate on the PR suggestions in practice? Do you have data on how often developers accept vs. reject the prepared PRs?
Also curious about the MCP integration — treating observability data as a tool call rather than a dashboard you have to context-switch into is a genuinely different mental model. Makes sense for agentic workflows.
The npx onboarding is clever. One-prompt install removes the biggest blocker for observability adoption (nobody wants to spend a week instrumenting their codebase).
Very good point on the confidence gate! We rolled out feedback collection on PRs themselves, on Slack notifications, and on incidents a few days ago, so the data is still a bit fresh.
Anecdotally, our top clients accept 80-90% of PRs, with several clients accepting all of them and requesting an auto-merge feature. I myself accept most of Superlog's PRs to Superlog. PRs that stay unmerged are usually due to a client losing interest in our product or abandoning the instrumented project.
Another interesting point is that not every defect is a PR. Often it's a misconfiguration in an external service, so there's a special incident state for that. For example, yesterday I forgot to verify our domain on Resend, so some verification emails didn't go through. Superlog pinged me on Slack and explained where to go to fix it.
Super glad you like the npx onboarding and the MCP tool :) Please keep the feedback coming!
Love the concept! Some feedback: I went to sign up to give it a go, but the setup process left me feeling a bit untrusting, so I backed out for now. I'd prefer more explanation about what to expect, what I will get, how it is safe, etc. before being asked to run a prompt.
Thank you! Very good point.
Right now, the prompt will enumerate all the services and install the OpenTelemetry SDK (https://opentelemetry.io/) in each service.
Then for every service, the skill will make sure that:
- Every time something breaks and an operator needs to take a look, there's an error log
- All important steps in a process emit info/debug logs (so that an issue can be investigated)
- Operations are covered with spans with relevant attributes.
- Cost (LLM tokens), API performance (latency/RED), tenant activity (cost/usage per tenant) are covered by metrics so that you can use Superlog MCP to build cool dashboards.
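To make the list above concrete, here is a small sketch of what "error log on failure, info log on important steps, RED metrics per tenant" can look like around one operation. It deliberately uses only the Python standard library as a stand-in; it is not the OpenTelemetry SDK and not what the skill generates. The `observed` decorator, the in-memory `RED` store, and the `charge_card` handler are hypothetical names for illustration.

```python
import logging
import time
from collections import defaultdict
from functools import wraps

log = logging.getLogger("checkout")

# In-memory stand-in for a metrics backend: (operation, tenant) -> counters.
RED = defaultdict(lambda: {"requests": 0, "errors": 0, "duration_s": 0.0})


def observed(operation: str):
    """Wrap a handler so each call records Rate, Errors, Duration per tenant."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(tenant_id: str, *args, **kwargs):
            stats = RED[(operation, tenant_id)]
            stats["requests"] += 1
            start = time.perf_counter()
            try:
                return fn(tenant_id, *args, **kwargs)
            except Exception:
                stats["errors"] += 1
                # Error log with traceback: something an operator must look at.
                log.exception("%s failed for tenant %s", operation, tenant_id)
                raise
            finally:
                stats["duration_s"] += time.perf_counter() - start
        return wrapper
    return decorator


@observed("charge_card")
def charge_card(tenant_id: str, amount_cents: int) -> str:
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    # Info log for an important step, useful when investigating later.
    log.info("charging tenant %s: %d cents", tenant_id, amount_cents)
    return "ok"
```

With a real OTel SDK the counters would be metric instruments and the wrapper would open a span with attributes such as the tenant id, but the shape of the instrumentation (a thin, additive wrapper around existing logic) stays the same.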
For most common stacks like NextJS, FastAPI, React Native/Expo etc. we have a custom skill that explains the best practices for this specific technology. For all the other stacks we ask the agent to use general best practices.
We have evals for all custom skills where we start from a starter project, run the agent with the skill and use LLM-as-a-judge to compare it to a human-written 'golden patch'.
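An eval loop of this kind might look roughly like the sketch below. Everything here is an assumption for illustration: `EvalCase`, `run_evals`, the 0.8 pass threshold, and especially `line_overlap_judge`, which is a deterministic toy that stands in for the LLM judge so the example can run without a model.

```python
from dataclasses import dataclass
from typing import Callable

# A judge takes (candidate_patch, golden_patch) and returns a score in [0, 1].
# In a real setup this would call an LLM; here it is an injected callable.
Judge = Callable[[str, str], float]


@dataclass
class EvalCase:
    name: str
    starter_project: str  # path or identifier of the starter project
    golden_patch: str     # human-written reference patch


def run_evals(cases, run_agent: Callable[[str], str], judge: Judge,
              threshold: float = 0.8) -> dict:
    """Run the agent on every case and score its patch against the golden patch."""
    results = {}
    for case in cases:
        candidate = run_agent(case.starter_project)
        score = judge(candidate, case.golden_patch)
        results[case.name] = {"score": score, "passed": score >= threshold}
    return results


def line_overlap_judge(candidate: str, golden: str) -> float:
    """Toy judge: Jaccard overlap of patch lines. A placeholder for an LLM call."""
    a, b = set(candidate.splitlines()), set(golden.splitlines())
    return len(a & b) / len(a | b) if a | b else 1.0
```

Passing the agent runner and the judge in as callables keeps the harness testable: the same loop can run with a stub judge in CI and with a model-backed judge for real scoring.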
In general, we try to:
- minimize diff, so that the instrumentation is easy to review
- make small chunks of additive diffs vs huge indents / moving logic around
- minimize new dependencies
- use well-supported and audited OTel SDKs vs custom libs
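One way to check the "additive diff" goal mechanically is to count removed lines between the original and the instrumented file. The sketch below is an assumption about how such a check could be written, not a description of Superlog's tooling; `diff_stats` and `is_additive` are hypothetical helpers built on the standard library's `difflib`.

```python
import difflib


def diff_stats(before: str, after: str) -> dict:
    """Count added and removed lines between two versions of a file."""
    a, b = before.splitlines(), after.splitlines()
    added = removed = 0
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return {"added": added, "removed": removed}


def is_additive(before: str, after: str, max_removed: int = 0) -> bool:
    """True when the change only adds lines (up to a small removal budget)."""
    return diff_stats(before, after)["removed"] <= max_removed


before = "def handler(req):\n    return process(req)\n"
# Adding a log line touches nothing that already exists.
additive = "def handler(req):\n    log.info('handling request')\n    return process(req)\n"
# Wrapping the body in a context manager re-indents (and so rewrites) existing lines.
reindented = "def handler(req):\n    with span():\n        return process(req)\n"
```

Here `is_additive(before, additive)` holds while `is_additive(before, reindented)` does not, which is the difference between an easy-to-review insertion and a change that moves existing logic.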
You can read the skills here: https://github.com/superloglabs/skills.
I'll make sure to add this to our landing page and print it out as the agent writes the code!
Thank you for the feedback!
on your pricing page:
> Start with one repo. Price the rest when the signal is real.
which makes it sound like possibly the $150/mo price is per-repo?
I think that could use some clarification - if I have 10 services in a monorepo vs 10 individual service repos, does that 10x my cost?
Very good point, thank you! Let me remove this phrase, you're right, it's misleading.
The pricing is only by usage (traces/logs/metrics) and investigation credits. We don't charge extra for repos :)
This is a very interesting idea and I'm excited to see where this goes. Congrats!
Thank you! :)
I would love to try it, but I got stuck when it asked for Slack since I don't use that.
Hm, sorry about that!
I made the Slack onboarding step mandatory for now since we thought that a lot of our value was in sending investigations and PRs, and Slack was what we used ourselves.
What tool do you use for communication around your project? If you don't want to share publicly, could you please shoot a line to:
ash [at] superlog.sh?
Would love to learn about your use case in more detail too!
Lots of places use various Slack alternatives or Teams/Google.
For my current project I would use webhooks/email just like I do currently for my monitoring and alerting.
I don’t use Slack either. What about solo indie founders who don’t use “team communication”?
Got it! What channel would you prefer instead? Would Telegram/WhatsApp/Signal/iMessage be good?
The platform itself doesn't need Slack to function, we just observed that users got more value if they could get notifications somehow, so I'm more than happy to add more comms platforms :)
A typical issue I have seen with LLMs/agents is that they tend to be reactive in their fixes: they "patch" the symptom rather than "fix" the root cause. Interested to see how you solve this problem.
You're right! It's a big issue and I don't think there's a silver bullet.
We have an eval suite with code+telemetry fixtures, golden RCAs and patches, and an LLM-as-a-judge. Whenever we get feedback from our users and they're OK with it, we use that feedback to create an eval case (it's still quite manual since you have to calibrate the case).
We use Superlog to observe Superlog, so I often extract cases from our own errors. The PRs get better and better, but, of course, it's sort of a continuous improvement process.
Any plans for an on-prem version?
What's your moat?
Sorry to be crude, but this sounds either dead on arrival, or at least needing a pivot, or a rephrasing of the pitch:
The moment something changes the system, it no longer observes it; in fact, observing something might cause it to change (https://en.wikipedia.org/wiki/Observer_effect_(physics)).
Either it's a tool for observing or it's a tool for fixing issues, it cannot be both, by physical principle.
Best case scenario here is that the product succeeds, and then you need to instrument the product itself in order to observe it, like debugging the debugger. But it wouldn't be an observability tool, it would shift the product that needs to be observed from the previous source code that is now a target language into the new source code that is now your product.
How does a grep or read affect the observing system?
I guess the change in voltages, arrangement of registers, filling of buffers in the network stack are changing but... what?