Show HN: Webctl – Browser automation for agents based on CLI instead of MCP

(github.com)

57 points | by cosinusalpha 11 hours ago

17 comments

At this point I'm fully down the path of the agent just maintaining its own tools. I have a browser skill that continues to evolve as I use it. Beats every alternative I have tried so far.

kinduff an hour ago

What's the name of the skill?

randito 3 hours ago

If you look at the Elixir keynote for Phoenix.new, a cool agentic coding tool, you'll see some hints about browser control using an API tool call. It's called "web" in the video.

Video: https://youtu.be/ojL_VHc4gLk?t=2132

More discussion: https://simonwillison.net/2025/Jun/23/phoenix-new/

binalpatel 4 hours ago

Cool to see lots of people independently come to "CLIs are all you need". I'm still not sure if it's a short-term band-aid because agents are so good at terminal use, or if it's part of a longer-term trend, but it's definitely felt much more seamless to me than MCPs.

(my contribution, one of many: https://github.com/caesarnine/binsmith)

cosinusalpha 2 hours ago

I am also not sure if MCP will eventually be fixed to allow more control over context, or if the CLI approach really is the future for Agentic AI.

Nevertheless, I prefer the CLI for other reasons: it is built for humans and is much easier to debug.

0x696C6961 2 hours ago

MCP lets you hide secrets from the LLM

pylotlight 2 hours ago

You can do the same thing with a CLI via env vars, no?
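
A minimal sketch of that idea, assuming a CLI that reads its credential from an environment variable. The wrapper sets the secret only in the child process's environment, so the value never appears in the command text the model writes or reads. The variable name `SERVICE_TOKEN`, the helper `run_with_secret`, and the stand-in child command are all made up for illustration.

```python
import os
import subprocess
import sys


def run_with_secret(argv, secret_name, secret_value):
    """Run a command with a secret present only in the child's environment."""
    env = dict(os.environ)            # copy, so the parent environment is untouched
    env[secret_name] = secret_value   # the secret travels via env, not via argv
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in for a real CLI: a child process that only reports whether the
# (hypothetical) SERVICE_TOKEN variable is set, without echoing its value.
child = [
    sys.executable,
    "-c",
    "import os; print('token set:', 'SERVICE_TOKEN' in os.environ)",
]

# In practice the value would come from a secret store the agent cannot read.
output = run_with_secret(child, "SERVICE_TOKEN", "s3cr3t")
print(output.strip())
```

The agent only ever sees the argv list and the command's output, so the secret stays out of its context as long as the tool itself does not print it.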

desireco42 2 hours ago

Hey this looks cool. So each agent or session is one thread. Nice. I like it.

renegat0x0 4 hours ago

A little bit different, but it also allows efficient scraping. It communicates via JSON over HTTP rather than a CLI.

https://github.com/rumca-js/crawler-buddy

More like a framework for other mechanisms

philipbjorge 5 hours ago

This looks remarkably similar to https://github.com/vercel-labs/agent-browser

How is it different?

cosinusalpha 2 hours ago

To be honest, I hadn't seen that one yet!

The main difference is likely the targeting philosophy. webctl relies heavily on ARIA roles/semantics (e.g. role=button name="Save") rather than injected IDs or CSS selectors. I find this makes the automation much more robust to UI changes.
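
A toy sketch of why role/name targeting tends to survive UI changes. This is not webctl's actual implementation: the accessibility tree is modeled as plain nested dicts, and `find_by_role` is a hypothetical helper written for this example.

```python
def find_by_role(node, role, name):
    """Depth-first search for nodes with a matching ARIA role and accessible name."""
    matches = []
    if node.get("role") == role and node.get("name") == name:
        matches.append(node)
    for child in node.get("children", []):
        matches.extend(find_by_role(child, role, name))
    return matches


# Two versions of the same page: ids and nesting changed, semantics did not.
page_v1 = {"role": "main", "children": [
    {"role": "button", "name": "Save", "id": "btn-42"},
    {"role": "button", "name": "Cancel", "id": "btn-43"},
]}
page_v2 = {"role": "main", "children": [
    {"role": "form", "name": "Settings", "children": [
        {"role": "button", "name": "Save", "id": "save-primary"},
    ]},
]}

# The same query (role=button name="Save") resolves on both versions,
# whereas a selector pinned to the id "btn-42" would break on page_v2.
hits_v1 = find_by_role(page_v1, "button", "Save")
hits_v2 = find_by_role(page_v2, "button", "Save")
print(hits_v1[0]["id"], hits_v2[0]["id"])
```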

Also, I went with Python for V1 simply for iteration speed and ecosystem integration. I'd love to rewrite in Rust eventually, but Python was the most efficient way to get a stable tool working for my specific use case.

hugs 4 hours ago

vibium clicker, too. https://github.com/VibiumDev/vibium/blob/main/CONTRIBUTING.m...

"browser automation for ai agents" is a popular idea these days.

grigio 3 hours ago

Is there a benchmark? There are a lot of scraping agents nowadays.

cosinusalpha 2 hours ago

I don't have an objective benchmark yet. I tried several existing solutions, especially the MCP servers for browser automation, and none of them were able to reproducibly solve my specific task.

An objective benchmark is a great idea, especially to compare webctl against other similar CLI-based tools. I'll definitely look into how to set that up.

desireco42 2 hours ago

How are you holding the session if every command is issued through the CLI? I assume this is essential for automation.

cosinusalpha 2 hours ago

A background daemon holds the session state between different CLI calls. This daemon is started automatically on the first webctl call and auto-closes after a timeout period of inactivity to save resources.
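
A rough sketch of the idle-timeout part of that pattern, not the actual webctl code. The class `IdleSessionHolder`, the 300-second timeout, and the stored `url` key are assumptions for illustration. A real daemon would listen on a socket and poll `should_shut_down()`; here a simulated clock keeps the example instant.

```python
import time


class IdleSessionHolder:
    """Keeps session state until no command has arrived for `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.state = {}              # e.g. current URL or cookies, kept between calls
        self.last_used = clock()

    def handle(self, key, value=None):
        """Process one CLI call: optionally store a value, then return the stored one."""
        self.last_used = self.clock()    # any call resets the idle timer
        if value is not None:
            self.state[key] = value
        return self.state.get(key)

    def should_shut_down(self):
        """True once the holder has been idle for at least `timeout` seconds."""
        return self.clock() - self.last_used >= self.timeout


# Simulated clock so the example does not actually wait.
now = [0.0]
holder = IdleSessionHolder(timeout=300, clock=lambda: now[0])

holder.handle("url", "https://example.com")   # first CLI call sets state
now[0] = 120.0
url_seen = holder.handle("url")               # a later CLI call sees the same state
alive = not holder.should_shut_down()         # still within the idle window
now[0] = 120.0 + 300.0
expired = holder.should_shut_down()           # idle for the full timeout
print(url_seen, alive, expired)
```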

desireco42 an hour ago

I see, nice. Is there a way to run multiple sessions?