Looks like there's one feature missing from this that I care about: I'd like finer-grained control over what outbound internet connections code running on the box can make.
As far as I can tell it's all or nothing right now:
I want to run untrusted code (from users or LLMs) in these containers, and I'd like to avoid someone malicious using my container to launch attacks against other sites. As such, I'd like to be able to allow-list just specific network endpoints. Maybe I'm OK with the container talking to an API I provide but not to the world at large. Or perhaps I'm OK with it fetching data from npm and PyPI but I don't want it to be able to access anything else (a common pattern these days, e.g. Claude's Code Interpreter does this).
Cloudflare has Outbound Workers for exactly this use-case: https://developers.cloudflare.com/cloudflare-for-platforms/w...
If these aren't enabled for containers / sandboxes yet, I bet they will be soon
You may be interested in the Dynamic Worker Loader API, which lets you set up isolate-based sandboxes (instead of containers) and gives you extremely fine-grained, object-capability-based control over permissions.
It was announced as part of the code mode blog post:
https://blog.cloudflare.com/code-mode/
I’m extending the Packj sandbox for agentic code execution [1]. You can specify an allowlist for network/fs access.
1. https://github.com/ossillate-inc/packj/blob/main/packj/sandb...
This simple feature bumps up the complexity of such a firewall by several orders of magnitude, which is why no similar runtime (like Deno) offers it.
Networking as a whole can easily be controlled by the OS or any intermediate layer. For controlling access to specific sites you need to either filter it at the DNS level, which can be trivially bypassed, or bake something into the application binary itself. But if you are enabling untrusted code and giving that code access to a TCP channel then it is effectively impossible to restrict what it can or cannot access.
The most convincing implementation I've seen of this so far is to lock down access to just a single IP address, then run an HTTP proxy server at that IP address which can control what sites can be proxied to.
Then inject HTTP_PROXY and HTTPS_PROXY environment variables so tools running in the sandbox know what to use.
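The proxy approach described above can be sketched in a few dozen lines. This is a minimal illustration, not production code: it only handles CONNECT tunnels (the HTTPS_PROXY path), does exact hostname matching, and the allowlisted hosts and port 3128 are placeholder choices of mine:

```python
# Minimal egress-filtering forward proxy sketch. Assumes the sandbox's network
# is locked down so that only this proxy's address is reachable; the proxy then
# decides which hostnames the sandboxed code may tunnel to.
import socket
import threading

# Example allowlist (my assumption: package registries only).
ALLOWED_HOSTS = {"registry.npmjs.org", "pypi.org", "files.pythonhosted.org"}

def is_allowed(host: str) -> bool:
    """Exact, case-insensitive hostname match against the allowlist."""
    return host.lower() in ALLOWED_HOSTS

def handle(client: socket.socket) -> None:
    # Read the request line, e.g. "CONNECT pypi.org:443 HTTP/1.1".
    request = client.recv(65536).decode("latin-1", errors="replace")
    parts = request.split("\r\n", 1)[0].split()
    if len(parts) != 3 or parts[0] != "CONNECT":
        # Only CONNECT is handled in this sketch; plain-HTTP proxying is omitted.
        client.sendall(b"HTTP/1.1 400 Bad Request\r\n\r\n")
        client.close()
        return
    host, _, port = parts[1].partition(":")
    if not is_allowed(host):
        client.sendall(b"HTTP/1.1 403 Forbidden\r\n\r\n")
        client.close()
        return
    try:
        upstream = socket.create_connection((host, int(port or "443")), timeout=10)
    except OSError:
        client.sendall(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
        client.close()
        return
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")

    def pump(src: socket.socket, dst: socket.socket) -> None:
        # Copy bytes one way until either side closes.
        try:
            while (data := src.recv(65536)):
                dst.sendall(data)
        except OSError:
            pass
        finally:
            src.close()
            dst.close()

    threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
    pump(upstream, client)

def serve(bind: str = "0.0.0.0", port: int = 3128) -> None:
    """Accept loop; call this to start the proxy."""
    with socket.socket() as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((bind, port))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()
```

Inside the sandbox you'd then inject HTTP_PROXY/HTTPS_PROXY pointing at this address and block all other egress at the network layer; in practice you'd more likely reach for an existing proxy like Squid or Envoy than roll your own.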
That’s true, but Cloudflare is uniquely positioned to avoid this complexity by leveraging the functionality of all their existing products. For them, sandboxing the network is probably the easiest problem to solve for this product…
At least on macOS, there is a third way where you can control the network connection on the PID/binary level by setting up a network system extension and then setting up a content filter so you can allow/deny requests. It is pretty trivial to set this up, but the real challenge is usually in how you want to express your rules.
Little Snitch does this pretty well: https://www.obdev.at/products/littlesnitch/index.html
Deno does support per-host network permissions: https://docs.deno.com/runtime/fundamentals/security/#network...
You cannot bypass DNS within Cloudflare’s environment.
What does that mean? That's essentially like saying "you cannot bypass HTTP" within Cloudflare's environment. It doesn't make any sense.
Do you mean they force you to use their DNS? What about DOH(s)? What about just skipping domain lookup entirely and using a raw IP address?
If anyone is curious, more details on our SDK can be found here actually https://github.com/cloudflare/sandbox-sdk
Mind answering the question here: https://news.ycombinator.com/item?id=45611301 ?
Is there some sort of competition for awful looking websites going on?
This bizarre anti-aesthetic has been pushed in the web devex space for a few years now to appeal to other web devex companies.
I thought it was cute and easy to read.
They didn't test it with FF apparently.
Looks perfectly fine in FF 144.0 on Mac OS.
Looks nice.
We rolled out our own that does pretty much the same thing, and perhaps more, since our solution can also mount persistent storage that can be carried between multiple runners. It does take 1-5 seconds to boot the environment (Firecracker VMs). If this sandbox is faster I will instruct the team to consider it for fast startup.
This is also very similar to Vercel's sandbox thing. The same technology?
What I don't like about this approach is the github repo bootstrap setup. Is it more convenient compared to docker images pushed to some registry? Perhaps. But docker benefits from having all the artefacts prebuilt in advance, which in our case is quite a bit.
1-5 seconds seems high for Firecracker, depending on your requirements.
We boot VMs (using Firecracker) at ~20-50ms.
Obviously, depending on the base image/overlay/etc., your system might need to fetch resources at boot, making it network-bound, but based on what you've said it seems you should be able to make your system much faster!
> It does take 1-5 seconds to boot the environment (firecracker vms).
I'd say 1-5 secs is fast. Curious to know what use cases require faster boot up, and today suffer from this latency?
When your agent performs 20 tasks, saving seconds here and there becomes a very big deal. I cannot even begin to describe how much time we've spent on optimising code paths to make the overall execution fast.
Last week I was on a call with a customer. They were running OpenAI side-by-side with our solution. I was pleased that we managed to fulfil the request in under a minute while OpenAI took 4.5 minutes.
The LLM is not the biggest contributor to latency in my opinion.
Thanks! While I agree with you on the "saving seconds" and overall-latency argument, my understanding is that most agentic use cases are asynchronous, and VM boot-up time may be just a tiny fraction of overall task execution time (e.g., deep research and similar long-running background tasks).
I browsed through the documents but it does not seem to be possible to auto-destroy a sandbox after a certain amount of idle time. This forces whoever is implementing this to do their own cleanup. It is kind of a missed opportunity if you ask me, as this is a big pain. It is sold as fire-and-forget, but it seems that more serious workflows will also require a lot of supporting infrastructure.
You can easily set an alarm in the durable object to check if it should be killed and then call destroy yourself. Just a couple lines of code.
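The alarm-based cleanup the parent describes boils down to simple bookkeeping. Here's a language-neutral sketch of that logic in Python; the class name, the 10-minute timeout, and the injectable clock are all my own, and in the actual Workers setup this logic would live in a Durable Object's alarm() handler in TypeScript:

```python
import time

IDLE_TIMEOUT = 10 * 60  # destroy after 10 minutes without activity (assumed policy)

class IdleReaper:
    """Tracks last-use time; a periodic alarm calls maybe_destroy()."""

    def __init__(self, destroy_fn, timeout=IDLE_TIMEOUT, clock=time.monotonic):
        self._destroy = destroy_fn      # e.g. a closure calling sandbox.destroy()
        self._timeout = timeout
        self._clock = clock             # injectable for testing
        self._last_used = clock()
        self._destroyed = False

    def touch(self) -> None:
        """Call on every request routed to the sandbox."""
        self._last_used = self._clock()

    def maybe_destroy(self) -> bool:
        """Call from the periodic alarm; destroys once the idle window elapses."""
        if not self._destroyed and self._clock() - self._last_used >= self._timeout:
            self._destroy()
            self._destroyed = True
        return self._destroyed
```

The pattern is: touch() on each request, schedule the next alarm from the alarm handler, and tear down once maybe_destroy() reports the sandbox has been idle long enough.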
Nice. Thanks for the tip. I did not know that this was a thing. I will look it up.
There is an open question about how file persistence works.
The docs claim they persist the filesystem even when they move the container to an idle state, but it's unclear exactly what that means - https://github.com/cloudflare/sandbox-sdk/issues/102
To me, the docs answer it pretty clearly. The defined directories persist until you destroy().
The part that's unclear to me is how billing works for a sandbox's disk that's asleep, because container disks are ephemeral and don't survive sleep[2] but the sandbox pricing points you to containers which says "Charges stop after the container instance goes to sleep".
https://developers.cloudflare.com/sandbox/concepts/sandboxes...
https://developers.cloudflare.com/sandbox/concepts/sandboxes...
[2] https://developers.cloudflare.com/containers/faq/#is-disk-pe...
Yeah, that's basically the issue. If container disks are ephemeral, how are they persisting it? And however they are doing it, what's the billing for it?
This looks interesting.
Instead of having to code this up using TypeScript, is there an MCP server or API endpoint I can use?
Basically, I want to connect an MCP server to an agent and tell it it can run TypeScript code in order to solve a problem or verify something.
Cloudflare Containers (and therefore Sandbox) pricing is way too expensive. The pricing is also cumbersome to understand: it is inconsistent with other Cloudflare products in terms of units, and it is split between memory, CPU and disk instead of combined per instance. The worst part is that it is given in these tiny fractions per second:
Memory: $0.0000025 per additional GiB-second
vCPU: $0.000020 per additional vCPU-second
Disk: $0.00000007 per additional GB-second
The smaller instance types have very low processing power, getting only a fraction of a vCPU. But if you calculate the monthly cost, it comes to:
Memory: $6.48 per GB
vCPU: $51.84 per vCPU (!!!)
Disk: $0.18 per GB
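Sanity-checking the arithmetic: the per-second rates above do multiply out to those monthly figures for an always-on instance, assuming a 30-day month:

```python
# Convert Cloudflare's published per-second unit prices to monthly figures.
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month

per_second = {
    "memory (GiB)": 0.0000025,
    "vCPU": 0.000020,
    "disk (GB)": 0.00000007,
}

monthly = {unit: round(rate * SECONDS_PER_MONTH, 2) for unit, rate in per_second.items()}
print(monthly)  # {'memory (GiB)': 6.48, 'vCPU': 51.84, 'disk (GB)': 0.18}
```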
These prices are more expensive than the already expensive prices of the big cloud providers. For example a t2d-standard-2 on GCP with 2 vCPUs and 8GB with 16GB storage would cost $63.28 per month while the standard-3 instance on CF would cost a whopping $51.84 + $103.68 + $2.90 = $158.42, about 2.5x the price.
Cloudflare Containers also don't have persistent storage and are by design intended to shut down when not used. But then I could also go for a spot VM on GCP, which would bring the price down to $9.27, less than 6% of the CF container cost, and I'd get persistent storage plus a ton of other features on top.
What am I missing?
It doesn't really make sense to compare this to regular VM pricing I think.
This is an on-demand managed container service with a convenient API, logging, global placement in 300+ locations, ...
AWS Lambda is probably closer in terms of product match. (sans the autoscaling)
Depending on what you do, Sandbox could be roughly on par with Lambda, or considerably cheaper.
The 1 TB of included egress alone would be something like $90 on AWS.
Of course on lambda you pay per request. But you also apparently pay for Cloudflare Worker requests with Sandbox...
I reckon ... it's complicated.
Cloudflare Containers feel a lot more pricey compared to Workers, but I think they could provide a more streamlined experience. Still, if we are talking about a complete cost analysis, I sometimes wonder how CF Containers vs. Workers vs. Hetzner/dedicated/shared VPS/GCP etc. would work out for the same thing.
Honestly, the more I think about it, for my own sanity I want to use Hetzner (or similar) for the Golang/other binary-related stuff, and CF Workers with SvelteKit for the frontend.
That way we could have the best of both worlds and probably glue things together using protobuf or something. I guess people don't like managing two codebases, but I think SvelteKit is a pleasure to work with and can be learnt by anybody in 3-4 weeks, maybe some more for Golang. I might look more into CF Containers/GCP or whatever, but my heart wants Hetzner for the backend with Golang if need be, while I try to extract as much juice as I can out of CF Workers with SvelteKit in the meanwhile.
Thoughts on my stack?
Startups building on top of big tech are likely to add their own margins. Have you looked into (bulk) discounts from GCP/AWS?
Does this relate to workerd in any way or is it something else entirely?
How much `power` do they have?
These CF website relaunches are just that, right? Workers last week (https://workers.cloudflare.com) and now this one yesterday. I mean, if CF has something newsworthy here they should do a blog post announcing it, because otherwise it's just a refreshed website. It's hard to tell if there's anything new here.
It's the same SDK stuff from earlier this year right? https://developers.cloudflare.com/changelog/2025-06-24-annou...
There’s also the changelog https://developers.cloudflare.com/changelog/
It barely had any features then; this version is full of new functionality: streaming logs, long-running processes, a code interpreter and lots of other things, plus a full docs site as well.