Apparently, protecting the API is not planned: https://github.com/ollama/ollama/issues/849
For my own purposes I either restrict ollama's ports in the firewall, or I put some proxy in front of it that blocks access if some header with some predefined API key is not present. Kind of clunky, but it works.
That is unfortunate. Not because I think they should have to, but because they eventually will have to if it gets big enough. Never underestimate the ability of your users to hold it wrong.
The default install only binds to loopback, so I am sure it is pretty common to just slap OLLAMA_HOST=0.0.0.0 and move on to other things. I know I did at first, but my host isn't publicly routable and I went back the same night and added IPAddressDeny/Allow rules (among other standard/easy hardening).
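For reference, a minimal sketch of that kind of hardening (assuming ollama runs as a systemd unit named ollama.service; the allowed subnet is a placeholder for your own LAN):

    # /etc/systemd/system/ollama.service.d/harden.conf (hypothetical drop-in)
    [Service]
    # Drop traffic from all peers, then re-allow loopback and one private subnet.
    IPAddressDeny=any
    IPAddressAllow=localhost 192.168.1.0/24
    # A couple of standard, low-effort hardening directives.
    NoNewPrivileges=yes
    ProtectHome=yes

Then systemctl daemon-reload && systemctl restart ollama to apply.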
Yeah it’s a pretty crazy decision to be honest. Flashbacks to MongoDB and ElasticSearch’s early days.
Fortunately it’s an easy fix. Just front it with nginx or caddy and expect a bearer token (that would be your API key).
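As a hedged sketch of the nginx side (the hostname, certificate paths, and token value are all placeholders, nothing Ollama itself defines):

    server {
        listen 443 ssl;
        server_name llm.example.com;
        ssl_certificate     /etc/nginx/certs/llm.pem;
        ssl_certificate_key /etc/nginx/certs/llm.key;

        location / {
            # Reject anything that doesn't carry the expected bearer token.
            if ($http_authorization != "Bearer CHANGE-ME-long-random-value") {
                return 401;
            }
            proxy_pass http://127.0.0.1:11434;
        }
    }

Clients then send Authorization: Bearer CHANGE-ME-long-random-value, exactly the way they'd send an API key to a hosted provider.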
Early MongoDB adopter here who still likes it. If your internal services are accessible from outside, you are doing it wrong. Neither MongoDB, ES, nor ollama is a service my applications would access through a public IP, and whenever a dev asks me for access to the DB from the comfort of their home office, I tell them which VPN to log into.
Even if those services had some access protection, I simply must assume that the service has some security hole that allows unauthorized access, and the first line of defense against that is not having it on the public internet.
Tell that to the kids at my high school in 2004 screwing with all the unprotected services across the whole school district's network.
Or the worms that scan for vulnerable services and install persistent threats.
If you want to remove the password on a service, that’s your choice. The default should have a password though and then people can decide.
Decide what? Slapping a simple, naive login screen on top of a service that was never designed to fend off attacks from untrusted networks doesn't fix the actual issue, which is the fact that an administrator exercised bad judgement and made it accessible to untrusted networks.
On the flipside, you can also argue that if you are relying on network access to protect your internal services, you are doing it wrong. If the only thing you need to take over a service is access to its internal network, you are setting yourself up to be owned.
Yes but nobody is stopping you from adding your own proxy which enforces any type of authentication you like, and in my opinion that's the more sensible approach here anyway.
I don't think it's sensible to expect every project like Ollama to ship their own half-broken authentication and especially anything resembling a "zero trust" implementation. You can easily front Ollama with a reverse proxy which does those things if you'd like. Each component should do one thing well.
I trust Nginx to verify client certificates correctly so I can be confident that only traffic from trusted users is able to reach whatever insecure POS is hiding behind it.
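A minimal sketch of that pattern (cert paths are placeholders): nginx refuses any client that doesn't present a certificate signed by your CA, so the backend never sees untrusted traffic.

    server {
        listen 443 ssl;
        ssl_certificate        /etc/nginx/certs/server.pem;
        ssl_certificate_key    /etc/nginx/certs/server.key;
        # Only clients presenting a cert signed by this CA get through.
        ssl_client_certificate /etc/nginx/certs/client-ca.pem;
        ssl_verify_client      on;

        location / {
            proxy_pass http://127.0.0.1:11434;
        }
    }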
You are assuming the only threats can come from outside.
Defense in depth is essential in an age of unreliable software supply chains.
I would say it is a reasonable decision, as fronting it with a proxy is quite a good approach. Unfortunately, lots of non-tech people want to “just run it”.
You can easily protect the API with nginx basic auth.
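For example, a minimal sketch (the htpasswd path and username are placeholders):

    # Create credentials once with: htpasswd -c /etc/nginx/.htpasswd someuser
    location / {
        auth_basic           "ollama";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:11434;
    }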
I don’t think a proxy is clunky. I would expect that to be quite a fine solution.
Problem is, people don’t know that it’s a good solution.
I’d expect Cisco to publish an article on thousands of Cisco devices with default passwords still there in the open.
Definitely not credible to speak about ML stuff, and of course Ollama has never been production-ready in the sense Cisco’s IOS was.
Cisco does more than just sell equipment. Seeing this from their “threat intelligence research organization” shouldn’t be any more surprising than seeing the same from Google via Mandiant.
How is it Cisco’s fault that a lot of network administrators are incompetent and don’t change default passwords?
Having default passwords that users are not forced to change, on a product designed to be connected to a network, is incomprehensibly incompetent for any product produced in the last 25 years.
If you need to be forced to change the default password on Cisco products you probably shouldn’t be using them.
Can't you just flip that and say that if you need there to be a default password, you shouldn't be using a Cisco product? And if nobody using a Cisco product needs a default password, then why does one exist at all?
Not allowing default passwords is for the greater good. Making it harder to install is a feature in this case.
Cisco is incredibly (in)famous for having hardcoded backdoor accounts in their products.
By forcing them to change the defaults, like Ubiquiti does, for instance.
Yes
I can think of no reason to be surprised by this, except that Cisco is the one reporting it. That part is surprising.
My exact thoughts. Very bad form by Cisco.
Shodan also has built-in detection for some of them. For example, you can search for "product:ollama" (https://www.shodan.io/search?query=product%3Aollama). Or if you have access to the tag filter then simply "tag:ai" (https://www.shodan.io/search/report?query=tag%3Aai).
Similarly, a lot of projects using gradio come with a tunnel/public proxy enabled out of the box, i.e. instantly publicly accessible just by running it. It sits behind a long, unique, UUID-looking URL, which provides some measure of security by obscurity, but wow, I was still surprised the first time I saw that.
Must be a good time to be in the security space with this sort of stuff, plus the inevitable vibe-code security carnage.
> Behind a long unique uuid looking url which provides some measure of security by obscurity
That's not security by obscurity.
If the "uuid looking" part is generated using a csprng and has enough entropy, it has the same security properties as any other secret.
There's other issues with having the secret in the URL.
Not when the user leaks their DNS query it doesn't. Those endpoints must be one of the dumbest "vibe security" ideas I've literally ever heard of.
>each identified endpoint is programmatically queried to assess its security posture, with a particular focus on authentication and authorization mechanisms.
I know it's commonplace, but is this unauthorized access in terms of the CMA (UK) or CFAA (USA)?
The article itself appears to be largely AI-edited. And I'm really surprised that anyone would want to write an article on this, I assumed it was widely known? You can go onto Censys and find thousands of exposed instances for lots of self-hostable software, for LLM there are exposed instances of things like kobold, for image gen there's sd-webui, InvokeAI and more.
Why are people running ollama on public servers?
Is this thanks to everyone thinking they can code now and not understanding what they’re doing?
Make it make sense.
This has nothing to do with "everyone thinking they can code now", come on! People aren't asking cc to set up their cloud instances of ollama; they're likely getting a c/p line from a tutorial, just like they've always done.
What's likely happening here is that people are renting VMs and copy/pasting some one-line docker-compose up thing from a tutorial. And because it's a tutorial and people can't be bothered to tunnel their own traffic, most likely those tutorials are binding on 0.0.0.0.
Plenty of ways to footgun yourself with c/p something from a tutorial, even if you somewhat know what you're doing. No need to bring "everyone thinking they can code" into this. This is a tale as old as the Internet.
Another thing is that docker, being the helpful little thing that it is, in its default config will alter your firewall and open up ports even if you have a rule to drop everything you're not specifically using. So, yeah. That's probably what's happening.
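A sketch of the difference in a hypothetical docker-compose.yml (image and port are the usual ollama defaults):

    services:
      ollama:
        image: ollama/ollama
        ports:
          # Typical tutorial line: publishes on all interfaces, and Docker
          # rewrites iptables to allow it, bypassing ufw-style drop rules.
          # - "11434:11434"
          # Safer: bind the published port to loopback only.
          - "127.0.0.1:11434:11434"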
I understand the concern here but isn't this the same as making any other type of server public? This is just regarding servers hosting LLMs, which I wouldn't even consider a huge security concern vs hosting a should-be-internal tool publicly.
Servers that shouldn't be made public are made public, a cyber tale as old as time.
> servers hosting LLMs, which I wouldn't even consider a huge security concern
The new problem is if the LLMs are connected to tooling.
There's been plenty of examples showing that with subtle changes to the prompt you can jailbreak the LLM to execute tooling in wildly different ways from what was intended.
They're trying to paper over this by having the LLM call regular code, just so they can be sure all steps of the workflow are actually executed reliably every time.
Even the same prompt can give different results depending on the temperature used. How security teams are able to sign these things off is beyond me.
The tools are client side operations in Ollama, so I don't see a way an attacker could use that to their benefit, except to leverage the actual computing power the server provides.
The stakes aren’t that high yet for Ollama to warrant cumbersome auth mechanisms.
If any MCP servers are running, anyone with access to query the chat endpoint can use them. That could include file system access, GitHub tokens and more.
ollama can't connect to MCP servers; it can merely run models which output instructions back to a connected system to connect to an MCP server (e.g. mcphost using ollama to run a prompt and then itself connecting to an MCP server if the response requires it).
The LLM endpoint via ollama or huggingface is not the one executing MCP tool calls, that is on behalf of the client that is interacting with the LLM. All the LLM does is take input as a prompt and produce a text output, that's it. Anything else is just a wrapper.
That is completely false; ollama has nothing to do with running commands, it just processes prompts into text responses.
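To make the split concrete, here is a hedged Python sketch against Ollama's /api/chat tool-calling interface (the model name and the get_weather tool are placeholders): the server only returns a description of a tool call, and nothing runs unless the client code chooses to run it.

    import json
    import urllib.request

    # Ask a tool-capable model a question, advertising one (hypothetical) tool.
    req = urllib.request.Request(
        "http://127.0.0.1:11434/api/chat",
        data=json.dumps({
            "model": "llama3.2",  # assumption: any tool-capable model works here
            "messages": [{"role": "user", "content": "Weather in Paris?"}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    resp = json.load(urllib.request.urlopen(req))

    # The response is just data describing what the model *wants* called;
    # executing it (or not) is entirely up to this client.
    for call in resp["message"].get("tool_calls", []):
        fn = call["function"]
        print(f"model requested: {fn['name']}({fn['arguments']})")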
Yeah, I don't think most people who even run ollama would care. "Oh no, someone found my exposed instance, which means my computer in my bedroom is burning electricity for the past few hours. Oh well, I lost a few pennies in electricity." Shuts down Ollama on the computer.
Seriously, this is extremely mild as far as issues go. There's basically no incentive to fix this problem, because I bet even the people who lost a few pennies of electricity would still prefer the convenience of ollama not having auth.
Plus, that's the worst case scenario, in real life even if some black hat found an exposed ollama service, they have no interest in generating tokens for <insert random LLM here at 4 bit quant> at a slow speed of <50tok/sec.
If you think that's the worst case scenario you're in no position to be making security-related decisions. That line of thinking hinges on a very dangerous assumption that Ollama doesn't have any critical security vulnerabilities [1].
Don't expose services to the public internet unless they have been battle hardened to be exposed to the public internet, e.g. Nginx as an authenticating reverse proxy.
[1] https://github.com/advisories/GHSA-vq2g-prvr-rgr4
In general, Go programs are quite secure against the remote-code-execution class of attacks.
Even this one would be remedied by not running ollama as root and not having its binaries owned by the user it runs as (though overwriting executables/libraries that are mmapped as executables is usually not possible), which I would hope is the standard mode of its setup.
I don't know why you would say that about Go; you're never more than one programming error away from creating an RCE vulnerability, no matter the language. The linked RCE should demonstrate that quite clearly, don't you think?
Either way my point is that software contains vulnerabilities, especially software that hasn't been hardened to be exposed to the public internet. Exposing it to the public internet anyway is a display of bad judgement, doubly so when the person responsible seems to believe that the worst thing that can happen is someone using the software as intended. Details of specific vulnerabilities are really beside the point here.
Assuming that the happy path is the worst that can happen is simply naive, there's no two ways about it.
As I understand it, the overwhelming majority of CVEs over the history of computing have been due to buffer overflows or use-after-free. If you leave out those vectors, you might actually be pretty close to having an RCE-free piece of software.
But sure, it's always possible to be more innovative about how to go about enabling RCEs, as the log4j case demonstrates.
Is that agency over yourself called vibe living?
That is assuming you cannot exploit the server to get access to the machine...
largely the fault of n8n
Cheap (almost free) highly parallel inference. Nice!
Free inference. Yay
How many people are using Ollama in production, though?
"Our study uncovered over 1,100 exposed Ollama servers, with approximately 20% actively hosting models susceptible to unauthorized access."
So at least 1,100, with roughly 220 of those (the ~20%) actively hosting models open to unauthorized access.
I'm surprised Shodan is legal. Just because someone made a mistake when setting up their network doesn't mean you're authorized.
Shodan? Like from system shock?
That's where the name comes from. It's a search engine for finding servers exposed to the public.
Another great use of a personal VPN - I work at https://www.defined.net (which uses Nebula as the underlying VPN technology) and also personally use our free tier (up to 100 hosts) for everything. Having my Ollama instances available only over my VPN overlay network is very slick.
Ollama has no auth mechanism by default... You have to wonder why they never focused on that
Separation of concerns?
If you deploy a power plug outside your house, is it the fault of the power plug designer if people steal your power?
Put it behind a webserver with basic auth or whatever you fancy, done.
Bad analogies are bad analogies. ollama is a server system; it should expect to connect with more than one client, and they know very well by now that this also means networked clients. If you create a server-client protocol, implementing security is your job.
Any decent router is going to block connections from internet to your local network by default. For ollama to be accessible from the outside, they had to allow it explicitly. There's no way to blame ollama for this.
Lots of servers do not, Redis for instance does not have auth by default, and IIRC did not have auth at all for a long time.
> If you create a server-client protocol, implementing security is your job.
Yes, this goes right along with the tried and true Unix philosophy: do everything, poorly. Wait what?
I cannot express how deeply wrong you are about this; a "server system" is not some mandate that it should be production ready for a ton of people on the internet.
This is a program that very different people want or need to try out that just so happens to involve a client-server architecture.
The client-server pattern is frequently used locally.
As cynical as I am, I honestly don't think there is much to wonder about here. The initial product's adoption relied on low friction and minimal setup. That they wanted to keep it going as long as possible is just an extension of this.
The dockerd TCP socket has no auth mechanism by default... You have to wonder why they never focused on that.
I don’t think it was intended for production workloads.
Should have asked an LLM to write one.
Ollama doesn't run a web server that is "broadcasting across the internet". It runs a server that is accessible locally. You have to deliberately deploy it onto a public server in order for it to be accessible from the internet.
In all cases, having zero auth at all [0], even when others want to use it as a service to broadcast across the internet, is ridiculous. It leads to problems like [1], and now they're all exposed without any protection.
Even allowing others to change the $OLLAMA_HOST env is a security footgun.
[0] https://github.com/ollama/ollama/issues/849
[1] https://www.wiz.io/blog/probllama-ollama-vulnerability-cve-2...
The idea is that you add an auth layer if that's what you want to do.
The majority of Ollama users at the moment are likely hobbyists working in single-user contexts.
For those who want to deploy it in an organizational setting, it's straightforward to put it behind a pre-existing authentication system.