Apparently, protecting the API is not planned: https://github.com/ollama/ollama/issues/849
For my own purposes I either restrict ollama's ports in the firewall, or I put some proxy in front of it that blocks access if some header with some predefined API key is not present. Kind of clunky, but it works.
That is unfortunate. Not because I think they should have to, but because they eventually will have to if it gets big enough. Never underestimate the ability of your users to hold it wrong.
The default install only binds to loopback, so I am sure it is pretty common to just slap OLLAMA_HOST=0.0.0.0 and move on to other things. I know I did at first, but my host isn't publicly routable and I went back the same night and added IPAddressDeny/Allow rules (among other standard/easy hardening).
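For reference, a minimal sketch of that kind of hardening (assuming ollama runs as a systemd unit named ollama.service; the allowed subnet is a placeholder for your own LAN):

    # /etc/systemd/system/ollama.service.d/harden.conf (hypothetical drop-in)
    [Service]
    # Drop traffic from all peers, then re-allow loopback and one private subnet.
    IPAddressDeny=any
    IPAddressAllow=localhost 192.168.1.0/24
    # A couple of standard, low-effort hardening directives.
    NoNewPrivileges=yes
    ProtectHome=yes

Then systemctl daemon-reload && systemctl restart ollama to apply.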
Yeah it’s a pretty crazy decision to be honest. Flashbacks to MongoDB and ElasticSearch’s early days.
Fortunately it’s an easy fix. Just front it with nginx or caddy and expect a bearer token (that would be your API key).
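As a hedged sketch of the nginx side (the hostname, certificate paths, and token value are all placeholders, nothing Ollama itself defines):

    server {
        listen 443 ssl;
        server_name llm.example.com;
        ssl_certificate     /etc/nginx/certs/llm.pem;
        ssl_certificate_key /etc/nginx/certs/llm.key;

        location / {
            # Reject anything that doesn't carry the expected bearer token.
            if ($http_authorization != "Bearer CHANGE-ME-long-random-value") {
                return 401;
            }
            proxy_pass http://127.0.0.1:11434;
        }
    }

Clients then send Authorization: Bearer CHANGE-ME-long-random-value, exactly the way they'd send an API key to a hosted provider.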
Early MongoDB adopter here who still likes it. If your internal services are accessible from outside, you are doing it wrong. Neither MongoDB, ES, nor ollama is a service my applications would access through a public IP, and whenever a dev asks me for access to the DB from the comfort of their home office, I tell them which VPN to log into.
Even if those services had some access protection, I simply must assume that the service has some security hole that allows unauthorized access, and the first line of defense against that is not having it on the public internet.
Tell that to the kids at my high school in 2004 screwing with all the unprotected services across the whole school district's network.
Or the worms that scan for vulnerable services and install persistent threats.
If you want to remove the password on a service, that’s your choice. The default should have a password though and then people can decide.
Decide what? Slapping a simple, naive login screen on top of a service that was never designed to fend off attacks from untrusted networks doesn't fix the actual issue, which is the fact that an administrator exercised bad judgement and made it accessible to untrusted networks.
On the flipside, you can also argue that if you are relying on network access to protect your internal services, you are doing it wrong. If the only thing you need to take over a service is access to its internal network, you are setting yourself up to be owned.
Yes but nobody is stopping you from adding your own proxy which enforces any type of authentication you like, and in my opinion that's the more sensible approach here anyway.
I don't think it's sensible to expect every project like Ollama to ship their own half-broken authentication and especially anything resembling a "zero trust" implementation. You can easily front Ollama with a reverse proxy which does those things if you'd like. Each component should do one thing well.
I trust Nginx to verify client certificates correctly so I can be confident that only traffic from trusted users is able to reach whatever insecure POS is hiding behind it.
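A minimal sketch of that pattern (cert paths are placeholders): nginx refuses any client that doesn't present a certificate signed by your CA, so the backend never sees untrusted traffic.

    server {
        listen 443 ssl;
        ssl_certificate        /etc/nginx/certs/server.pem;
        ssl_certificate_key    /etc/nginx/certs/server.key;
        # Only clients presenting a cert signed by this CA get through.
        ssl_client_certificate /etc/nginx/certs/client-ca.pem;
        ssl_verify_client      on;

        location / {
            proxy_pass http://127.0.0.1:11434;
        }
    }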
You are assuming the only threats can come from outside.
Defense in depth is essential in an age of unreliable software supply chains.
I would say it is a reasonable decision, as fronting it with a proxy is quite a good approach. Unfortunately, lots of non-tech people want to “just run it”.
You can easily protect the API with nginx basic auth.
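For example, a minimal sketch (the htpasswd path and username are placeholders):

    # Create credentials once with: htpasswd -c /etc/nginx/.htpasswd someuser
    location / {
        auth_basic           "ollama";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:11434;
    }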
I don’t think a proxy is clunky. I would expect that to be quite a fine solution.
Problem is, people don’t know that it’s a good solution.
I’d expect Cisco to publish an article on thousands of Cisco devices with default passwords still there in the open.
Definitely not credible to speak about ML stuff, and of course Ollama has never been production-ready in the sense Cisco’s IOS was.
Cisco does more than just sell equipment. Seeing this from their “threat intelligence research organization” shouldn’t be any more surprising than seeing the same from Google via Mandiant.
How is it Cisco’s fault that a lot of network administrators are incompetent and don’t change default passwords?
Having default passwords that users are not forced to change, on a product designed to be connected to a network, is incomprehensibly incompetent for any product produced in the last 25 years.
If you need to be forced to change the default password on Cisco products you probably shouldn’t be using them.
Can't you just flip that and say that if you need there to be a default password, you shouldn't be using a Cisco product? And if nobody using a Cisco product needs a default password, then why does one exist at all?
Not allowing default passwords is for the greater good. Making it harder to install is a feature in this case.
Cisco is incredibly (in)famous for having hardcoded backdoor accounts in their products.
By forcing them to change the defaults, like Ubiquiti does, for instance.
Yes
I can think of no reason to be surprised by this, except that Cisco is the one reporting it. That part is surprising.
My exact thoughts. Very bad form by Cisco.
Shodan also has built-in detection for some of them. For example, you can search for "product:ollama" (https://www.shodan.io/search?query=product%3Aollama). Or if you have access to the tag filter then simply "tag:ai" (https://www.shodan.io/search/report?query=tag%3Aai).
Similarly, a lot of projects using gradio come with a tunnel/public proxy enabled out of the box, i.e. instantly publicly accessible just by running it. It sits behind a long, unique, UUID-looking URL, which provides some measure of security by obscurity, but wow, I was still surprised the first time I saw that.
Must be a good time to be in the security space with this sort of stuff, plus the inevitable vibe-code security carnage.
> Behind a long unique uuid looking url which provides some measure of security by obscurity
That's not security by obscurity.
If the "uuid looking" part is generated using a csprng and has enough entropy, it has the same security properties as any other secret.
There's other issues with having the secret in the URL.
Not when the user leaks their DNS query it doesn't. Those endpoints must be one of the dumbest "vibe security" ideas I've literally ever heard of.
>each identified endpoint is programmatically queried to assess its security posture, with a particular focus on authentication and authorization mechanisms.
I know it's commonplace, but is this unauthorized access in terms of the CMA (UK) or CFAA (USA)?
The article itself appears to be largely AI-edited. And I'm really surprised that anyone would want to write an article on this, I assumed it was widely known? You can go onto Censys and find thousands of exposed instances for lots of self-hostable software, for LLM there are exposed instances of things like kobold, for image gen there's sd-webui, InvokeAI and more.
Why are people running ollama on public servers?
Is this thanks to everyone thinking they can code now and not understanding what they’re doing?
Make it make sense.
This has nothing to do with "everyone thinking they can code now", come on! People aren't asking cc to set up their cloud instances of ollama; they're likely getting a c/p line from a tutorial, just like they've always done.
What's likely happening here is that people are renting VMs and copy/pasting some one-line docker-compose up thing from a tutorial. And because it's a tutorial and people can't be bothered to tunnel their own traffic, most likely those tutorials are binding on 0.0.0.0.
Plenty of ways to footgun yourself with c/p something from a tutorial, even if you somewhat know what you're doing. No need to bring "everyone thinking they can code" into this. This is a tale as old as the Internet.
Another thing is that docker, being the helpful little thing that it is, in its default config will alter your firewall and open up ports even if you have a rule to drop everything you're not specifically using. So, yeah. That's probably what's happening.
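A sketch of the difference in a hypothetical docker-compose.yml (image and port are the usual ollama defaults):

    services:
      ollama:
        image: ollama/ollama
        ports:
          # Typical tutorial line: publishes on all interfaces, and Docker
          # rewrites iptables to allow it, bypassing ufw-style drop rules.
          # - "11434:11434"
          # Safer: bind the published port to loopback only.
          - "127.0.0.1:11434:11434"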
I understand the concern here but isn't this the same as making any other type of server public? This is just regarding servers hosting LLMs, which I wouldn't even consider a huge security concern vs hosting a should-be-internal tool publicly.
Servers that shouldn't be made public are made public, a cyber tale as old as time.
> servers hosting LLMs, which I wouldn't even consider a huge security concern
The new problem is if the LLMs are connected to tooling.
There's been plenty of examples showing that with subtle changes to the prompt you can jailbreak the LLM to execute tooling in wildly different ways from what was intended.
They're trying to paper over this by having the LLM call regular code, just so they can be sure all steps of the workflow are actually executed reliably every time.
Even the same prompt can give different results depending on the temperature used. How security teams are able to sign these things off is beyond me.
The tools are client side operations in Ollama, so I don't see a way an attacker could use that to their benefit, except to leverage the actual computing power the server provides.
The stakes aren’t that high yet for Ollama to warrant cumbersome auth mechanisms.
If any MCP servers are running, anyone with access to query the chat endpoint can use them. That could include file system access, GitHub tokens and more.
ollama can't connect to MCP servers; it can merely run models which output instructions back to a connected system to connect to an MCP server (e.g. mcphost using ollama to run a prompt and then itself connecting to an MCP server if the response requires it).
The LLM endpoint via ollama or huggingface is not the one executing MCP tool calls, that is on behalf of the client that is interacting with the LLM. All the LLM does is take input as a prompt and produce a text output, that's it. Anything else is just a wrapper.
That is completely false; ollama has nothing to do with running commands, it just processes prompts into text responses.
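To make the split concrete, here is a hedged Python sketch against Ollama's /api/chat tool-calling interface (the model name and the get_weather tool are placeholders): the server only returns a description of a tool call, and nothing runs unless the client code chooses to run it.

    import json
    import urllib.request

    # Ask a tool-capable model a question, advertising one (hypothetical) tool.
    req = urllib.request.Request(
        "http://127.0.0.1:11434/api/chat",
        data=json.dumps({
            "model": "llama3.2",  # assumption: any tool-capable model works here
            "messages": [{"role": "user", "content": "Weather in Paris?"}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    resp = json.load(urllib.request.urlopen(req))

    # The response is just data describing what the model *wants* called;
    # executing it (or not) is entirely up to this client.
    for call in resp["message"].get("tool_calls", []):
        fn = call["function"]
        print(f"model requested: {fn['name']}({fn['arguments']})")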
Yeah, I don't think most people who even run ollama would care. "Oh no, someone found my exposed instance, which means my computer in my bedroom is burning electricity for the past few hours. Oh well, I lost a few pennies in electricity." Shuts down Ollama on the computer.
Seriously, this is extremely mild as far as issues go. There's basically no incentive to fix this problem, because I bet even the people who lost a few pennies of electricity would still prefer the convenience of ollama not having auth.
Plus, that's the worst case scenario, in real life even if some black hat found an exposed ollama service, they have no interest in generating tokens for <insert random LLM here at 4 bit quant> at a slow speed of <50tok/sec.
If you think that's the worst case scenario you're in no position to be making security-related decisions. That line of thinking hinges on a very dangerous assumption that Ollama doesn't have any critical security vulnerabilities [1].
Don't expose services to the public internet unless they have been battle hardened to be exposed to the public internet, e.g. Nginx as an authenticating reverse proxy.
[1] https://github.com/advisories/GHSA-vq2g-prvr-rgr4
In general, Go programs are quite secure against the remote-code-execution class of attacks.
Even this one would be remedied by not running ollama as root and not having its binaries owned by the user it runs as (though overwriting executables/libraries that are mmapped as executables is usually not possible), which I would hope is the standard mode of its setup.
I don't know why you would say that about Go; you're never more than one programming error away from creating an RCE vulnerability, no matter the language. The linked RCE should demonstrate that quite clearly, don't you think?
Either way my point is that software contains vulnerabilities, especially software that hasn't been hardened to be exposed to the public internet. Exposing it to the public internet anyway is a display of bad judgement, doubly so when the person responsible seems to believe that the worst thing that can happen is someone using the software as intended. Details of specific vulnerabilities are really beside the point here.
Assuming that the happy path is the worst that can happen is simply naive, there's no two ways about it.
As I understand it, the overwhelming majority of CVEs over the history of computing have been due to buffer overflows or use-after-free. If you leave out those vectors, you might actually be pretty close to having an RCE-free piece of software.
But sure, it's always possible to be more innovative about how to go about enabling RCEs, as the log4j case demonstrates.
Is that agency over yourself called vibe living?
That is assuming you cannot exploit the server to get access to the machine...
largely the fault of n8n
Cheap (almost free) highly parallel inference. Nice!
Free inference. Yay
How many people are using Ollama in production, though?
"Our study uncovered over 1,100 exposed Ollama servers, with approximately 20% actively hosting models susceptible to unauthorized access."
So at least 1,100, with roughly 220 of those (the ~20%) actively hosting models open to unauthorized access.
I'm surprised Shodan is legal. Just because someone made a mistake when setting up their network doesn't mean you're authorized.
Shodan? Like from system shock?
That's where the name comes from. It's a search engine for finding servers exposed to the public.
Another great use of a personal VPN - I work at https://www.defined.net (which uses Nebula as the underlying VPN technology) and also personally use our free tier (up to 100 hosts) for everything. Having my Ollama instances available only over my VPN overlay network is very slick.
Ollama has no auth mechanism by default... You have to wonder why they never focused on that
Separation of concerns?
If you deploy a power plug outside your house, is it the fault of the power plug designer if people steal your power?
Put it behind a webserver with basic auth or whatever you fancy, done.
Bad analogies are bad analogies. ollama is a server system; it should expect to connect with more than one client, and they know very well by now that this also means networked clients. If you create a server-client protocol, implementing security is your job.
Any decent router is going to block connections from internet to your local network by default. For ollama to be accessible from the outside, they had to allow it explicitly. There's no way to blame ollama for this.
Lots of servers do not, Redis for instance does not have auth by default, and IIRC did not have auth at all for a long time.
> If you create a server-client protocol, implementing security is your job.
Yes, this goes right along with the tried and true Unix philosophy: do everything, poorly. Wait what?
I cannot express how deeply wrong you are about this; a "server system" is not some mandate that it should be production ready for a ton of people on the internet.
This is a program that very different people want or need to try out that just so happens to involve a client-server architecture.
The client-server pattern is frequently used locally.
As cynical as I am, I honestly don't think there is much to wonder about here. The initial product's adoption relied on low friction and minimal setup. That they wanted to keep it going as long as possible is just an extension of this.
The dockerd TCP socket has no auth mechanism by default... You have to wonder why they never focused on that.
I don’t think it was intended for production workloads.
Should have asked an LLM to write one.
Ollama doesn't run a web server that is "broadcasting across the internet". It runs a server that is accessible locally. You have to deliberately deploy it onto a public server in order for it to be accessible from the internet.
In all cases, having zero auth at all [0], even when others want to use it as a service to broadcast across the internet, is ridiculous. It leads to problems like [1], and now they're all exposed without any protection.
Even allowing others to change the $OLLAMA_HOST env is a security footgun.
[0] https://github.com/ollama/ollama/issues/849
[1] https://www.wiz.io/blog/probllama-ollama-vulnerability-cve-2...
The idea is that you add an auth layer if that's what you want to do.
The majority of Ollama users at the moment are likely hobbyists working in single-user contexts.
For those who want to deploy it in an organizational setting, it's straightforward to put it behind a pre-existing authentication system.