Bucketsquatting is (finally) dead

(onecloudplease.com)

83 points | by boyter 2 hours ago

30 comments

josephg 29 minutes ago

Sometimes I wonder if package names, bucket names, GitHub account names and so on should use a naming scheme like Discord's. E.g., @sometag-xxxx, where xxxx is a random 4-digit code. It's sort of a middle ground between UUID account names and completely human-generated names.

This approach goes a long way toward democratizing the namespace, since nobody can "own" the tag prefix (10,000 people can all share it). It can also be used to prevent squatting and reuse attacks: just burn the full account name if the corresponding user account is ever shut down. And it prevents early users from being able to snap up all the good names.
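
A minimal sketch of the scheme described above, assuming an in-memory registry. The class name `TagRegistry` and its methods are hypothetical and only illustrate the two rules in the comment: a random 4-digit suffix per tag, and burning the full name when the account is closed.

```python
import secrets


class TagRegistry:
    """Hypothetical registry that hands out names of the form tag-1234."""

    def __init__(self):
        self._active = {}      # full name -> owner
        self._burned = set()   # full names that may never be issued again

    def is_available(self, name):
        return name not in self._active and name not in self._burned

    def register(self, tag, owner):
        # Try random 4-digit discriminators until one is free and unburned.
        for _ in range(100):
            name = f"{tag}-{secrets.randbelow(10_000):04d}"
            if self.is_available(name):
                self._active[name] = owner
                return name
        raise RuntimeError(f"no free discriminator found for {tag!r}")

    def close(self, name):
        # Burn the full name so nobody can re-register it later.
        del self._active[name]
        self._burned.add(name)
```

With this, `register("sometag", "alice")` returns something like `sometag-0042`, and after `close(...)` that exact name stays unavailable to everyone, including the original owner.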

jorams 17 minutes ago

Notably Discord stopped using that format two years ago, moving to globally unique usernames.

Their stated reason[1] for doing so being:

> This lets you have the same username as someone else as long as you have different discriminators or different case letters. However, this also means you have to remember a set of 4-digit numbers and account for case sensitivity to connect with your friends.

[1]: https://support.discord.com/hc/en-us/articles/12620128861463...

rithdmc 22 minutes ago

I like it for buckets, but adding a four-digit code won't help with the package hijacking side of things. In fact, it might just introduce more typo/hijack potential: it'll just be four more characters for people to typo.

donmcronald 24 minutes ago

I just want to be able to use a verified domain; @example.com everywhere.

Cthulhu_ 22 minutes ago

That still has "squatting" risks as described in the original article, though: domains expire and/or can be taken over.

vhab an hour ago

> For Azure Blob Storage, storage accounts are scoped with an account name and container name, so this is far less of a concern.

The author probably misunderstood what "account name" is in Azure Storage's context, as it's pretty much the equivalent of S3's bucket name, and is definitely still a large concern.

A single pool of unique names for storage accounts across all customers has been a very large source of frustration, especially with the really short name limit of only 24 characters.

I hope Microsoft follows suit and introduces a unique namespace per customer as well.

iann0036 an hour ago

Author here. Thanks for the call out! I've updated the article with attribution.

ryanjshaw an hour ago

I recall being shocked the first time I used Azure and realizing so many resources aren’t namespaced to account level. Bizarre to me this wasn’t a v1 concern.

iknownothow an hour ago

Thank you author Ian Mckay! This is one of those good hygiene conventions that save time by not having to think/worry each time buckets are named. As pointed out in the article, AWS seems to have made this part of their official naming conventions [1].

I'm excited for IaC code libraries like Terraform to incorporate this as their default behavior soon! The default behavior of Terraform and co is already to add a random hash suffix to the end of the bucket name to prevent such errors. This becoming standard practice in itself has saved me days in not having to convince others to use such strategies prior to automation.

[1] https://aws.amazon.com/blogs/aws/introducing-account-regiona...

calmworm 2 hours ago

That took a decade to resolve? Surprising, but hindsight is 20/20 I guess.

INTPenis an hour ago

I started treating long random bucket names as secrets years ago, ever since I noticed hackers were discovering buckets online with secrets and healthcare info.

This is where IaC shines.

Galanwe an hour ago

This is all well and good on the IaC side, yes. But at the end of the day, buckets are also user-facing resources, and nobody likes random directory / bucket names.

amluto 23 minutes ago

It would be nice if the other end of this could be addressed: a configurable policy to limit resolution of bucket names within an account namespace. Ideally, if someone doesn’t have permission to resolve a bucket name, they shouldn’t even be able to detect whether it exists.

XorNot an hour ago

I just started using hashes for names. The deployment tooling knows the "real" name. The actual deployment registers a salt+hash of that name to produce a pseudo-random string name.
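
A minimal sketch of that kind of derivation, assuming the salt is a secret held by the deployment tooling. The function name `physical_bucket_name` and the `b-` prefix are hypothetical, and HMAC-SHA256 is used here as the salted hash, which is one reasonable choice rather than anything the comment specifies.

```python
import hashlib
import hmac


def physical_bucket_name(logical_name: str, salt: bytes, prefix: str = "b") -> str:
    """Derive a reproducible, pseudo-random bucket name from a logical name."""
    # Keyed hash: without the salt, the output cannot be guessed from the name.
    digest = hmac.new(salt, logical_name.encode("utf-8"), hashlib.sha256).hexdigest()
    # S3 bucket names are limited to 63 characters, so keep 40 hex characters.
    return f"{prefix}-{digest[:40]}"
```

The tooling can always recompute the same name from ("prod-logs", salt), while anyone who only knows the human-readable name cannot. As pointed out elsewhere in the thread, this only addresses guessing; it does not help once the derived name has leaked and the bucket is later deleted.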

alemwjsl 44 minutes ago

I take it advertising your account id isn't a security risk?

Cthulhu_ 21 minutes ago

Armchair opinion, but shouldn't be too bad - it's identification, not authentication, just like your e-mail address is.

But probably best to not advertise it too much.

aduwah 40 minutes ago

It is not hygienic, but with only the account ID you are fine. In the IAM rules the attacker can always just use a * on their end, so it does not make a difference. You have to be conscious to set proper rules on your (owner) end, though.

Aardwolf an hour ago

Why all that stuff with namespaces when they could just not allow name reuse?

orf 24 minutes ago

That would be a huge breaking change. Any workload that relies on re-using a bucket name would be broken, and at the scale of S3 that would have a non-trivial customer impact.

Not to mention the ergonomics would suck - suddenly your terraform destroy/apply loop breaks if there’s a bucket involved

afandian 18 minutes ago

Any workload that relies on re-using a bucket name is broken by design. If someone else can get it, then it's Undefined Behaviour. So it's in keeping with the contract for AWS to prevent re-use. Surely?

orf 14 minutes ago

Think terraform tests, temporary environments, etc. Or anything else: it’s Hyrum's Law.

iknownothow an hour ago

Potential reasons I can think of for why they don't disallow name reuse:

a) AWS will need to maintain a database of all historical bucket names to know what to disallow. This is hard per region and even harder globally. It's easier to know what is currently in use than to know what has been used historically.

b) Even if they maintained a database of all historically used bucket names, the latency to query whether something exists in it may be large enough to be annoying during the bucket creation process. Knowing AWS, they'll charge you for every 1000 requests for "checking if bucket name exists" :p

c) AWS builds many of its own services on S3 (as indicated in the article) and I can imagine there may be many of their internal services that just rely on existing behaviour i.e. allowing for re-creating the same bucket name.

dwedge 33 minutes ago

I can't accept a) or b). They already need to keep a database of all existing bucket names globally, and they already need to check this on bucket creation. Adding a flag on deleted doesn't seem like a big loss.

As for c), I assume it's not just AWS relying on this behaviour. https://xkcd.com/1172/

CodesInChaos an hour ago

I'd allow re-use, but only by the original account. Not being able to re-create a bucket after deleting it would be annoying.

I think that's an important defense that AWS should implement for existing buckets, to complement account-scoped buckets.
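
A minimal sketch of the "re-use only by the original account" rule, combined with the deleted flag suggested earlier in the thread. The class `BucketNamespace` and its in-memory tables are hypothetical and say nothing about how S3 actually stores names.

```python
class BucketNamespace:
    """Hypothetical global name table that remembers deleted bucket names."""

    def __init__(self):
        self._owner = {}       # name -> account ID that first claimed it
        self._deleted = set()  # names whose bucket has been deleted

    def create(self, name, account_id):
        previous = self._owner.get(name)
        if previous is None:
            # Never used before: anyone may claim it.
            self._owner[name] = account_id
            return
        if name in self._deleted and previous == account_id:
            # Deleted earlier by the same account: re-creation is allowed.
            self._deleted.discard(name)
            return
        raise PermissionError(f"{name!r} is reserved for another account or in use")

    def delete(self, name, account_id):
        if self._owner.get(name) != account_id or name in self._deleted:
            raise PermissionError(f"{name!r} is not an active bucket of this account")
        # Keep the ownership record; only flag the name as deleted.
        self._deleted.add(name)
```

Under this rule a destroy/apply loop run by the same account keeps working, while a different account can never pick up a deleted name.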

thih9 an hour ago

> If you wish to protect your existing buckets, you’ll need to create new buckets with the namespace pattern and migrate your data to those buckets.

My pet conspiracy theory: this article was written by bucket squatters who want to claim old bucket names after AI agents read this and blindly follow.

lijok 2 hours ago

Huh? Hash your bucket names

why_only_15 2 hours ago

If your bucket name is ever exposed and you later delete the bucket, then this doesn't help you.

lijok 15 minutes ago

The entire article talks about “guessing” the bucket name as being the attack enabler, not the leaking of it. What does the landscape look like once you start doing the basics like hashing your bucket names? Is this still a problem worth engineering for?

Maxion 2 hours ago

I don't think that'd prevent this attack vector.

alemwjsl an hour ago

Ok; salt, and then hash your bucket names