Fast cryptographically safe GUID generator for Go

(github.com)

20 points | by sdrapkin a day ago

42 comments

deathanatos a day ago

These aren't GUIDs[1].

If it isn't meant to follow the RFC, … just find a new word. (There are plenty of alternate schemes out there, too.)

[1]: https://www.rfc-editor.org/rfc/rfc9562.html

sdrapkin a day ago

IMHO "Guid" is just as well known (Wikipedia agrees: https://en.wikipedia.org/wiki/Universally_unique_identifier), and "UUID" was already taken by Google.

saclark11 a day ago

> "UUID" was already taken by Google.

This shouldn't really matter as your import paths are obviously different. `github.com/google/uuid` and `github.com/sdrapkin/guid` can happily coexist. Any file/codebase importing both (which would ideally be avoided in the first place) can alias them.

> IMHO "Guid" is just as well known

I think the point the commenter was trying to make is that these do not adhere to the UUID spec. You don't specify which version, but judging by the docs and your comparison to `github.com/google/uuid`, I'd wager most folks looking at this library would assume they are supposed to be V4 UUIDs.

sdrapkin a day ago

> This shouldn't really matter as your import paths are obviously different.

I'm aware of that, of course. Guid is intentionally named differently from "uuid" (both as a package and as a type) to ensure there is no confusion between them in code. It is not the goal of Guid to mimic/inherit all uuid APIs. Guid is its own package, with a different API surface and roadmap (i.e. I'll borrow what makes sense and do things differently when it makes sense).

shakna 15 hours ago

The spec uses both UUID and GUID. You can expect the same thing for both.

> This specification defines UUIDs (Universally Unique IDentifiers) -- also known as GUIDs (Globally Unique IDentifiers) -- and a Uniform Resource Name namespace for UUIDs.

danbruc a day ago

I think the point is that this just generates 16 random bytes, whereas UUIDs/GUIDs have structure: they at least have a variant field indicating what kind of UUID/GUID it is. The closest thing to all random bytes would be variant 10xx, version 4 or 8.

sdrapkin a day ago

You are correct - Guid very specifically and intentionally generates a structure of 16 random bytes. In decades of programming I've never needed a random 16-byte structure to have an "internal versioned structure". In the very rare cases where this is truly needed, bit-twiddling post-generation can cheaply fix it (but not the other way around). Which is why all these "versions" and "variants" in standard universally applicable libraries are a complete waste of entropy and cycles.
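For readers wondering what that post-generation bit-twiddling looks like, here is a minimal sketch using only the Go standard library. The helper names (`randomID`, `toV4`, `format`) are hypothetical and are not part of the Guid package or of `github.com/google/uuid`; the bit positions follow the version 4 layout in RFC 9562, which fixes 6 bits and leaves 122 random.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// randomID returns 16 bytes straight from the OS CSPRNG (128 random bits).
func randomID() ([16]byte, error) {
	var id [16]byte
	_, err := rand.Read(id[:])
	return id, err
}

// toV4 overwrites the 6 bits that RFC 9562 fixes in a version 4 UUID.
// The remaining 122 bits are left untouched.
func toV4(id [16]byte) [16]byte {
	id[6] = (id[6] & 0x0f) | 0x40 // version 4 in the high nibble of byte 6
	id[8] = (id[8] & 0x3f) | 0x80 // variant 10xx in the top two bits of byte 8
	return id
}

// format renders the 16 bytes in the usual 8-4-4-4-12 hex layout.
func format(id [16]byte) string {
	return fmt.Sprintf("%x-%x-%x-%x-%x", id[0:4], id[4:6], id[6:8], id[8:10], id[10:16])
}

func main() {
	raw, err := randomID()
	if err != nil {
		panic(err)
	}
	fmt.Println("raw:", format(raw))
	fmt.Println("v4: ", format(toV4(raw)))
}
```

Going the other way (recovering 128 random bits from a version 4 value) is not possible, since the 6 overwritten bits are gone.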

danbruc a day ago

I do not think I have seen or noticed code that inspects UUID variants either but I could certainly imagine that such code is out there, for example to protect against accidental information leakage from UUID variants that are not purely random. With that in mind it seems a good idea to adhere to the standards if one uses an established name. Neither the few lost bits nor the effort to correctly indicate the variant sound like real issues to me.

taeric a day ago

I was on a team once that would add server information to ids. Between that and using the version that has a date on it, it made debugging things MUCH easier. Just plug in the id and tooling could easily determine which logs to look into for when it was generated. Obviously, you may still need to widen your search for many reasons. But it is hard for me to think this is on most people's threat model.

majewsky a day ago

> "UUID" was already taken by Google

Your link also says that the term UUID predates the founding of Google by over a decade.

saclark11 a day ago

Advertising any UUID/GUID generator as cryptographically secure, or relying on it to be so, is a mistake, in my opinion.

You use a UUID when you need a universally unique ID whose guessability properties are not a critical security requirement. While the V4 UUID spec (which this package does not implement, but most users might assume it does) states that a UUID implementation SHOULD be cryptographically secure [1], it also states that they MUST NOT be used as security capabilities [2]. This is because they are not intended as secure tokens, but many users mistakenly assume them to be suitable as such. Not to mention, V4 UUIDs only have 122 bits of entropy, not 128, since 6 bits are reserved for version and variant information, which many users don't realize.

So you can generate a UUID that is suitable as a secure token, but at that point don't call it a UUID. Just call it a secure token. And if you need a secure token, use something like Go's `Text()` function from `crypto/rand` [3].
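As a rough illustration of "just call it a secure token", the sketch below builds one from `crypto/rand.Read` and URL-safe base64. It is an alternative for Go versions that predate `rand.Text` (added in Go 1.24), not a reproduction of it; `newToken` and the 32-byte size are assumptions made for the example.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newToken returns nBytes of CSPRNG output encoded as unpadded,
// URL-safe base64. The name and signature are illustrative only.
func newToken(nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

func main() {
	tok, err := newToken(32) // 256 random bits
	if err != nil {
		panic(err)
	}
	fmt.Println(tok)
}
```

Nothing here claims to be a UUID: there is no version or variant field, and the length is whatever the caller asks for.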

The situation reminds me of how the Go team updated the `math/rand` and `math/rand/v2` packages to use a CSPRNG as a defensive measure [4], while still urging users to use `crypto/rand` in secure contexts.

[1]: https://www.rfc-editor.org/rfc/rfc9562.html#unguessability

[2]: https://www.rfc-editor.org/rfc/rfc9562.html#Security

[3]: https://pkg.go.dev/crypto/rand@go1.24.5#Text

[4]: https://go.dev/blog/chacha8rand

sdrapkin a day ago

The vast majority of Golang developers would benefit from using the Guid library instead of the UUID library. It’s substantially faster in all cases, more secure (by 2^6) and has more functionality.

For random token-as-string generation Golang developers should be using https://github.com/sdrapkin/randstring instead of crypto/rand.Text (faster and more flexible).

stouset a day ago

The vast majority of Golang developers are neither hobbled by the lack of gigabyte throughput for random identifier generation nor are they on the verge of becoming victims of attacks on identifiers with "only" 2^122 random bits.

sdrapkin a day ago

Agreed. So at worst they (Golang developers) should be indifferent, and at best they should opt for the faster choice. With serverless code billing by the second, faster choices are directly correlated to lower costs.

lossolo a day ago

> With serverless code billing by the second, faster choices are directly correlated to lower costs.

The kind of Go developers who think about these optimizations don't use overpriced, inefficient serverless services.

amluto a day ago

On an extremely quick review:

- This uses global state under the hood. Surprise! Is it thread safe? I’m not a Go expert, but it looks non-thread-safe.

- The copying code reminds me of old-school awful C buffer handling code. Maybe it’s right. Maybe it’s wrong. But it’s not obviously right.

- The actual meat is a cryptographic randomness cache. This is a subtle thing, and all the best practices are missing. Where’s the backtracking protection? What if the program forks? vDSO getrandom() knows how to do this correctly — something high-level should use it, not reimplement it incorrectly.

Groxx a day ago

Global use looks fine - it's a very-simply-used sync pool to do larger blocks of rand reads, which makes plenty of sense for performance.

Unsafe use also looks fine, values either don't escape the function (a string->byte type cast for function signature reasons) or they do but they're new temporary data (the byte->string cast, which is fine because there's no risk of reusing or modifying the original bytes).

I'm going to intentionally not make any claims to "cryptographic security" or "is this a GUID" as I'm not super clear on the details there. The code looks pretty normal to me though, with the possible exception of the base64 encoding (why not base64.URLEncoding? https://pkg.go.dev/encoding/base64#pkg-variables).

sa46 a day ago

> This uses global state under the hood.

Looks safe to me. It uses `crypto/rand.Read` which is declared as safe for concurrent use. The cache is accessed via sync.Pool which is thread safe. As a check, I ran the tests with `-race` and it passed.

sdrapkin a day ago

Thanks for your feedback. If you are skilled in Golang, I suggest you review the code more thoroughly for a more accurate understanding (especially compared to what standard uuid does).

imiric a day ago

My understanding was that speed is not something you want in a UUID generator, since it makes it more susceptible to brute force attacks. Is this not the case?

I've been using Cuid2[1] in most of my personal projects (this Go implementation[2], actually), which is fast enough, but not "too fast". It's also secure, collision resistant, and has everything I would need from a UUID.

[1]: https://github.com/paralleldrive/cuid2

[2]: https://github.com/nrednav/cuid2

stouset a day ago

> My understanding was that speed is not something you want in a UUID generator, since it makes it more susceptible to brute force attacks. Is this not the case?

The only possible thing I can think of here is using a UUID version with a small space for the random bits, such that you could accidentally collide by generating them too fast. But with something like UUIDv7, you'd need to be generating hundreds of millions of random UUIDs every nanosecond in order for that to be a realistic concern.
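To put rough numbers on accidental collisions, the sketch below evaluates the standard birthday approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n random values of b bits each. It assumes 74 random bits per UUIDv7 (the RFC 9562 layout without the optional counter), that only IDs sharing the same millisecond timestamp can collide, and 122 random bits for version 4; `collisionProb` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProb approximates the chance that at least two of n uniformly
// random values drawn from a space of 2^bits are equal (birthday bound).
func collisionProb(n float64, bits int) float64 {
	x := n * (n - 1) / math.Exp2(float64(bits+1))
	return -math.Expm1(-x) // equals 1-exp(-x), but accurate when x is tiny
}

func main() {
	// n is the number of IDs competing in the same space: for UUIDv7 that is
	// the count generated within one millisecond, for v4 the lifetime total.
	for _, n := range []float64{1e6, 1e9, 1e12} {
		fmt.Printf("n=%.0e  74 bits: %.3g  122 bits: %.3g\n",
			n, collisionProb(n, 74), collisionProb(n, 122))
	}
}
```

Note that this measures accidental collisions only. It says nothing about an attacker guessing a specific ID, which depends on the size of the random space rather than on how fast the generator runs.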

sdrapkin a day ago

cuid2 generates variable-length strings. If you want fast cryptographically strong string generation, I recommend https://github.com/sdrapkin/randstring. It will likely be faster than cuid2.

imiric a day ago

That doesn't address what I said. Nor does it explain why your package is better.

sdrapkin a day ago

Guid package generates guids/uuids. Your linked package generates variable length strings. These are different usecases (oh, and your benchmarks are inferior to https://github.com/sdrapkin/randstring). Nothing to argue about.

shakna a day ago

But this doesn't generate guid/uuids? It generates random bytes.

sdrapkin a day ago

Guid/uuid is defined as a 16-byte structure. Are you questioning the “byte” part, or the “random” part?

shakna 15 hours ago

No, it's defined by a series of specifications. [0] Ones that define an underlying structure, in bits.

You have a 16-byte random string. That's great. But it is not a UUID.

[0] https://www.rfc-editor.org/rfc/rfc9562.html

> The UUID format is 16 octets (128 bits) in size; the variant bits in conjunction with the version bits described in the next sections determine finer structure.

sdrapkin 13 hours ago

No, Guid/uuids are defined as 128-bit labels used to uniquely identify objects in computer systems. This 128-bit/16-byte definition predates any RFCs that one may or may not choose to implement. I'm obviously aware of RFC 9562, and nowhere in the Guid library do I claim implementation of it. RFC 9562 is a choice, and one that should not be made blindly, or for you. It all starts with 16 random bytes. Google's uuid starts that way, and virtually every other Guid/uuid implementation. Then, on top of that building block, one may tweak additional non-random bits if the usecase truly requires it. If it does - you can do it quickly and cheaply on top of 16 random bytes. If the usecase does not require it (99% of cases), you're better off with the foundational 16 random bytes. The perspective of "your 16 random bytes do not implement RFC 9562 - BAD, BAD!" is very myopic. But if wasting bits on versions and variants is something that helps someone sleep better - they can easily and cheaply achieve that with a couple of bit ops. RFC 9562 robs developers of that choice.

shakna 12 hours ago

Ok... But if you want to ignore the last twenty years, you should probably pick another name, because it has been used a particular way for two decades.

If you want "more choice" - use a name unbound by a tradition old enough to drink.

imiric a day ago

No need to argue. You just haven't addressed the point that a fast UUID generator is a security risk. I don't care about benchmarks.

And in most use cases where I'd need a UUID, I'd usually want the string representation of it.

sdrapkin a day ago

Fast guid/uuid generators are NOT a security risk. You want such generators to be as fast as possible, without compromising cryptographic strength.

sdrapkin a day ago

Much faster (~10x) than the standard github.com/google/uuid package

I'm interested in feedback from the HN community.

evil-olive a day ago

what real-world problem, if any, does 10x faster UUID generation solve?

from your readme, `guid.New()` is 6~10 ns, so presumably the standard UUID package takes 60-100 ns?

say I generate a UUID, and then use that UUID when inserting a row into my database, let's say committing that transaction takes 1 msec (1 million ns)

if I get a speedup of 90 ns from using a faster UUID package, will that even be noticeable in my benchmarks? it seems likely to be lost in the noise.

honestly, this seems like going on a 7-day road trip, and sprinting from your front door to your car because it'll get you there faster.
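The arithmetic behind that comparison, written out as a small sketch. The inputs are the comment's own guesses (about 100 ns for the slower generator, about 10 ns for the faster one, 1 ms for the database commit), not measurements; `savedFraction` is a hypothetical helper.

```go
package main

import "fmt"

// savedFraction returns the share of total per-request time removed by
// switching to a faster ID generator. All arguments are in nanoseconds.
func savedFraction(slowGenNs, fastGenNs, otherWorkNs float64) float64 {
	return (slowGenNs - fastGenNs) / (slowGenNs + otherWorkNs)
}

func main() {
	// 90 ns saved out of roughly 1,000,100 ns per request.
	f := savedFraction(100, 10, 1_000_000)
	fmt.Printf("time saved per request: %.4f%%\n", f*100)
}
```

With these inputs the saving is roughly 0.009% of the request, which is the "lost in the noise" point. The picture changes only when ID generation is a large share of the work per request.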

sdrapkin a day ago

Amazon AWS S3 web servers process millions of requests per second, and each response generates a random Request-Id. It’s not exactly 16 bytes, but this is a very realistic scenario where guids are used in a hot path. If you are writing a cute-kitten blog, might as well use Python instead.

throwaway894345 a day ago

Why is it so much faster than `uuid`?

sdrapkin a day ago

It generates entropy 4 KB at a time (instead of on each call), and uses a cache-pool instead of a single cache behind a lock (which is what standard uuid does in "RandPool=ON" mode).
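The general technique described here (bulk reads from the CSPRNG, handed out 16 bytes at a time from a `sync.Pool` of caches) can be sketched as below. This is an illustration of the idea under stated assumptions, not the Guid package's actual code: the `cache` type, `newID`, and the choice to zero consumed bytes are all made up for this example.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

const (
	idSize    = 16
	cacheSize = 4096 // one refill yields 256 IDs
)

// cache holds pre-read random bytes; next is the offset of the first
// unused byte, and next == cacheSize means the cache is exhausted.
type cache struct {
	buf  [cacheSize]byte
	next int
}

// pool hands each caller its own cache, so no lock is held while copying.
var pool = sync.Pool{
	New: func() any { return &cache{next: cacheSize} },
}

func newID() [idSize]byte {
	c := pool.Get().(*cache)
	if c.next == cacheSize {
		if _, err := rand.Read(c.buf[:]); err != nil {
			panic(err) // no safe fallback if the OS CSPRNG fails
		}
		c.next = 0
	}
	var id [idSize]byte
	copy(id[:], c.buf[c.next:c.next+idSize])
	// Zero the consumed bytes so they do not linger in the cache
	// (a choice made for this sketch only).
	for i := c.next; i < c.next+idSize; i++ {
		c.buf[i] = 0
	}
	c.next += idSize
	pool.Put(c)
	return id
}

func main() {
	var wg sync.WaitGroup
	var mu sync.Mutex
	seen := make(map[[idSize]byte]bool)
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				id := newID()
				mu.Lock()
				seen[id] = true
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println("unique IDs:", len(seen))
}
```

The sketch does nothing about the fork and VM-snapshot concerns raised elsewhere in the thread: random bytes sitting in a userspace cache at snapshot time would be duplicated in every restored copy.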

maxmcd a day ago

Ah cool, the note here is also interesting: https://pkg.go.dev/github.com/google/uuid#EnableRandPool

cyberax a day ago

So this automatically makes it unsafe in case of VM snapshots.

The Linux kernel now has an optimization that makes it safe: https://lwn.net/Articles/983186/

Go should automatically benefit from this, if they use the vDSO getrandom().

CafeRacer a day ago

Would have been nice if that included timestamp information, to make them orderable. Similar to what uuid v7 does.
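For reference, an orderable ID in the style of UUIDv7 puts a 48-bit big-endian Unix millisecond timestamp in the first 6 bytes and fills the rest with random bits, apart from the version and variant fields. The sketch below follows that RFC 9562 layout with only the standard library; `newV7At` is a hypothetical helper and says nothing about how the Guid roadmap item will be implemented.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"time"
)

// newV7At builds a UUIDv7-style value for the given Unix time in
// milliseconds: 48-bit timestamp, version 7, variant 10xx, 74 random bits.
func newV7At(ms uint64) ([16]byte, error) {
	var id [16]byte
	if _, err := rand.Read(id[6:]); err != nil {
		return id, err
	}
	// Big-endian timestamp so that byte-wise comparison sorts by time.
	id[0] = byte(ms >> 40)
	id[1] = byte(ms >> 32)
	id[2] = byte(ms >> 24)
	id[3] = byte(ms >> 16)
	id[4] = byte(ms >> 8)
	id[5] = byte(ms)
	id[6] = (id[6] & 0x0f) | 0x70 // version 7
	id[8] = (id[8] & 0x3f) | 0x80 // variant 10xx
	return id, nil
}

func main() {
	now := uint64(time.Now().UnixMilli())
	a, err := newV7At(now)
	if err != nil {
		panic(err)
	}
	b, err := newV7At(now + 1) // pretend one millisecond has passed
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n%x\n", a, b)
	fmt.Println("a sorts before b:", bytes.Compare(a[:], b[:]) < 0)
}
```

IDs created within the same millisecond share a timestamp prefix and sort in random order relative to each other unless a counter is added, which this sketch leaves out.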

sdrapkin a day ago

It's on the roadmap (already implemented in a similar .NET library - https://github.com/sdrapkin/SecurityDriven.FastGuid).

sdrapkin a day ago

In case you missed it, "guid.Read()" is a much faster alternative to "crypto/rand". https://pkg.go.dev/github.com/sdrapkin/guid#Read
