114 comments

  • fergie 8 hours ago

    In all of this, people forget that NPM packages are largely maintained by volunteers. If you are going to put up hurdles and give us extra jobs, you need to start paying us. Open source licenses explicitly state some variation of "use at your own risk". A big motivation for most maintainers is that we can create without being told what to do.

    I had 25 million downloads on NPM last year. Not a huge amount compared to the big libs, but OTOH, people actually use my stuff. For this I have received exactly $0 (if they were Spotify or YouTube streams I would realistically be looking at ~$100,000).

    I propose that we have two NPMs: a non-commercial NPM that is 100% use at your own risk, and a commercial NPM that has various guarantees that authors and maintainers are paid to uphold.

    • jeroenhd 7 hours ago

      NPM has to decide between either being a friendly place for hobbyists to explore their passions or being the backbone for a significant slice of the IT industry.

      Every time someone pulls/messes with/uploads malware to NPM, people complain and blame NPM.

      Every time NPM takes steps to prevent pulling/messing with/uploading malware to NPM, people complain and blame NPM.

      I don't think splitting NPM will change that. Current NPM is already the "100% use at your own risk" NPM and still people complain when a piece of protestware breaks their build.

    • onion2k 4 hours ago

      If you are going to put up hurdles and give us extra jobs, you need to start paying us.

      Alternatively, we can accept that there will be fewer libraries because some volunteers won't do the extra work for free. Arguably there are too many libraries already so maybe a contraction in the size of the ecosystem would be a net positive.

      • jacquesm 4 hours ago

        Note: the bad guys are incentivized to work for free, so this would increase the problem considerably.

    • justarandomname 8 hours ago

      I agree with you here. It feels like management said "well, we have to do SOMETHING!" and this is what they chose: push more of the burden onto the developers giving away stuff for free, when the burden should be on the developers and companies consuming the stuff for free.

    • pamcake 7 hours ago

      Not looking forward to the mandatory doxxing that would probably come along if this was introduced today.

      • fergie 7 hours ago

        This makes no sense; maintainers are not exactly operating under a cloak of anonymity. Quite the opposite, in fact.

    • borplk 5 hours ago

      Yes! I despise how the open source and free software culture turns into just free labour for freeloading million-dollar and billion-dollar companies.

      The culture made sense in the early days when it was a bunch of random nerds helping each other out and having fun. Now the freeloaders have managed to hijack it and inject themselves into it.

      They also weaponise the culture against the devs by shaming them for wanting money for their software.

      Many companies spend thousands of dollars every month on all sorts of things without much thought. But good luck getting a one-time $100 license fee out of them for some critical library that their whole product depends on.

      Personally I'd like to see the "give stuff to them for free then beg and pray for donations" culture end.

      We need to establish a balance based on the commercial value that is being provided.

      For example I want licensing to be based on the size and scale of the user (non-commercial user, tiny commercial user, small business, medium business, massive enterprise).

      It's absurd for a multi-million company to leech off a random dev for free.

      • graemep 3 hours ago

        I have no idea how much of this stuff is volunteer written, and how much is paid work that is open-sourced.

        No one is forced to use these licences. Some FOSS licences, such as the AGPL, will not be used by many companies (or even the GPL, where the software is distributed to users). You could use a FOSS licence and add an exemption for non-commercial use, or use a non-FOSS licence that is free for non-commercial use or small businesses.

        On the other hand a lot of people choose permissive licenses. I assume they are happy to do so.

  • woodruffw a day ago

    > In its current form, however, trusted publishing applies to a limited set of use cases. Support is restricted to a small number of CI providers, it cannot be used for the first publish of a new package, and it does not yet offer enforcement mechanisms such as mandatory 2FA at publish time. Those constraints have led maintainer groups to caution against treating trusted publishing as a universal upgrade, particularly for high-impact or critical packages.

    This isn't strictly accurate: when we designed Trusted Publishing for PyPI, we designed it to be generic across OIDC IdPs (typically CI providers), and explicitly included an accommodation for creating new projects via Trusted Publishing (we called it "pending" publishers[1]). Not all subsequent adopters of the Trusted Publishing technique have picked up the latter, which is IMO both unfortunate and understandable (since it's a complication over the data model/assumptions around package existence).

    I think a lot of the pain here is self-inflicted on GitHub's part: deciding to remove normal API credentials entirely strikes me as extremely aggressive, and is completely unrelated to implementing Trusted Publishing. Combining the two in the same campaign has made things unnecessarily confusing for users and integrators, it seems.
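
    For a rough sketch of what Trusted Publishing looks like from the CI side (assuming GitHub Actions and the pypa/gh-action-pypi-publish action; the workflow, environment name, and version pins below are illustrative), the job just requests an OIDC token and never handles a long-lived secret:

      # .github/workflows/release.yml (illustrative)
      jobs:
        publish:
          runs-on: ubuntu-latest
          environment: pypi            # optional; lets you attach approval rules
          permissions:
            id-token: write            # allow the job to request an OIDC token
          steps:
            - uses: actions/checkout@v4
            - uses: actions/setup-python@v5
              with:
                python-version: "3.x"
            - run: python -m pip install build && python -m build
            # exchanges the OIDC token for a short-lived PyPI upload credential
            - uses: pypa/gh-action-pypi-publish@release/v1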

    [1]: https://docs.pypi.org/trusted-publishers/creating-a-project-...

    • veeti 17 hours ago

      Yet in practice, only the big boys are allowed to become "Trusted Publishers":

      > In the interest of making the best use of PyPI's finite resources, we only plan to support platforms that have a reasonable level of usage among PyPI users for publishing. Additionally, we have high standards for overall reliability and security in the operation of a supported Identity Provider: in practice, this means that a home-grown or personal use IdP will not be eligible.

      How long until everyone is forced to launder their artifacts using Microsoft (TM) GitHub (R) to be "trusted"?

      [1] https://docs.pypi.org/trusted-publishers/internals/#how-do-i...

      • woodruffw 17 hours ago

        I wrote a good chunk of those docs, and I can assure you that the goal is always to add more identity providers, and not to enforce support for any particular provider. GitHub was only the first because it’s popular; there’s no grand evil theory beyond that.

        • VorpalWay 11 hours ago

          So if I self host my own gitea/forgejo instance, will trusted publishing work for me?

          • woodruffw 8 hours ago

            If you had enough users and demonstrated the ability to securely manage a PKI, then I don’t see why not. But if it’s just you on a server in your garage, then there would be no advantage to either you or to the ecosystem for PyPI to federate with your server.

            That’s why API tokens are still supported as a first-class authentication mechanism: Trusted Publishing is simply not a good fit in all possible scenarios.

            • pamcake 8 hours ago

              > if it’s just you on a server in your garage, then there would be no advantage to either you or to the ecosystem for PyPI to federate with your server.

              Why not leave the decision about which providers to trust to users, instead of having a centrally managed global allowlist at the registry? Why should the registry admin be the one to decide who is fit to publish for each and every package?

              • woodruffw 7 hours ago

                > Why not leave the decision about which providers to trust to users, instead of having a centrally managed global allowlist at the registry?

                We do leave it to users: you can always use an API token to publish to PyPI from your own developer machine (or server), and downstreams are always responsible for trusting their dependencies regardless of how they’re published.

                The reason Trusted Publishing is limited at the registry level is that it takes time and effort (from mostly volunteers) to configure and maintain for each federated service, and the actual benefit of it rounds down to zero when a given service has only one user.

                > Why should the registry admin be the one to decide who is fit to publish for each and every package?

                Per above, the registry admin doesn’t make a fitness decision. Trusted Publishing is an optional mechanism.

                (However, this isn’t to say that the registry doesn’t reserve this right. They do, to prevent spamming and other abuse of the service.)

              • acdha 7 hours ago

                They’re running the most popular registry but nothing says you can’t use your own to implement whatever policy you want. The default registry has a tricky balance of needing to support inexperienced users while also only having a very modest budget compared to the companies which depend on it, and things like custom authentication flows are disproportionately expensive.

                • pamcake 7 hours ago

                  What's the issue exactly?

                  They seem to handle account signups with email addresses from unknown domain names just as well as for hotmail.com and gmail.com. I don't see how this is any different.

                  The whole point of standards like OIDC (and supposedly TP) is that there is no need for provider-specific implementations or custom auth flows as long as you follow the spec and protocol. It's just some fields that can be put in a settings UI configurable by the user.

                  • woodruffw 7 hours ago

                    It’s completely different. An email signup doesn’t involve a persistent trust relationship between PyPI and an OIDC identity provider. The latter imposes code changes, availability requirements, etc.

                    (But also: for completely unrelated reasons, PyPI can and will ban email domains that it believes are sources of abuse.)

          • eesmith 11 hours ago

            According to their docs, they "have high standards for overall reliability and security in the operation of a supported Identity Provider: in practice, this means that a home-grown or personal use IdP will not be eligible."

            If you think your setup meets those standards, you'll need to use Microsoft (TM) GitHub (R) to contact them.

            • VorpalWay 10 hours ago

              In other words, it is a clear centralization drive. No two ways about it.

              • eesmith 8 hours ago

                PyPI is already centralized.

                Back when I started with PyPI, manual upload through the web interface was the only possibility. Have they gotten rid of that?

                My understanding is that "trusted publishing"[0] was meant as an additional alternative to that sort of manual processing. It was never decentralized. As I recall, the initial version only supported GitHub and (I think) GitLab.

                [0] I do not trust Microsoft as an intermediary to my software distribution. I don't use Microsoft products or services, including GitHub.

                Yes, this makes contacting PyPI support via GitHub impossible for me. That is one of the reasons I stopped using PyPI and instead distribute my wheels from my own web site.

              • jopsen 9 hours ago

                npm is centralized to start with, so how is this a problem?

    • epage a day ago

      I'm not familiar with the npm ecosystem, so maybe I'm misunderstanding this, but it sounds like they removed support for local publishes (via a token) in favor of CI publishing using Trusted Publishing.

      If that is correct: my understanding from when Trusted Publishing was proposed for Rust was that it was not meant to replace local publishing, only to harden CI publishing.

      • woodruffw a day ago

        > If that is correct: my understanding from when Trusted Publishing was proposed for Rust was that it was not meant to replace local publishing, only to harden CI publishing.

        Yes, that's right, and that's how it was implemented for both Rust and Python. NPM seems to have decided to do their own thing here.

        (More precisely, I think NPM still allows local publishing with an API token, they just won't grant long-lived ones anymore.)

        • the_mitsuhiko a day ago

          I think the path to dependency on closed publishers was opened wide with the introduction of both attestations and trusted publishing. People now have assigned extra qualities to such releases and it pushes the ecosystem towards more dependency on closed CI systems such as github and gitlab.

          It was a good intention, but I don't think its ramifications are great.

          • woodruffw a day ago

            > People now have assigned extra qualities to such releases and it pushes the ecosystem towards more dependency on closed CI systems such as github and gitlab.

            I think this is unfortunately true, but it's also a tale as old as time. I think PyPI did a good job of documenting why you shouldn't treat attestations as evidence of security modulo independent trust in an identity[1], but the temptation to verify a signature and call it a day is great for a lot of people.

            Still, I don't know what a better solution is -- I think there's general agreement that packaging ecosystems should have some cryptographically sound way for responsible parties to correlate identities to their packages, and that previous techniques don't have a great track record.

            (Something that's noteworthy is that PyPI's implementation of attestations uses CI/CD identities because it's easy, but that's not a fundamental limitation: it could also allow email identities with a bit more work. I'd love to see more experimentation in that direction, given that it lifts the dependency on CI/CD platforms.)

            [1]: https://docs.pypi.org/attestations/security-model/

          • blibble 21 hours ago

            > It was a good intention, but I don't think its ramifications are great.

            as always, the road to hell is paved with good intentions

            the term "Trusted Publishing" implies everyone else is untrusted

            quite why anyone would think Microsoft is considered trustworthy, or competent at operating critical systems, I don't know

            https://firewalltimes.com/microsoft-data-breach-timeline/

            • woodruffw 20 hours ago

              > the term "Trusted Publishing" implies everyone else is untrusted

              No, it just means that you're explicitly trusting a specific party to publish for you. This is exactly the same as you'd normally do implicitly by handing a CI/CD system a long-lived API token, except without the long-lived API token.

              (The technique also has nothing to do with Microsoft, and everything to do with the fact that GitHub Actions is the de facto majority user demographic that needs targeting whenever doing anything for large OSS ecosystems. If GitHub Actions was owned by McDonalds instead, nothing would be any different.)

              • fc417fc802 14 hours ago

                > This is exactly the same as you'd normally do implicitly by handing a CI/CD system a long-lived API token, except without the long-lived API token.

                The other difference is being subjected to a whitelisting approach. That wasn't previously the case.

                It's frustrating that seemingly every time better authentication schemes get introduced they come with functionality for client and third party service attestation baked in. All we ever really needed was a standardized way to limit the scope of a given credential coupled with a standardized challenge format to prove possession of a private key.

                • woodruffw 4 hours ago

                  > The other difference is being subjected to a whitelisting approach. That wasn't previously the case.

                  You are not being subjected to one. Again: you can always use an API token with PyPI, even on a CI/CD platform that PyPI knows how to do Trusted Publishing against. It's purely optional.

                  > All we ever really needed was a standardized way to limit the scope of a given credential coupled with a standardized challenge format to prove possession of a private key.

                  That is what OIDC is. Well, not for a private key, but for a set of claims that constitute a machine identity, which the relying party can then do whatever it wants with.

                  But standards and interoperability don't mean that any given service will just choose to federate with every other service out there. Federation always has up-front and long-term costs that need to be balanced with actual delivered impact/value; for a single user on their own server, the actual value of OIDC federation versus an API token is nil.

                  • fc417fc802 2 hours ago

                    Right, I meant that the new scheme is subject to a whitelist. I didn't mean to imply that you can't use the old scheme anymore.

                    > Federation always has up-front and long-term costs

                    Not particularly? For example there's no particular cost if I accept email from outlook today but reverse that decision and ban it tomorrow. I don't immediately see a technical reason to avoid a default accept policy here.

                    > for a single user on their own server, the actual value of OIDC federation versus an API token is nil.

                    The value is that you can do away with long lived tokens that are prone to theft. You can MFA with your (self hosted) OIDC service and things should be that much more secure. Of course your (single user) OIDC service could get pwned but that's no different than any other account compromise.

                    I guess there's some nonzero risk that a bunch of users all decide to use the same insecure OIDC service. But you might as well worry that a bunch of them all decide to use an insecure password manager.

                    > Well, not for a private key, but for a set of claims that constitute a machine identity

                    What's the difference between "set of claims" and "private key" here?

                    That last paragraph in GP was more a tangential rant than directly on topic, BTW. I realize that OIDC makes sense here. The issue is that as an end user I have more flexibility and ease of use with my SSH keys than I do with something like a self hosted OIDC service. I can store my SSH keys on a hardware token, or store them on my computer blinded so that I need a hardware token or TPM to unlock them, or lots of other options. The service I'm connecting to doesn't need to know anything about my workflow. Whereas with self hosting something like OIDC, managing and securing the service becomes an entire thing in itself, on top of which many services arbitrarily dictate "thou shalt not self host".

                    It's a general trend that as new authentication schemes have been introduced they have generally included undesirable features from the perspective of user freedom. Adding insult to injury those unnecessary features tend to increase the complexity of the specification. In contrast, it's interesting to think how things might work if what we had instead was a single widely accepted challenge scheme such as SSH has. You could implement all manner of services such as OIDC on top of such a primitive while end users would retain the ability to directly use the equivalent of an SSH key.

                    • woodruffw an hour ago

                      > Not particularly? For example there's no particular cost if I accept email from outlook today but reverse that decision and ban it tomorrow. I don't immediately see a technical reason to avoid a default accept policy here.

                      Accepting email isn't really the same thing. I've linked some resources elsewhere in this thread that explain why OIDC federation isn't trivial in the context of machine identities.

                      > The value is that you can do away with long lived tokens that are prone to theft. You can MFA with your (self hosted) OIDC service and things should be that much more secure. Of course your (single user) OIDC service could get pwned but that's no different than any other account compromise.

                      You can already do this by self-attenuating your PyPI API token, since it's a Macaroon. We designed PyPI's API tokens with exactly this in mind.

                      (This isn't documented particularly well, since nobody has clearly articulated a threat model in which a single user runs their own entire attenuation service only to restrict a single or small subset of credentials that they already have access to. But you could do it, I guess.)
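
                      (For the curious, a minimal sketch of that self-attenuation, assuming the third-party pypitoken library; I'm writing the method and keyword names from memory, so treat them as approximate and check its docs:)

                        # Sketch: narrow an existing PyPI token to a single project before
                        # handing it to CI. Uses the third-party "pypitoken" library; exact
                        # keyword names may differ between versions.
                        import pypitoken

                        token = pypitoken.Token.load("pypi-AgEIcHlwaS5vcmc...")  # existing token
                        token.restrict(project_names=["myproject"])              # add a caveat
                        print(token.dump())                                      # attenuated token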

                      > What's the difference between "set of claims" and "private key" here?

                      A private key is a cryptographic object; a "set of claims" is (very literally) a JSON object that was signed over as the payload of a JWT. You can't sign (or encrypt, or whatever) with a set of claims naively; it's just data.

              • pamcake 18 hours ago

                I mean, if it meant the infrastructure operated under a franchising model with distributed admin like McD, it would look quite different!

                There is more than one way to interpret the term "trusted". The average dev will probably take away different implications than someone with your expertise and context.

                I don't believe this double meaning is an unfortunate coincidence but part of clever marketing. A semantic or ideological sleight of hand, if you will.

                In the same category: "Trusted Computing", "Zero trust" and "Passkeys are phishing-resistant"

                • woodruffw 17 hours ago

                  > I don't believe this double meaning is an unfortunate coincidence but part of clever marketing. A semantic or ideological sleight of hand, if you will.

                  I can tell you with absolute certainty that it really is just unfortunate. We just couldn’t come up with a better short name for it at the time; it was going to be either “Trusted Publishing” or “OIDC publishing,” and we determined that the latter would be too confusing to people who don’t know (and don’t care to know) what OIDC is.

                  There’s nothing nefarious about it, just the assumption that people would understand “trusted” to mean “you’re putting trust in this,” not “you have to use $vendor.” Clearly that assumption was not well founded.

                  • fc417fc802 14 hours ago

                    Maybe signed publishing or verified publishing would have been better terms?

                    • woodruffw 8 hours ago

                      It’s neither signed nor verified, though. There’s a signature involved, but that signature is over a JWT, not over the package.

                      (There’s an overlaid thing called “attestations” on PyPI, which is a form of signing. But Trusted Publishing itself isn’t signing.)

                      • fc417fc802 8 hours ago

                        Re signed - that is a fair point, although it raises the question, why is the distributed artifact not cryptographically authenticated?

                        Maybe I'm misunderstanding but I thought the whole point of the exercise was to avoid token compromise. Framed another way that means the goal is authentication of the CI/CD pipeline itself, right? Wouldn't signing a fingerprint be the default solution for that?

                        Unless there's some reason to hide the build source from downstream users of the package?

                        Re verified, doesn't this qualify as verifying that the source of the artifact is the expected CI/CD pipeline? I suppose "authenticated publishing" could also work for the same reason.

                        • woodruffw 6 hours ago

                          > why is the distributed artifact not cryptographically authenticated?

                          With what key? That’s the layer that “attestations” add on top, but with Trusted Publishing there’s no user- or package-associated signature.

                          > Maybe I'm misunderstanding but I thought the whole point of the exercise was to avoid token compromise. Framed another way that means the goal is authentication of the CI/CD pipeline itself, right? Wouldn't signing a fingerprint be the default solution for that?

                          Yes, the goal is to authenticate the CI/CD pipeline (what we’d call a “machine identity”). And there is a signature involved, but it only verifies the identity of the pipeline, not the package being uploaded by that pipeline. That’s why we layer attestations on top.

                          (The reasons for this are unfortunately nuanced but ultimately boil down to it being hard to directly sign arbitrary inputs with just OIDC in a meaningful way. I have some slides from talks I gave in the past that might help clarify Trusted Publishing, the relationship with signatures/attestations, etc.[1][2])

                          > I suppose "authenticated publishing" could also work for the same reason.

                          I think this would imply that normal API token publishing is somehow not authenticated, which would be really confusing as well. It’s really not easy to come up with a name that doesn’t have some amount of overlap with existing concepts, unfortunately.

                          [1]: https://yossarian.net/res/pub/packagingcon-2023.pdf

                          [2]: https://yossarian.net/res/pub/scored-2023.pdf

                          • fc417fc802 6 hours ago

                            > imply that normal API token publishing is somehow not authenticated

                            Fair enough, although the same reasoning would imply that API token publishing isn't trusted ... well after the recent npm attacks I suppose it might not be at that.

                            > With what key?

                            > And there is a signature involved,

                            So there's already a key involved. I realize its lifetime might not be suitable but presumably the pipeline itself either already possesses or could generate a long lived key to be registered with the central service.

                            > but it only verifies the identity of the pipeline,

                            I thought verifying the identity of the pipeline was the entire point? The pipeline signing a fingerprint of the package would enable anyone to verify the provenance of the complete contents (either they'd need a way to look up the key or you could do TOFU, but I digress). There's value in being able to verify the integrity of the artifacts in your local cache.

                            Also, the more independent layers of authentication there are the fewer options an attacker will have. A hypothetical artifact that carried signatures from the developer, the pipeline, and the registry would have a very clear chain of custody.

                            > it being hard to directly sign arbitrary inputs with just OIDC in a meaningful way

                            At the end of the day you just need to somehow end up in a situation where the pipeline holds a key that has been authenticated by the package registry. From that point on I'd think that the particular signature scheme would become a trivial implementation detail; you stuff the output into some json or something similar and get on with life.

                            Has some key complexity gone over my head here?

                            BTW please don't take this the wrong way. It's not my intent to imply that I know better. As long as the process works it isn't my intent to critique it. I was just honestly surprised to learn that the package content itself isn't signed by the pipeline to prove provenance for downstream consumers and from there I'm just responding to the reasoning you gave. But if the current process does what it set out to do then I've no grounds to object.

                            • woodruffw 4 hours ago

                              > So there's already a key involved. I realize its lifetime might not be suitable but presumably the pipeline itself either already possesses or could generate a long lived key to be registered with the central service.

                              The key involved is the OIDC IdP's key, which isn't controlled by the maintainer of the project. I think it would be pretty risky to allow this key to directly sign for packages, because this would imply that any party that can use that key for signing can sign for any package. This would mean that any GitHub Actions workflow anywhere would be one signing bug away from impersonating signatures for every PyPI project, which would be exceedingly not good. It would also make the insider risk from a compromised CI/CD provider much larger.

                              (Again, I really recommend taking a look at the talks I linked. Both Trusted Publishing and attestations were multi-year projects that involved multiple companies, cryptographers, and implementation engineers, and most of your - very reasonable! - questions came up for us as well while designing and planning this work.)

                              > I thought verifying the identity of the pipeline was the entire point? The pipeline signing a fingerprint of the package would enable anyone to verify the provenance of the complete contents (either they'd need a way to look up the key or you could do TOFU, but I digress). There's value in being able to verify the integrity of the artifacts in your local cache.

                              There are two things here:

                              1. Trusted Publishing provides a verifiable link between a CI/CD provider (the "machine identity") and a packaging index. This verifiable link is used to issue short-lived, self-scoping credentials. Under the hood, Trusted Publishing relies on a signature from the CI/CD provider (which is an OIDC IdP) to verify that link, but that signature is only over a set of claims about the machine identity, not the package identity.

                              2. Attestations are a separate digital signing scheme that can use a machine identity. In PyPI's case, we bootstrap trust in a given machine identity by seeing if a project is already enrolled against a Trusted Publisher that matches that identity. But other packaging ecosystems may do other things; I don't know how NPM's attestations work, for example. This digital signing scheme uses a different key, one that's short-lived and isn't managed by the IdP, so that signing events can be made transparent (in the "transparency log" sense) and are associated more meaningfully with the machine identity, not the IdP that originally asserted the machine identity.

                              > At the end of the day you just need to somehow end up in a situation where the pipeline holds a key that has been authenticated by the package registry. From that point on I'd think that the particular signature scheme would become a trivial implementation detail; you stuff the output into some json or something similar and get on with life.

                              Yep, this is what attestations do. But a key piece of nuance: the pipeline doesn't "hold" a key per se; it generates a new short-lived key on each run and binds that key to the verified identity sourced from the IdP. This achieves the best of both worlds: users don't need to maintain a long-lived key, and the IdP itself is only trusted as an identity source (and is made auditable for issuance behavior via transparency logging). The end result is that clients that verify attestations don't verify using a specific key; they verify using an identity, and ensure that any particular key matches that identity as chained through an X.509 CA. That entire process is called Sigstore[1].

                              And no offense taken, these are good questions. It's a very complicated system!

                              [1]: https://www.sigstore.dev
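
                              (For the verification side, a hedged example of what identity-based verification looks like with the sigstore-python CLI; the identity and file name below are made up, and exact flags may differ between versions:)

                                # assumes the corresponding Sigstore bundle was downloaded next to the wheel
                                python -m sigstore verify identity \
                                  --cert-identity "https://github.com/example-org/example/.github/workflows/release.yml@refs/tags/v1.2.3" \
                                  --cert-oidc-issuer "https://token.actions.githubusercontent.com" \
                                  example-1.2.3-py3-none-any.whl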

                              • fc417fc802 3 hours ago

                                > I think it would be pretty risky to allow this key to directly sign for packages, because this would imply that any party that can use that key for signing can sign for any package.

                                There must be some misunderstanding. For trusted publishing a short lived API token is issued that can be used to upload the finished product. You could instead imagine negotiating a key (ephemeral or otherwise) and then verifying the signature on upload.

                                Obviously the signing key can't be shared between projects any more than the API token is. I think I see where the misunderstanding arose now. Because I said "just verify the pipeline identity" and you interpreted that as "let end users get things signed by a single global provider key" or something to that effect, right?

                                The only difference I had intended to communicate was the ability of the downstream consumer to verify the same claim (via signature) that the registry currently verifies via token. But it sounds like that's more or less what attestation is? (Hopefully I understood correctly.) But that leaves me wondering why Trusted Publishing exists at all. By the time you've done the OIDC dance why not just sign the package fingerprint and be done with it? ("We didn't feel like it" is of course a perfectly valid answer here. I'm just curious.)

                                I did see that attestation has some other stuff about sigstore and countersignatures and etc. I'm not saying that additional stuff is bad, I'm asking if Trusted Publishing wouldn't be improved by offering a signature so that downstream could verify for itself. Was there some technical blocker to doing that?

                                > the IdP itself is only trusted as an identity source

                                "Only"? Doesn't being an identity source mean it can do pretty much anything if it goes rogue? (We "only" trust AD as an identity source.)

                                • woodruffw 11 minutes ago

                                  > There must be some misunderstanding. For trusted publishing a short lived API token is issued that can be used to upload the finished product. You could instead imagine negotiating a key (ephemeral or otherwise) and then verifying the signature on upload.

                                  From what authority? Where does that key come from, and why would a verifying party have any reason to trust it?

                                  (I'm not trying to be tendentious, so sorry if it comes across that way. But I think you're asking good questions that lead to the design that we arrived at with attestations.)

                                  > I did see that attestation has some other stuff about sigstore and countersignatures and etc. I'm not saying that additional stuff is bad, I'm asking if Trusted Publishing wouldn't be improved by offering a signature so that downstream could verify for itself. Was there some technical blocker to doing that?

                                  The technical blocker is that there's no obvious way to create a user-originated key that's verifiably associated with a machine identity, as originally verified from the IdP's OIDC credential. You could do something like mash a digest into the audience claim, but this wouldn't be very auditable in practice (since there's no easy way to shoehorn transparency atop that). But some people have done some interesting exploration in that space with OpenPubKey[1], and maybe future changes to OIDC will make something like that more tractable.

                                  > "Only"? Doesn't being an identity source mean it can do pretty much anything if it goes rogue? (We "only" trust AD as an identity source.)

                                  Yes, but that's why PyPI (and everyone else who uses Sigstore) mediates its use of OIDC IdPs through a transparency logging mechanism. This is in effect similar to the situation with CAs on the web: a CA can always go rogue, but doing so would (1) be detectable in transparency logs, and (2) would get them immediately evicted from trust roots. If we observed rogue activity from GitHub's IdP in terms of identity issuance, the response would be similar.

                                  [1]: https://github.com/openpubkey/openpubkey

                  • pamcake 9 hours ago

                    Thanks for replying.

                    I'm certainly not meaning to imply that you are in on some conspiracy or anything - you were already in here clarifying things and setting the record straight in a helpful way. I think you are not representative of industry here (in a good way).

                    Evangelists are certainly latching on to the ambiguity and using it as an opportunity. Try to pretend you are a caveman dev or pointy-hair and read the first screenful of this. What did you learn?

                    https://github.blog/changelog/2025-07-31-npm-trusted-publish...

                    https://learn.microsoft.com/en-us/nuget/nuget-org/trusted-pu...

                    https://www.techradar.com/pro/security/github-is-finally-tig...

                    These were the top three results I got when I searched online for "github trusted publishing" (without quotes like a normal person would).

                    Stepping back, could it be that some stakeholders have a different agenda than you do and are actually quite happy about confusion?

                    I have sympathy for the fact that naming things is hard. But this is Trusted Computing on repeat, marketed to a generation of laymen who don't have that context. Also similar vibes to the centralization of OpenID/OAuth from the last round.

                    On that note, looking at past efforts, I think the only way this works out is if it's open for self-managed providers from the start, not by selective global allowlisting of blessed platform partners one by one on the platform side. Just like for email, it should be sufficient with a domain name and following the protocol.

        • greggman65 13 hours ago

          Rust and Python appear to still allow long-lived ones, so it's only a matter of time until they get the same issues, it would seem?

          • woodruffw 3 hours ago

            For whatever reason, we haven't seen the same degree of self-perpetuating credential disclosure in either Rust or Python as an ecosystem. Maybe that trend won't hold forever, but that's the distinguishing feature here.

    • jacquesm 4 hours ago

      > I think a lot of the pain here is self-inflicted on GitHub's part

      It is spelled 'Microsoft'.

      What did you think would happen long term? I remember when that acquisition happened and there were parties thrown all around: MS finally 'got' open source.

      And never mind feeding all of the GitHub contents to their AI.

      • woodruffw 3 hours ago

        My point was that these are political and logistical problems latent to GitHub/Microsoft/whatever, not to Trusted Publishing as a design. I don't think I materially disagree with you about Microsoft not having a sterling reputation.

        • jacquesm 3 hours ago

          Yes, but I think that that more than anything is the driver behind these decisions.

          • woodruffw 9 minutes ago

            Which ones? It wasn't a driver behind our decisions when we designed Trusted Publishing originally; the fact that GitHub has been such a mess has been a consistent source of tsuris in my life.

    • a day ago
      [deleted]
    • pamcake 18 hours ago

      "In its current form" is in context of NPM, where I think it's accurate.

      Great to see PyPI taking a more reasonable path.

  • spankalee a day ago

    I maintain some very highly used npm packages and this situation just has me on edge. In our last release of dozens of packages, I was manually reading through our package-lock and package.json changes and reviewing every dependency change. Luckily our core libraries have no external dependencies, but our tooling has a ton.

    We were left with a tough choice of moving to Trusted Publishers or allowing a few team members to publish locally with 2FA. We decided on Trusted Publishers because we've had an automated process with review steps for years, but we understand there's still a chance of a hack, so we're just extremely cautious with any PRs right now. Turning on Trusted Publishers was a huge pain with so many packages.

    The real thing we want for publishing is to be able to continue using our CI-based publishing setup, with Trusted Publishers, but with a human-in-the-loop 2FA step.

    But that's only part of a complete solution. HITL is only guaranteed to slow down malicious code propagating. It doesn't actually protect our project against compromised dependencies, and doesn't really help prevent us from spreading them. All of that is still a manual responsibility of the humans. We need tools to lock down and analyze our dependencies better, and tools to analyze our own packages before publishing. I also want better tools for analyzing and sandboxing 3rd-party PRs before running CI. Right now we have HITL there, but we have to manually investigate each PR before running tests.
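
    One partial workaround for the human-in-the-loop part, assuming GitHub Actions (and accepting that it is an approval gate rather than true 2FA), is to put the publish job behind a deployment environment with required reviewers configured in the repo settings; "npm-publish" below is a hypothetical environment name, and the final step assumes a recent npm CLI with Trusted Publishing already set up:

      jobs:
        publish:
          runs-on: ubuntu-latest
          environment: npm-publish   # required reviewers on this environment force a manual approval
          permissions:
            id-token: write          # lets the npm CLI mint a short-lived publish credential
          steps:
            - uses: actions/checkout@v4
            - uses: actions/setup-node@v4
              with:
                node-version: 22
            - run: npm install -g npm@latest   # trusted publishing needs a recent npm CLI
            - run: npm ci
            - run: npm publish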

  • yoan9224 5 hours ago

    The comment about maintainers not getting paid resonates. I'm a solo founder and these security changes, while necessary, add real friction to shipping.

    The irony is that trusted publishing pushes everyone toward GitHub Actions, which centralizes risk. If GHA gets compromised, the blast radius is enormous. Meanwhile solo devs who publish from their local machine with 2FA are arguably more secure (smaller attack surface, human in the loop) but are being pushed toward automation.

    What I'd like to see: a middle ground where trusted publishing works but requires a 2FA confirmation before the publish actually goes through. Keep the automation, keep the human gate. Best of both worlds.

    The staged publishing approach mentioned in the article is a good step - at least you'd catch malicious code before it hits everyone's node_modules.

  • herpdyderp a day ago

    The shift wouldn't have been so turbulent if npm had simply updated their CLI in tandem. I still can't use 2FA to publish because their CLI simply cannot handle it.

    • pamcake 18 hours ago

      CLI publishing with TOTP 2FA worked fine until they broke it.

  • jonkoops 9 hours ago

    I really think that the main issue is that NPM itself will execute any script that is in the "postinstall" section of a package, without asking the user for permission. This is a solved problem in other package managers: e.g. PNPM will only run scripts if the user allows them, and stores the allowlist in the package.json file for future reference.

    In this scenario, if a dependency were to add a "postinstall" script because it was compromised, it would not execute, and the user can review whether it should, greatly reducing the attack surface.
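
    For reference, the allowlist looks roughly like this in package.json with recent pnpm versions (the "esbuild" entry is just an example of a dependency you have explicitly allowed to run its build/postinstall scripts):

      {
        "pnpm": {
          "onlyBuiltDependencies": ["esbuild"]
        }
      }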

    • fergie 7 hours ago

      There is a large subset of security problems that are solved by simply eliminating compilation steps typically included in "postinstall". If you want a more secure, more debuggable, more extensible lib, then you should definitely publish it in pure JS (rather than, say, TypeScript), so that there is no postinstall attack surface.

      • WorldMaker 4 hours ago

        With type stripping in Node LTS now, there's no reason at all to have a postinstall for TypeScript code either. There are also fewer reasons you can't publish a "pure TS" library.

    • m90 8 hours ago

      Wouldn't this just make the number of packages that can be targeted smaller? E.g. I publish a testrunner that needs to install Headless Chrome if not present via postinstall. People trust me and put the package on their allowlist. My account gets compromised and a malicious update is published. People execute malicious code they have never vetted.

      I do understand this is still better than npm right now, but it's still broken.

      • jonkoops 5 hours ago

        Security is layered: no layer will conclusively keep you safe, but each one makes it harder to pierce to the core. For example, the impact of the recent SHA1-Hulud attack would have been much less, as compromised packages (that previously did not have any scripts executing at install time) would not suddenly start executing, since they are not allowlisted.

      • acdha 7 hours ago

        Security is usually full of incremental improvements like that, however. Reducing the scope from all of NPM to the handful of things like test runners would be an enormous benefit for auditors and would encourage consolidation (e.g. most testing frameworks could consolidate on a single headless chrome package), and in the future this could be further improved by things like restricting the scope of those scripts using the operating system sandbox features.

  • sarreph 8 hours ago

    I really don't think this should be a registry-level issue. As in, the friction shouldn't be introduced into _publishing_ workflows; it should be introduced into _subscription_ workflows, where there is an easy fix. Just stop supporting auto-update (through wildcard patch or minor versions) by default... Make the default behaviour to install whatever version you load at install time (like `npm ci` does).
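
    For consumers who want something close to that today (a partial fix on the npm side, not a registry-level one), two existing knobs are saving exact versions instead of ranges and installing strictly from the lockfile:

      npm config set save-exact true   # new dependencies get pinned versions, not ^ranges
      npm ci                           # installs exactly what package-lock.json records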

  • dmarwicke 5 hours ago

    2FA publishing still doesn't work for me. I just use legacy tokens at this point; I gave up trying to figure out what's wrong.

  • lloydatkinson a day ago

    Just to be clear, "trusted publishing" means a type of reverse vendor lock in? Only some CI systems are allowed to be used for it.

    • woodruffw 20 hours ago

      "Trusted Publishing" is just a term of art for OIDC. NPM can and should support federating with CI/CD platforms other than GitHub Actions, to avoid even the appearance of impropriety.

      (It makes sense that they'd target GHA first, since that's where the majority of their users probably are. But the technique itself is fundamentally platform agnostic and interoperable.)

      • thayne 18 hours ago

        Currently only GHA and Gitlab are supported.

    • LtWorf a day ago

      Yes. You cannot set up your own.

  • pyrolistical 20 hours ago

    Seems like requiring 2FA to publish or trusted publishing should prevent the vast majority of this issue.

    The only tricky bit would be to disallow approving your own pull request when using trusted publishing. That should fall back to requiring 2FA.

    • thayne 18 hours ago

      It also makes it impossible to publish using CI, which is problematic for projects with frequent releases. And trusted publishing doesn't solve that if you use self-hosted CI.

      • fc417fc802 14 hours ago

        > trusted publishing doesn't solve that if you use self-hosted CI

        Is there any particular reason for the whitelist approach? Standing on the sidelines it appears wholly unnecessary to me. Authentication that an artifact came from a given CI system seems orthogonal to the question of how much trust you place in a given CI system.

        • thayne 14 hours ago

          Well, given that Github owns NPM, one potential reason could be vendor lock in.

          Also, from an implementation standpoint it is probably easier to make a system that just works for a handful of OIDC providers, than a more general solution. In particular, a general solution would require having a UI and maybe an API for registering NPM as a service provider for an identity provider of the package owner's choice.

          • fc417fc802 13 hours ago

            Is OIDC federation really so involved as to require a nontrivial registration step? Shouldn't this be an OAuth style flow where you initiate with the third party and then confirm the requested permissions with the service? Where did it all go wrong?

            • woodruffw 4 hours ago

              Each OIDC provider has its own claim formats, which Trusted Publishing needs to be aware of to accurately determine which set makes up a sufficient "identity" for publishing purposes. That's not easily generalizable across providers, at least not until someone puts the sweat and tears into writing some kind of standard claim profile for OIDC IdPs that provide CI/CD machine identities.
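
              For a concrete sense of what those claims look like, here is an illustrative (abridged, from memory) subset of what a GitHub Actions OIDC token carries; other providers use different names and shapes, which is exactly the generalization problem:

                {
                  "iss": "https://token.actions.githubusercontent.com",
                  "sub": "repo:example-org/example-repo:ref:refs/heads/main",
                  "repository": "example-org/example-repo",
                  "repository_owner": "example-org",
                  "workflow_ref": "example-org/example-repo/.github/workflows/release.yml@refs/heads/main",
                  "ref": "refs/heads/main"
                }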

              (This is also only half the problem: the Relying Party also needs to be confident that the IdP they're relying on is actually competent, i.e. can be trusted to maintain a private key, operationalize its rotation, etc. That's not something that can easily be automated.)

            • thayne 4 hours ago

              There needs to be a way for the user to tell NPM which IdP to trust. And then, not all IdPs support automated registration.

  • cvbnmb 18 hours ago

    I think this turbulent shift is going to push a lot of node devs elsewhere.

    I understand things need to be safe, but this is a recklessly fast transition.

  • ajross 21 hours ago

    I try to make this point when the subject comes up, but IMHO this is a lipstick-on-a-pig solution. The problem with npm security isn't stability or attestation; it's the impossibility of auditing all that garbage.

    The software is too fine-grained. Too many (way too many) packages from small projects or obscure single authors doing way too many things that are being picked up for one trivial feature. That's just never going to work. If you don't know who's writing your software the answer will always end up being "Your Enemies" at some point.

    And the solution is to stop the madness. Conglomerate the development. No more tiny things. Use big packages[1] from projects with recognized governance. Audit their releases and inclusion in the repository from a separate project with its own validation and testing. No more letting the bad guys push a button to publish.

    Which is to say: this needs to be Debian. Or some other Linux distro. But really the best thing is for the JS community (PyPI and Cargo are dancing on the edge of madness too) to abandon its mistake and move everything into a bunch of Debian packages. Won't happen, but it's the solution nonetheless.

    [1] c.f. the stuff done under the Apache banner, or C++ Boost, etc...

    • gr4vityWall 7 hours ago

      Agreed on the "this needs to be Debian" part. If some of the most popular JS packages were available through the system package manager as normal *.deb packages, I think people would be more likely to build on top of stable versions/releases.

      Stability (in the "it doesn't change" sense) is underrated.

    • 12345ieee 19 hours ago

      The fact is that being Debian is boring, and JS (python/rust/...) is *cool*.

      Give it a few more decades, hopefully it'll be boring by then, the same way, say, making a house is boring.

      • immibis 13 hours ago

        > the same way, say, making a house is boring.

        The government will mostly ban it to keep prices high?

  • TZubiri 18 hours ago

    I've been thinking: Java doesn't have many supply chain issues, and their model is based on namespacing with the DNS system. If I want a library from vendor.com, the library to import is somewhere under com.vendor.*

    Simple enough. Things like npm and pip reinvent a naming authority with no cost associated (so it's weak to Sybil attacks), all for not much. What do you get in exchange? You create equality by letting everyone contribute their wonderful packages, even those that don't have $15/yr? I'm sorry, was the previous leading internet mechanism not good and decentralized enough for you?

    Java's package naming system is great in design; the biggest vuln in dependencies that I can think of in Java was not a supply-chain-specific vuln, but rather a general weakness of a library (log4j). But maybe someone with more Java experience can point to some disadvantage of the Java system that explains why we are not all copying it.

    • woodruffw 18 hours ago

      I think Java’s DNS namespacing is, at best, only a weak benefit to the supply chain security posture of Java packaging as a whole. I think it’s more that Java is (1) a batteries-included language, (2) lacks the same pervasive open source packaging culture that Python, Rust, JS, etc. have, (3) is much more conservative around dependency updates as a community, and (4) lacks a (well-known?) build time code execution vector similar to JS’s install scripts or Python’s setup.py.

      (Most of these are good things, to be clear!)

      • fc417fc802 14 hours ago

        > lacks a (well-known?) build time code execution vector similar to JS’s install scripts or Python’s setup.py

        How is that leveraged by attackers in practice? Naively I would expect the actual issue to be insufficient sandboxing (network access in particular).

        • woodruffw 8 hours ago

          All of the recent “Shai-Hulud” attack waves leveraged build-time execution, since it’s a reliable way to actually execute code on a target (unlike putting the payload in the dependency itself, since the dependency’s own code might not run until much later.)

          Sandboxing would be a useful layer of defense, but it’s not a trivial one to add to ecosystems where execution on the host is already the norm and assumption.
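
          (On the npm side, one blunt but useful layer for consumers, assuming you can live without lifecycle scripts or re-enable them selectively, is to refuse to run install scripts at all:)

            # per-project .npmrc: never run install/postinstall scripts
            echo "ignore-scripts=true" >> .npmrc

            # or per invocation
            npm ci --ignore-scripts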

          • fc417fc802 8 hours ago

            I suppose I can understand the backwards compatibility angle. However at least personally I'm of the view that anything accessing the network during a build should be killed with fire. I draw a hard line against using dependencies that won't build in a network isolated environment.

            • woodruffw 7 hours ago

              Yeah, I think forbidding network access within build systems would be a great default to employ.

              (I wouldn’t be surprised to learn that a large number of packages in Python do in fact have legitimate network build-time dependencies. But it would be great to actually be able to quantify this so the situation could be improved.)

              • TZubiri 7 hours ago

                Is it really legitimate to have build-time network deps? It just means the full source wasn't published and there's some hidden source being downloaded.

                • woodruffw 6 hours ago

                  I don’t know, I don’t have a value position on it. I just think it does happen as a matter of course.

                  (Legitimate seems like a gray area to me — it’s common for applications to have a downloadable installer that then bootstraps the actual program, for example. Is this good or bad? I don’t know!)

        • immibis 13 hours ago

          If the attacker can't run code, does it matter whether they're not running code inside or outside of a sandbox?

          • fc417fc802 12 hours ago

            If you encase your computer in a block of cement, an attacker will have great difficulty breaking into it. Nevertheless, it might be useful to know if previous break-ins were facilitated by a buffer overflow, a misconfiguration, or something else. Probably you can arrive at a solution that is reasonably secure while being significantly more user friendly than the 55-gallon drum filled with a soon-to-be solid.

            More seriously - scenarios that call for executing arbitrary tools during a build are common, an increasing number of languages enjoy compile time code execution, and quite a few of those languages don't go out of their way to place any restrictions on the code that executes (many lisps for example).

      • rectang 16 hours ago

        Thanks for this insight-dense comment — and for all the efforts you have put into Trusted Publishing.

      • TZubiri 16 hours ago

        There being a compile/runtime difference at all seems quite impactful to dependency management as a whole, apparently; I've seen impacts in backwards compatibility, build times, and now security.

    • mrguyorama 2 hours ago

      The primary way supply chain issues in Java are addressed is the very simple way: You don't have a large supply chain.

      You have one or two megalibraries that are like 20 years old and battle tested and haven't really changed in forever.

      Then you have a couple specific libraries for your very specific problem.

      Then, you pin those versions. You probably even run your own internal repo for artifacts, so that you have full control over what code you pull into your CI.

      But none of this actually prevents supply chain attacks. What it does is drastically lower their profitability and success.

      Let's say you magically gain access to the Spring Boot framework's signing keys. You put out a malicious version that drops persistent threats and backdoors everywhere it can and pulls out any credit card numbers or whatever else it can find. The team behind Spring Boot takes like two weeks to figure it out, disclose the breach, and take down the malicious code.

      How many actual systems have even pulled that code in? Very few. Even a significant supply chain attack still requires significant luck to breach targets. In NPM land, this is not the case, and tons of things are pulling in the "latest" version of frameworks. You are much more likely to get someone to actually run your malicious code.
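
      For what it's worth, the same discipline is available on the npm side; a sketch of a project-level .npmrc that pins exact versions and routes installs through an internal mirror (the registry URL is a placeholder):

          # record exact versions instead of ^ ranges
          save-exact=true
          # resolve everything through an internal, curated mirror
          registry=https://npm.mirror.internal.example/
          # don't run third-party install scripts
          ignore-scripts=true

      Combined with a committed lockfile and npm ci in CI, nothing "latest" gets pulled in until a human deliberately updates the lockfile.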

  • m4rtink a day ago

    Can we finally declare this (and other incomplete language-specific package managers) to be a failed experiment and go back to a robust and secure distro-based package management workflow, with maintainers separate from upstream developers?

    • no_wizard a day ago

      It's a false belief that distro-based package management workflows are, or ever were, more secure. It's the same problem, maybe one step removed. Look at exploits like the xz backdoor.

      There was also the Python 2.7 problem for a long time: thanks to this model it couldn't be updated quickly, and developers, including the OS developers, became dependent on it being there by default and built things around it.

      Then when it EOL'd, it left a lot of people exposed to vulnerabilities and was quite the mess to update.

    • Macha a day ago

      The robust and secure distro-based package management workflow that shipped the xz backdoor to everyone, and broke OpenSSH key generation, and most of the functionality of KeePassXC?

      • TZubiri 18 hours ago

        > workflow that shipped the xz backdoor to everyone

        Isn't it the case that it didn't ship the backdoor? Precisely because of the thorough testing and vetting process?

        • Macha 9 hours ago

          No, it shipped in Debian Sid, OpenSUSE Tumbleweed and Fedora Rawhide, along with beta versions of Ubuntu 24.04 and Fedora 40. Arch also shipped it but the code looked for rpm/apt distros so the payload didn’t trigger.

          It was caught by a Postgres developer who noticed strange performance on their Debian Sid system, not by anyone involved with the distro packaging process.

          • gr4vityWall 7 hours ago

            In other words, it didn't hit any people running Stable distros, only users on Beta versions or rolling releases.

            Sounds like an improvement - having beta builds where people can catch these things before they arrive in a stable GNU/Linux distribution seems like the ideal workflow at first glance.

            • graemep 3 hours ago

              On top of that, the number of such issues is tiny compared to language package ecosystems.

              Distro packaging is not perfect, but it is much, much better.

          • TZubiri 7 hours ago

            App devs are part of the distro release process. They verify stability with other packages.

            It's an OS; it's a collaborative endeavour.

    • arccy a day ago

      where do you get all these trusted people to review your dependencies from?

      it can't be anyone, because you're essentially delegating trust.

      no way there's enough trustworthy volunteers (and how do you vet them all?)

      and who's going to pay them if they're not volunteers?

    • sunshowers 20 hours ago

      Language-specific package managers are a natural outgrowth of wanting portability across platforms.

    • rtpg a day ago

      When distros figure out how I can test my software with a dep at version A and the same dep at version B in a straightforward way, then we can talk.

      NPM forcing a human to click a button on release would have solved a lot of this stuff. So would have many other mitigations.
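
      On the version A vs. version B point, npm's alias syntax already makes side-by-side testing possible, which is exactly the kind of thing distro packaging struggles with. A sketch (package and versions are just examples):

          # install two major versions of the same package under different names
          npm install lodash-v3@npm:lodash@3 lodash-v4@npm:lodash@4

          // then compare behaviour in a test file
          const v3 = require('lodash-v3');
          const v4 = require('lodash-v4');

      Cargo and others have similar tricks (renamed dependencies), which is hard to replicate with a single system-wide package database.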

    • ashishb a day ago

      I run them inside a sandbox.

      The npm ecosystem is so big that one can never discard it for frontend development.

    • echelon a day ago

      Never in a million years.

      Rust's Cargo is sublime. System apt / yum / pacman / brew could never replace it.

      Cargo handles so much responsibility beyond what system packages cover that they couldn't even come close to replicating its utility.

      Checking language versions and editions, compiling macros and sources, cross-compiling for foreign architectures, linking, handling upgrades, transitive dependency versioning, handling conflicts, feature gating, optional compilation, custom linting and strictness, installing sidecars and CLI utilities, etc. etc.

      Once it's hermetic and namespaced, cargo will be better than apt / yum / etc. They're not really performing the same tasks, but cargo is just so damned good at such important things that it's hard to imagine a better tool.

      • socalgal2 11 hours ago

        It's got all the same issues as npm though. The fact that it's so cool makes it a magnet for adding deps. Rust's own docs generator pulls in > 700 deps

  • bikelang 19 hours ago

    Why is JS, in particular, so deeply afflicted with these issues? Why hasn’t there been an effort to create a more robust standard library? Or at least first party libraries maintained by the JavaScript team? That way folks can pull in trusted deps instead of all the hodgepodge.

    Go did a lot wrong. It was just awful before they added Go modules. But it's puzzling to me why, as a community and ecosystem, its 3rd-party dependencies seem so much less bloated. Part of it, I think, is that the standard library is pretty expansive. Part of it is things like golang.org/x. But there are also a lot of corporate maintainers - and I feel like part of that is because packages are namespaced to the repository - which itself is namespaced to ownership. Technically that isn't even a requirement - but the community adopted it pretty evenly - and it makes me wonder why others haven't.

    • thayne 18 hours ago

      JavaScript is a standard with many implementations. Any addition to the "standard library" (such as it is) has to go through a long process to get approved by a committee, then in turn be implemented by at least the major engines (V8, SpiderMonkey, JavaScriptCore).

      > Or at least first party libraries maintained by the JavaScript team?

      There is no "JavaScript team".

    • pamcake 7 hours ago

      Mainly I think because of the scale it's used at, the things it's used for, and the people who use it.

      Technicalities are secondary to those factors.

      If Ruby was the only language that ran in the browser, you'd be writing the same rant about Ruby, no matter the stdlib.

    • bblaylock 19 hours ago

    There has been an effort both to replace npm with a better model and to provide a stable standard library. See https://jsr.io/@std

      • gr4vityWall 7 hours ago

        Last time I checked, the only way to publish to it was through a GitHub account. I hope support for other providers has been, or will be, added.

      • bikelang 16 hours ago

        Is the ambition to lift this out of Deno? Bun brings its own standard lib too right? Are the two coordinating at all?

        • WorldMaker 4 hours ago

          The stuff on JSR is lifted out of Deno. JSR can install packages for Node and Bun [0]. Most of the "@std" packages in the link above claim support for Bun (the right-hand "stack of avatars" in the package list will include the Bun avatar; it's easier to see on individual package pages, where it becomes a header), and there is a Bun test matrix in the GitHub Actions CI. (Right now it looks like the matrix just has Bun latest, though.)

          In terms of coordination, I don't see any obvious Bun contributors in a quick skim [1], but it seems open to contribution and is MIT licensed.
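
          For anyone curious, the flow described in [0] is roughly the following (the package name is just an example; the docs are authoritative):

              # Deno
              deno add jsr:@std/encoding

              # Node, via the jsr CLI shim
              npx jsr add @std/encoding

              # Bun
              bunx jsr add @std/encoding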

          [0] https://jsr.io/docs/using-packages#adding-a-package

          [1] https://github.com/denoland/std/graphs/contributors