43 comments

  • j1elo 32 minutes ago

    Question about the license choice: we've now seen so many projects that started as FOSS with a longer-term plan of monetization and/or corporate-tier support, only for their license choice to let bigger players simply take the code and run a competing service with proprietary extensions (which is exactly what something as permissive as MIT allows). Isn't there any worry that this could happen again here?

    I'm curious whether AGPL shouldn't be more common (even though it's not a silver bullet) - MIT projects with foreseeable monetization needs for long-term survival never cease to show up, despite so much FOSS drama in the last couple of years.

  • antonyh 10 hours ago

    I appreciate the honesty about using Claude and the time it took to build this, and it shows how things can look when guided by someone who knows what they are doing.

    On the other hand, it also shows that it took three weeks, so why should I use this instead of building a custom toolchain myself that is optimised for what I need and actually use? Trimming away the 45+ formats to the 5 or so that matter to my project. It raises the question - is 'enterprise' software doomed in favour of a proliferation of custom built services where everybody has something unique, or is the real value in the 'support' packages and SLAs? Will devs adopt this and put 'Artifact Keeper' on their CV, or will they put 'built an artifact toolchain with Claude'?

    But then again, kudos to you for building something that can (and probably should) eat the lunch of the enterprise-grade tools that are simply unaffordable to small business, individual contractors, and underfunded teams. Truth be told, I'm not going to build my own, so this is certainly something I want to put in a sandbox and try out, and also this is inspirational and may finally convince me that I should give Claude a fair go if it's capable of being guided to create high quality output.

    • 0x457 an hour ago

      Coding agents changed the "build vs buy" dynamics, in my opinion. Hopefully it will result in SaaS vendors dropping the pay-gating of SSO.

    • esafak 2 hours ago

      Why would you re-invent the wheel? Are the existing options that bad?

    • raphinou 10 hours ago

      I'm impressed with the speed of development, though I haven't looked at the quality of the code. I'm using GLM and Kimi K2.5, and I have a lot of corrections to apply to the code. Is Claude that much better? Or is my process bad? OP: what's your development process?

      • antonyh 9 hours ago

        I've not done enough Rust to truly know, but it looks reasonable from looking at the tests, a few models, and some of the implementation code.

        It doesn't use the 'unsafe' keyword anywhere, but that's not necessarily an indicator. It uses unsafe-libyaml, which is what it sounds like (a hacky port of libyaml) but is no longer maintained (archived on GitHub in March 2024); there may be better choices. An SBOM would highlight these dependencies better than me doing random searches through the code.

        I'm not sure I'd have defaulted the OIDC callback to localhost; that's about the only thing I've seen in a quick 5-minute skim. I do like the comments and the lack of emojis :-)

        I too would like to know the process, if OP is willing to share.

        • bsgeraci 8 hours ago

          I have had Claude go back and forth between a code-simplifier agent (which it developed) and a security agent.

          I think adding this to your workflow helps, but you have to keep end-to-end testing in mind, because some changes can break things real fast.

          My process is pretty plain outside of paying Anthropic too much money a month. The only extra thing I am using currently is Beads. I was using Spec Kit and ralph-loop, but as of last week they don't seem to be needed. I think Anthropic is baking some of these tools into Claude Code.

          • antonyh 7 hours ago

            Sounds really clean and simple, combined with classic developer diligence and hard effort to get it built right. Thanks for sharing.

      • bsgeraci 8 hours ago

        Claude is... unfortunately... that much better. They really know how to build tooling that integrates into the CLI, which just makes the flow so much better.

        The only extra stuff I am doing now is beads. https://github.com/steveyegge/beads

        I was using Spec Kit and ralph-loop, but I think Anthropic baked that ralph-loop in. It's basically a dumb `while true` that keeps running until your break condition is met.
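
        For anyone unfamiliar, the loop idea is easy to sketch. A minimal Python version, where `run_agent` is a hypothetical stand-in for whatever invokes the coding agent:

```python
def ralph_loop(run_agent, done_marker="DONE", max_iters=50):
    """Naive 'ralph loop': keep invoking the agent with the same task
    until its output contains a completion marker, or give up at the cap.
    `run_agent` is a hypothetical callable standing in for the agent CLI."""
    for i in range(1, max_iters + 1):
        output = run_agent()
        if done_marker in output:
            return i  # how many iterations it took
    return None  # cap reached without the marker
```

        The real tools are fancier (state, prompts, guardrails), but the control flow is this dumb on purpose.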

    • bsgeraci 9 hours ago

      I would say don't trust it yet, but do use it and try it. Hopefully it can earn trust over time through people using it.

      • antonyh 7 hours ago

        Trust it to proxy artifacts from the web? Yes I think so.

        Trust it not to leak credentials? No, that's something that is never taken for granted.

        Trust it to hold a full history of uploaded binaries? That depends on the value of the releases. For incubator work, or web projects, or even Appstore apps where it's released to those stores to manage, maybe there should be enough trust. I just wouldn't use it for code where I want access to many stable versions, and I wouldn't put it publicly on the web either - not that I would do so with Sonatype Nexus without vendor support and many safeguards. I think it'll earn trust over time, once folk are convinced to use it for real workloads.

        There are a lot of forms of trust.

  • stroebs 11 hours ago

    I’m a fairly heavy user of the JFrog platform with Enterprise+, Xray, their new Curation license, and my org is spending in excess of $500k/year on Artifact storage. Not including my time babysitting it. I’d love to see the end of it, and I hope you manage to build a community around this.

    Part of the reason we pay the big license fee is so we have someone to turn to when it inevitably breaks because we’ve used it in a way nobody has before. In Jan last year we were using 30TB of artifact storage in S3. That’s 140TB today.

    Where do you get your CVE data? Would built artifacts have their CVEs updated after the fact? Do you have blocking policies on artifacts based on CVEs, licenses, artifact age, etc?

    • bsgeraci 9 hours ago

      I am using OpenSCAP and Trivy. Could you add a discussion to my GitHub about some of this? I would love your feedback on what you need at your level. I need to check the update mechanism so we keep the database up to date. I also want a way to keep it up to date when it is air-gapped - not everyone's use case, but one I have dealt with at my jobs.

      I still need to put some e2e testing on those policies. Here is a demo where you can add a policy: https://demo.artifactkeeper.com/security/policies - again, I need to build a series of end-to-end tests for that one, but it was designed with this in mind :) I really want a staging area and promotion of packages after scans.

      On my list of things to do.

      • stroebs 7 hours ago

        I'll carve out some time to add a discussion as I've become quite passionate about artifact storage in the last 18 months as a result of having to look after this behemoth. Air-gapping is also pretty important - JFrog supports granular proxy specification by repo.

        It's a great start. What I can say is that granularity of CVEs in policies will become important for larger consumers. We have about 4.5 million artifacts, so even getting CVSSv3 10s blocked was a challenge, let alone 9.8s.
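
        To make the granularity point concrete, here is a minimal Python sketch of a blocking policy. The fields and thresholds are illustrative, not Artifact Keeper's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Hypothetical blocking policy: block on CVSS at or above a threshold,
    on disallowed licenses, or on artifacts older than max_age_days."""
    max_cvss: float = 9.8
    allowed_licenses: set = field(default_factory=lambda: {"MIT", "Apache-2.0"})
    max_age_days: int = 3650

def evaluate(policy, cvss_scores, license_id, age_days):
    """Return a list of human-readable violations; empty list = allowed."""
    violations = []
    for score in cvss_scores:
        if score >= policy.max_cvss:
            violations.append(f"CVSS {score} >= {policy.max_cvss}")
    if license_id not in policy.allowed_licenses:
        violations.append(f"license {license_id} not allowed")
    if age_days > policy.max_age_days:
        violations.append(f"age {age_days}d exceeds {policy.max_age_days}d")
    return violations
```

        At 4.5 million artifacts the hard part is not this check; it is scoping which repos, paths, and exception lists each policy applies to.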

    • raphinou 11 hours ago

      I looked at your profile but didn't see any contact info, hence this comment. I'm working on a fully open-source multisig solution for artifact authentication. I would be interested in your opinion, and in whether you see opportunities for such a project in companies like the one you work for, to make it financially sustainable. Can you contact me? (Email in my profile)

      Edit: the project if anyone reading this is interested: http://github.com/asfaload/asfaload (looking for feedback!)

    • eyeris 11 hours ago

      Since the cve data is from Trivy/Grype, that should be osv.dev

    • M0r13n 11 hours ago

      JFrog's platform is fairly robust. Only time will tell if this project can keep up. I highly doubt it's more than a fancy-looking prototype at this stage.

      • gjvc 9 hours ago

        tell me mr armchair general, what have you done that's worth talking about?

        • M0r13n an hour ago

          My comment was not intended to be any criticism or to downplay the performance - quite the opposite :)

        • bsgeraci 9 hours ago

          I think it is right to be skeptical, and I hope this project can prove people wrong.

    • moezd 10 hours ago

      Unfortunately I'm also in the same camp, with SBOM generation, Xray, Curation, the whole shebang. I couldn't find these in the docs either, which would matter in my case.

  • kamma4434 11 hours ago

    I have been looking for ways to only use local packages for our software builds. I am looking for something that can act as a local cache for Java and NPM packages. The idea would be that developers can only use packages belonging to the allowed set for development, and there is a vetting process where packages are added to the allowed set (or removed).
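
    The gate itself is trivial to sketch; the work is the vetting process around it. A minimal Python version, where the (ecosystem, name, version) tuple key is an assumption for illustration:

```python
class AllowSet:
    """Minimal sketch of a vetted allow-set: developers can only resolve
    packages a reviewer has explicitly admitted; revocation removes them."""

    def __init__(self):
        self._allowed = set()

    def admit(self, ecosystem, name, version):
        self._allowed.add((ecosystem, name, version))

    def revoke(self, ecosystem, name, version):
        self._allowed.discard((ecosystem, name, version))

    def check(self, ecosystem, name, version):
        # A caching proxy would consult this before serving any upstream fetch.
        return (ecosystem, name, version) in self._allowed
```

    In practice you would want the proxy (not the client) to enforce this, so a developer machine cannot bypass the set by talking to the public registry directly.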

    I have been playing with the idea of using a single git repository to host them, Java packages as an Ivy repository and JavaScript packages as simply the contents of node_modules.

    Does anybody do something similar?

  • no_circuit 5 hours ago

    Impressive-looking project generated with AI help. I have similar goals of building an artifacts system myself.

    I think the approach of multi-format, multi-UI, and new (to you) programming language isn't optimal even with AI help. Any mistake that is made in the API design or internal architecture will impact time and cost since everything will need to be refactored and tested.

    The approach I'm trying to take for my own projects is to create a polished vertical slice and then ask the AI to replicate it for other formats / vertical slices. Are there any immediate use cases to even use and maintain a UI?

    So a few comments on the code:

    - the feature list claims rate limiting, but the code seems unused other than in unit tests... if so, why wasn't this dead code detected?

    - should probably follow Google/Buf style guide on protos and directory structure for them

    - besides protos, we probably need to rely more on the OpenAPI spec as well for code generation, to save on AI costs; I see the OpenAPI spec was only used as task input for the AI?

    - if the AI isn't writing a postgres replacement for us, why have it write anything to do with auth as well? perhaps have setup instructions to use something like Keycloak or the Ory system?

  • figmert 9 hours ago

    I've been wanting something like this that isn't Artifactory (I've run it at previous companies; it's not a great experience), so I had been thinking of doing it myself, but never bothered. One idea I had is to write a proxy that essentially translates the various package manager endpoints into OCI registry calls, so that everything is stored on an OCI backend. My thinking was that this way you could in theory use any OCI backend (including readily available, battle-tested self-hosted applications), but this proxy would never need its own state, thus making it (hopefully) easier to run.
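
    A rough Python sketch of that translation for npm tarball URLs. The OCI repository:tag layout below is invented for illustration, not a published mapping:

```python
def npm_to_oci(registry, request_path):
    """Map an npm tarball request (/<pkg>/-/<name>-<version>.tgz) to a
    hypothetical OCI repository:tag reference, so package contents could
    live in any OCI registry. Returns None for paths it can't parse."""
    parts = request_path.strip("/").split("/-/")
    if len(parts) != 2 or not parts[1].endswith(".tgz"):
        return None
    package, tarball = parts
    base = package.split("/")[-1]              # scoped: @scope/pkg -> pkg
    version = tarball[len(base) + 1 : -len(".tgz")]
    safe_name = package.replace("@", "").replace("/", "-")
    return f"{registry}/npm/{safe_name}:{version}"
```

    The hard cases are exactly the ones mentioned in the reply below this comment: search, version listing, and metadata endpoints, which have no natural OCI equivalent and would force the proxy to synthesize responses.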

    Now that you've implemented it, was there a reason you didn't go for such an approach, so that you'd have less to worry about as someone hosting something like this?

    • bsgeraci 7 hours ago

      I have used ORAS, and that might be an interesting approach, but I'm not sure how it would work with search, version listing, dependency resolution, metadata queries, permissions, and audit logs.

      Are you suggesting some hybrid approach?

  • the_harpia_io 6 hours ago

    The Trivy + Grype combo is interesting - in my experience they catch different things, especially on container scanning vs dependencies. Do you see them disagree much on severity?

    Re: the vibe coding angle - the thing I keep running into is that standard scanners are tuned for human-written code patterns. Claude code is structurally different. More verbose, weirdly sparse on the explicit error handling that would normally trigger SAST rules. Auth code especially - it looks textbook correct and passes static analysis fine, but edge cases are where it falls apart. Token validation that works great except for malformed inputs, auth checks that miss specific header combinations, that kind of thing.
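
    The kind of defensive parsing that matters here is easy to sketch: malformed Authorization headers should yield a clean rejection rather than an exception or a half-validated token. A minimal Python example (the function name and shape are hypothetical, not from the project):

```python
def parse_bearer_token(header):
    """Defensively parse an Authorization header.
    Returns the token string, or None for anything malformed:
    missing header, wrong scheme, extra fields, empty token."""
    if not isinstance(header, str):
        return None
    parts = header.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1]
```

    The AI-generated versions I have seen tend to handle the happy path and the obvious "no header" case, then throw on things like "Bearer" with no token or a header with three fields.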

    The policy engine sounds flexible enough that people could add custom rules for AI-specific patterns? That'd be the killer feature tbh.

  • nullocator 4 hours ago

    I see that this supports wasm plugins which is a neat feature, have you considered adding support for wasm plugins stored as oci images potentially in the registry itself? I looked at the documentation and it didn't seem like this was an option.

  • visualphoenix 8 hours ago

    Can this do a 302 redirect to S3? One neat feature of Artifactory Edge is that the asset download can skip the edge peer and go straight to S3.
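
    That redirect pattern is simple to sketch in Python. `presign` below is a stand-in for something like boto3's `generate_presigned_url`, and the handler shape is hypothetical:

```python
from http import HTTPStatus

def download_response(artifact_key, presign):
    """Answer a download request with a 302 to a short-lived presigned URL
    instead of streaming the bytes through the repository node itself.
    `presign` stands in for an object-store presigner (e.g. boto3)."""
    return HTTPStatus.FOUND, {"Location": presign(artifact_key)}  # 302
```

    The payoff is that large artifact downloads never consume repository bandwidth; the trade-off is that per-download authorization has to be baked into the presigned URL's expiry.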

    It would be cool if this could also support the existing Artifactory S3 backend format, so you could just point this at your existing Artifactory S3 bucket and migrate your DB over.

    Congrats on launching!

  • jurgenburgen 8 hours ago

    > Security scanning, SSO, replication, WASM plugins — it's all in the MIT-licensed release. No enterprise tier. No feature gates. No surprise invoices.

    I think it’s cool that the OSS version has everything but I hope you’re considering adding an actual enterprise tier for paid support because from my past experience that’s the killer feature large enterprises care about.

    If your OSS service becomes a mission-critical service (which an artifact repository usually is), a large org will have to invest in a team that can operate and own it anyway.

    If throwing some money at the vendor takes away some of the responsibility (= less time spent by in-house team on ops) then paying for an enterprise support SLA is a feature, not a bug.

    It would be great to see more competition in the space even though my current team isn’t working with this problem!

  • cadamsdotcom 6 hours ago

    Mad props on building with Claude Code but doing thoughtful design, and using tests to take yourself out of the loop but still thoughtfully architecting the important bits.

    These tools can’t architect clean solutions that cut out massive chunks of code, and they can’t talk to users and decide whether what they’re building makes sense. For that, we need a human touch.

    But coding agents grant insane leverage if they’re just told when they got it wrong and given a chance to get it right.

  • mrmattyboy 3 hours ago

    Honestly, this is just awesome.

    I've spent quite a long time looking at artifact storage, both for work and for personal use and this project literally scratches that itch. So featureful (assuming they're not placeholders ;) ) and yes, Claude Code, but still - the proof will be in whether it works (and how clean the codebase feels - you're making it sound promising :D ).

    Very excited to try this - well done :)

  • jamesvnz 9 hours ago

    Nice work. I'm building the same thing right now - partly because we need this and don't have the budget for Artifactory etc., and mainly to test out largely hands-free, agentic development.

    • bsgeraci 8 hours ago

      Feel free to use what I am building, but I also think more people just need to try and build something. We are almost in a Star Trek-style world where you are talking to a computer to make a holodeck program :) Sorry for the Trekkie talk.

      My recommendation for testing out hands-free agentic development: know that it is not fully hands-free. I find myself babysitting a lot of terminals going at once, like having a bunch of interns or junior developers.

      It is important to plan plan plan.

      I eventually want to switch to and play with self-hosted models, but for most agentic stuff Claude is killing it in terms of results.

  • seabass-salmon 5 hours ago

    Long-term Nexus custodian here. Last year's licence rug-pull by Sonatype had me thinking the same. I particularly loathe their new front-page "malware" warning saying you have to contact them to find out what it is. Sure.

    I've read the main README, so excuse me if these are covered already, but key features and/or opportunities:

    - a backend supporting Azure (Nexus has this under Pro, though the community edition does at least support S3)
    - a clear, navigable S3 structure that could be sorted by a human if needed, like the on-disk backend of Nexus 2 used to have, not like Nexus' current organisation/obfuscation (which would be understandable but for...)
    - maintenance routines that actually work (Nexus' are a joke, with very limited cleanup features and a task set that leaves ever-growing detritus)
    - automatically taking the latest from upstreams is a big problem in the npm world; this would be a perfect fit for introducing staging concepts and a window on upstream (proxied) repos
    - RESTful APIs and deep links to artifacts for ease of integration
    - we end up proxying other sources of files in a web proxy, since there's no easy "pass through" via Nexus where we don't want to copy the current files into our DB or S3 but just want to pass the latest to the consumer; a direct proxy feature with URL remapping would be cool

    Things I'd have to play around with to understand what it currently does:

    - whether it has proper proxy and group support; composition is completely essential
    - whether the caching is sensible (Nexus does a poor job when bad states get cached, though it's a hard problem)
    - efficient (Maven) metadata generation (Nexus is abysmally slow)
    - whether RBAC is clear over the repo structures (Nexus does OK here, except everything is repo-level AND the initial setup is very painful)
    - P2 consumption looks to be a supported format, but P2 hosting I think was nerfed after Nexus v2.11, and some clients still use it
    - RPMs are in ("yum" to Nexus), but as with repo hierarchies I'd need assurance they can be nested and will correctly produce merged repomd.xml and the like so they function properly

    Other comments:

    - having the security scanning in an open-source tool would be amazing
    - it would be very hard to get clients to trust this without either a community and review process or a company (that "can be sued") behind it. I know it's very early days, but it's a bit chicken-and-egg: if I can't use this with clients, I won't use it for anything. Not that I am a valuable customer by myself, but I influence clients' decisions, and they then need that support

    • bsgeraci 5 hours ago

      My graduate research focused on common computer security misconceptions — one of the biggest being that open source is inherently insecure. The reality is the opposite. The algorithms and systems we trust most are the ones that have been open to public scrutiny. AES was selected through an open competition where every candidate was published for the world to attack. TLS, SHA-256, RSA — none of these are secret.

      Their security comes from transparency and years of public audit, not obscurity. The same principle applies to software. I see the legal argument for wanting a vendor to sue, and I've thought about something like Canonical's model for Ubuntu — offering paid support around a free product. But I don't have years of production use behind this yet. We all start somewhere. So for now, this stays open and free for everyone to use, and for me and others to maintain.

  • imcritic 7 hours ago

    After reading the header - I had a glimmer of hope.

  • westurner 3 hours ago

    > native Swift (iOS/macOS) and Kotlin (Android) apps

    A CLI with a journal of instructions? A TUI?

  • burakemir 11 hours ago

    Thanks for sharing.

  • westurner 3 hours ago

    Notes for solvers in this space:

    Fedora recently moved to managing packages in Forgejo, a fork of Gitea and Gogs, a clone of the old GitHub UI. https://news.ycombinator.com/item?id=45670055

    Forgejo has an artifact registry for DEBs, RPMs, and APKs, and a container registry for OCI containers.

    Any type of artifact can be stored in an OCI container image registry. Any type of artifact can be signed/attested with a short-lived signing key from sigstore.dev or a self-hosted Rekor instance.

    Native container tools like bootc store host system images as OCI container images.

    From https://news.ycombinator.com/item?id=44991636 :

    > bootc-image-builder, ublue-os/image-template, ublue-os/akmods, ublue-os/toolboxes w/ quadlets and systemd

    There are streaming container standards for booting containers that haven't finished downloading yet, and container snapshot artifacts too - Seekable OCI, eStargz, Nydus: https://news.ycombinator.com/item?id=45270468

    ...

    Forgejo can mirror git repos regularly or manually.

    "Tell HN: GitHub will delete your private repo if you lose access to the original" re: `git clone --mirror` https://news.ycombinator.com/item?id=34603593

    Python Packaging User Guide > Package index mirrors and caches > Existing projects: https://packaging.python.org/en/latest/guides/index-mirrors-...

    > [ Cache, Mirror, Proxy ]

    > [ mod_cache_disk (Apache), nginx_pypi_cache, pulp-python, ]

    Pulp (Red Hat) mirrors and proxies a number of different types of packages. https://github.com/pulp

    pulp_container, pulp_ostree, pulp_ansible, pulp_rpm, pulp_deb, pulp_npm, pulp_maven, pulp_r

    pulp-operator for HA SPOF with k8s: https://github.com/pulp/pulp-operator

    From https://news.ycombinator.com/item?id=44320936 re: cosign, Sigstore, TUF, SLSA; you have to pass this to get Docker to check container image signatures:

      DOCKER_CONTENT_TRUST=1
      
    ...

    - integrate with Forgejo

    - mirror git repos

    - consider pulp's modular approach and deployment operator

    - consider OCI for future packaging formats

    - What SLSA recommends: check TUF, Sigstore, Trusted Publisher (OIDC), and GPG .asc signatures

    And then content-addressable networking might avoid some of the overhead and wasteful redundancy of checking the hash of each file in each signed package manifest.
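
    A content-addressed key makes the dedup point concrete: identical blobs always map to the same key, so a file shared by many packages is stored and verified once. A minimal Python sketch (the key format is illustrative):

```python
import hashlib

def cas_key(blob):
    """Content-address a blob by its SHA-256 digest. Identical bytes yield
    the same key regardless of which package or manifest references them."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()
```

    This is essentially what OCI registries already do for layer blobs; the open question is pushing the same idea down to individual files within packages.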