Show HN: Building better base images

(github.com)

38 points | by akrylov 3 months ago

15 comments

mubou 3 months ago

I'm not really understanding what this does specifically. It looks like it creates the filesystem on the host machine using chroot and then tarballs it?

Is there an advantage to that over combining layers and using cache mounts to avoid those redundant downloads?
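For reference, the alternative being asked about here looks roughly like the following. This is a minimal sketch, assuming BuildKit is enabled and a Debian-based image; the base tag and the package names are placeholders, not something taken from the linked project.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# The official Debian image ships an apt hook (docker-clean) that deletes
# downloaded .deb files after each install. Remove it and tell apt to keep
# downloads, otherwise the cache mounts below stay empty.
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
 && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache

# A single RUN produces a single layer. The cache mounts persist between
# builds on the same builder but are never written into the image.
# "ca-certificates" and "curl" are placeholder packages.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl
```

The cache lives with the builder, so a fresh CI worker starts with an empty cache unless it is exported and imported separately.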

A side-by-side comparison of dive's output would be helpful (https://github.com/wagoodman/dive).

Also can you clarify what you mean by "requiring full rebuilds of all previous steps"?

akrylov 3 months ago

It’s basically just a fancy bash script (mkimage.sh) plus Makefiles that call the scripts with different sets of parameters. The process is exactly the one used to create base Docker images: chroot, then use a package manager (apt or yum) to install packages inside the chroot jail. That is how the ubi9 and debian slim base images are made. With this tool you can extend the process: install dependencies, run security checks, and sign the result, all in one go. It’s easy to extend, so you can, for example, create base images for Kafka with different Java distributions, which is very useful for testing and performance tuning.
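Once the root filesystem has been built on the host, the Docker side of a chroot-then-tarball flow can be very small. A sketch, assuming a tarball named rootfs.tar.xz (a hypothetical name) already sits next to the Dockerfile and contains bash; it is not taken from the project's repository.

```dockerfile
# Start from the empty image: nothing is inherited from a parent.
FROM scratch

# ADD unpacks a local tar archive (gzip, bzip2 or xz) into the target
# directory, so the whole root filesystem lands in one layer.
# "rootfs.tar.xz" is a placeholder for whatever the host-side script produced.
ADD rootfs.tar.xz /

# Assumes the rootfs contains bash; adjust to whatever shell is installed.
CMD ["bash"]
```

An equivalent route without a Dockerfile is to pipe the tarball into `docker import`, which also yields a single-layer image.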

Imagine you work at a large org and you want to control all images used for CI/CD workers. Instead of scattering that across different Dockerfiles and scripts (Java, NodeJS, Python, etc.), you can just use a single tool. At least that was why I built it in the first place.

mrbluecoat 3 months ago

I'm similarly curious why not just use Alpine or Void rootfs if container size is important?

akrylov 3 months ago

For the same reason hyperscalers build and maintain their own distros and base images: to have complete control over the supply chain.

mathfailure 3 months ago

If the idea was to merge different layers - why not do something like this instead?

    FROM your_image AS initial
    FROM scratch
    COPY --from=initial / /
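One caveat with this flattening approach is that only files are copied; the image configuration of the first stage is not. A sketch of what usually has to be restated, where the PATH value and the command are placeholders that should mirror whatever your_image actually declares:

```dockerfile
FROM your_image AS initial

FROM scratch
# Copies every file from the first stage into a single new layer.
COPY --from=initial / /

# ENV, WORKDIR, USER, ENTRYPOINT and CMD are image metadata, not files,
# so they are not carried over by COPY and must be declared again.
# The values below are placeholders, not taken from any real image.
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
CMD ["bash"]
```

Running `docker inspect` on the original image shows which settings need to be carried across.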

gkfasdfasdf 3 months ago

I guess one advantage of the author's approach is that any apt-get calls etc. made while building the initial image can reuse the host package cache.

anotherhue 3 months ago

If this tickles your fancy, may I also suggest trying Nix to build Docker images?

Personally I've soured on the Dockerfile approach as it feels like we're just shuffling bytes around rather than composing something.

https://nix.dev/tutorials/nixos/building-and-running-docker-...

numbsafari 3 months ago

I have completely soured on Dockerfiles. I view them as anathema.

The supposed "caching" of layers really doesn't work in practice unless you add a bunch of other infrastructure and third-party tooling to your build process. Getting truly incremental and reproducible layers into your build process is non-trivial, and the Dockerfile approach fails to take advantage of that work once you've done it.

onedognight 3 months ago

You need to start with the right base. Here’s a container-first 100%-reproducible from-scratch base to build on.

[0] https://stagex.tools/

sepositus 3 months ago

A surprising downside to Nix containers is that a majority of packages are not optimized for containers. For example, try adding a dependency on `git` and see how much the container grows. Granted, the good packages (like git) allow customization, but it requires really digging into the code. Some packages just straight up ship with a ton of bloat, and the only thing you can do is basically fork and maintain them yourself.

max-privatevoid 3 months ago

It's a problem with nixpkgs. It would be cool to have an Alpine-like alternative package set focused on minimal package size.

pxc 3 months ago

There is, isn't there? That's what `pkgsStatic` in Nixpkgs is: statically compiled packages with small closures, built with musl, just like Alpine.

Rucadi 3 months ago

You could try statically linking them if the package supports it; this is done by building against musl.

nix build github:NixOS/nixpkgs#pkgsStatic.git

This returns the package as:

    $ ls -lah git
    -r-xr-xr-x 1 rucadi rucadi 5.1M Jan 1 1970 git
    $ ldd git
    not a dynamic executable

So you don't really need to grow the container.

sepositus 3 months ago

Yeah, it's a problem on a package-per-package basis. My point isn't how to solve the git problem but that the experience can vary wildly depending on the package. It can be surprising and often comes at the expense of time trying to navigate the insanity that is nixpkgs :)

akrylov 3 months ago

Nix is cool, but with Nix one needs to know Nix. Personally, I prefer just using scripting languages. LLMs have made code cheaper, but debugging has become expensive.