Developing with Docker

(danielquinn.org)

67 points | by bruh2 4 hours ago

63 comments

  • d_watt 3 hours ago

    I don't think I agree with this. Docker is an amazing tool, I've used it for everything I've done in the last 7 years, but this is not how I'd approach it.

    1. I think the idea of local-equal-to-prod is noble, and getting them as close as possible should be the goal, but is not possible. In the example, they're using a dockerized postgres, prod is probably a managed DB service. They're using docker compose, prod is likely ECS/K8S/DO/some other service that uses the image (with more complicated service definitions). Local is probably some VM linux kernel, prod is some other kernel. Your local dev is using mounted code, prod is probably baked in code. Maybe local is ARM64, and prod is AMD64.

    I say this not because I want to take away from the idea of matching dev and prod as much as possible, but to highlight that they're inherently going to be very different. So deploying your code with linters, or in debug mode, and getting slower container start times at best and worse production performance at worst - just to pretend that wildly different envs aren't different - seems silly. Moreover, if you test in CI, you're much more likely to get to a prod-like infra than on a laptop.

    2. Cost will also prohibit this. Do you have your APM service running on every dev node? Are you paying for that on all the developer machines, for no benefit, just so things are the same? If you're integrating with Salesforce, do you pay for a sandbox for every dev so things are the same? Again, keeping things as similar as possible should be a critical goal, but there are cost realities that again make it impossible to be perfect.

    3. In my experience if you actually want to achieve this, you need a remote dev setup. Have your code deployed in K8S / ECS / whatever with remote dev tooling in place. That way your DNS discovery is the same, kernels are the same, etc. Sometimes this is worth it, sometimes it isn't.

    I don't want to be negative, but if one of my engineers came to me saying they wanted to deploy images built from their machine, with all the dev niceties enabled, to go to prod, rather than proper CI/CD of prod optimized images, I'd have a hard time being sold on that.

    • vhiremath4 3 hours ago

      After going through a bunch of evolutions using Docker as co-founder/engineer #1 at a startup that grew to >100 engineers, hard agree on this take.

      One other reason not to bloat your images (besides physical storage cost and perf) is security. If you find yourself in a place where you need to meet enterprise security standards, keeping more dependencies in your image and linked to your app code widens your attack surface for vulnerabilities.

    • JohnMakin 3 hours ago

      > I don't want to be negative, but if one of my engineers came to me saying they wanted to deploy images built from their machine, with all the dev niceties enabled, to go to prod, rather than proper CI/CD of prod optimized images, I'd have a hard time being sold on that.

      ditto, "worked on local" is a meme for a reason.

      • timbotron 2 hours ago

        but "works on my machine" is exactly the problem docker solves -- if you still have those issues then you're not building your images right

        • JohnMakin an hour ago

          That's why the meme exists. Docker doesn't always solve this on its own, or even often. You can't just "build image to docker", expect things to go okay, and pretend you can be agnostic about the prod environment (you kind of can, but it usually requires an ops team supporting whatever assumptions your image made).

          It's been addressed in other comments but you have:

          - differences in architectures, CPU, memory

          - if the docker image has a volume attached, unless local perfectly represents prod, you're going to have issues

          - networking and firewall rules can be drastically different than a production environment and the assumptions made there

          - differences in RBAC/IAM/etc. between local and prod; the list could go on and on.

          In theory this is a nice idea; in practice, it almost never works 1:1. The common refrain is "well, just make the local/dev/sandbox exactly match prod", and my point is that this is often unrealistic to the point that it cannot/won't happen. If you can do it, good for you; I just personally have never seen it work as simply as the author describes in a system of any real complexity.

    • alexkbog 2 hours ago

      While this was true (for a long time), arguing that because remote and local are so "inherently" different one shouldn't strive for this parity is silly, especially considering the differences outlined are pretty easily solvable with k8s-to-local parity.

      While it's still clunky to dev with tools like skaffold and minikube, I strongly believe they are the future. We have essentially eliminated deployment bugs by using skaffold for local dev and deployment. Everything is caught locally, on a dev machine or in CI, as it should be.

    • timbotron 2 hours ago

      "widely different" seems like a stretch e.g. ECS is pretty directly translatable to docker compose, and if you do cross-platform builds with buildx then I don't see why doing the building locally or on a cloud service matters much.

  • threemux 3 hours ago

    This demonstrates the most pernicious thing about Docker: it is now easier than ever for someone to design a Rube Goldberg machine and then neatly sweep it under the rug.

    When you see the dev environment setup described, the knee jerk reaction should be to simplify it, not to automate the running of 30 disparate commands. Then you can much more easily run it in production, instead of boxing up the mess and waiting until you actually have to debug it.

    • stuaxo 2 hours ago

      Indeed - keeping stuff working inside AND outside docker is probably a good way of keeping a project honest.

    • hluska 2 hours ago

      There’s really nothing complicated in the docker-compose listed. It’s simple, uses some very simple commands that everyone should know and sets environment variables at build.

      We also have code reviews. They’re helpful with containers because sometimes people do silly things when they can. But that’s why we have code reviews in the first place.

    • d_watt 3 hours ago

      I mean, having a web, worker, cache, db, and search as separate things doesn't seem that crazy?

      While of course you can often get away with just a web app with sqlite, pretty much any company of scale I've been at maintains those all as separate things, with search being the most optional.

      What do you feel about the setup described seems overly complicated?

  • mpettitt 3 hours ago

    Including developer tooling in a Docker image is missing one of the really useful things about Docker: not needing all that stuff. By using a multi-stage build, you can do all the slow dev stuff at image build time, then only include the output in the final image - and that includes things like building a library that wants a different set of conditions to build than your application wants to run.
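
    A minimal sketch of that pattern, assuming a hypothetical Python app ("myapp" and the base images here are placeholders, not anything from the article):

        # build stage: compilers, headers, and dev dependencies live only here
        FROM python:3.12 AS build
        WORKDIR /app
        COPY requirements.txt .
        RUN pip install --prefix=/install -r requirements.txt
        COPY . .

        # runtime stage: only the installed output gets copied across
        FROM python:3.12-slim
        WORKDIR /app
        COPY --from=build /install /usr/local
        COPY --from=build /app /app
        CMD ["python", "-m", "myapp"]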

    It also adds an additional level of risk - if your image is compromised, but all that is running in it is your app, oh well. If it's compromised, and it's able to call out to other parts of your stack (yes, some of this is down to the specific deployment process), that's much worse.

    • Joe_Cool 2 hours ago

      That's a good idea. I usually have a 'myproject-debugtools' container that just operates on the same volume and maybe even shares a network with the 'myproject-prod' container. Just set it to `--restart no` and even when someone forgets to shut it down it'll be gone on reboot. That way all the non-prod stuff isn't even in the image at any point.

      Or if that is too much work just have a 'myproject-dev' image/tag if you need to debug a live environment.
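
      Roughly, a sketch of that sidecar (the myproject-* names are the hypothetical ones above, and nicolaka/netshoot is just one example of a generic tools image):

          # throwaway debug container sharing the app container's volumes and
          # network namespace; "--restart no" (the default) means it won't
          # come back after a reboot
          docker run -it \
            --name myproject-debugtools \
            --restart no \
            --volumes-from myproject-prod \
            --network container:myproject-prod \
            nicolaka/netshoot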

  • remram 3 hours ago

    Docker is a deployment mechanism. This means publishing Docker images is a deployment activity not a development one.

    I don't think software developers should publish Docker images at all [1]. This is a huge impedance mismatch with serious security implications. In particular, your Docker image needs a regular release cadence that is different from your software releases.

    Including a Dockerfile is fine, they allow the person doing the deployment to customize/rebuild the image as needed (and help with development and testing too).

    [1]: Though I'm not saying you can't be both a developer and sysadmin in your organization. Are you?

    • tomberek 2 hours ago

      Agreed. Packaging is different from deployment. Devs should return to the art of packaging, such that their software can then be deployed into containers, VMs, micro VMs, whatever. That is what packaging allows: re-use.

      This is the sort of behavior Nix encourages (disclaimer: I work at https://flox.dev , using Nix as our baseline tech). Docker as both a packaging and deployment format can carry a bit of weight, but can quickly get out of hand.

  • c-hendricks 3 hours ago

    I love Docker (more so the idea of containers). I use it almost everywhere: self-hosting services, and at work everything is deployed as a docker container.

    Except local development. Absolutely hate the "oh need to add a dependency, gotta rebuild everything" flow.

    I do use it if the project I'm developing against needs a DB/redis/etc, but I don't think there's a chance I'm going back to using it for local development.

    In fact, at work, the project where we do use docker in development actually causes the most headaches getting up and running.

    I use a combination of CPU architectures, so the idea of running _exactly_ what's in production when developing is already out the window.

    • f1shy 3 hours ago

      I hate containers and Docker in ANY use case where there is an alternative that is the same, or even "a little bit" more involved.

      I reserve Docker and containers for the cases where I really would have headaches without them, and I still haven't found such a case in all the work I've done.

      • zelphirkalt 2 hours ago

        One thing Docker helps with is reproducibility. If you write your images properly (not many people do), then you can have the exact same conditions every time you run tests. If you keep databases on the host machine instead of in containers, you will have to have some cleanup steps and automate them somehow so that they are always run. Otherwise you risk flaky test results or even false positives/negatives. That might be fine if the CI runs the tests reliably as well, though.
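
        For example, a throwaway test database can keep its data on a tmpfs so every run starts from a clean slate (the service name, port, and credentials below are made up, not from the article):

            # compose.test.yaml - DB state lives only in memory, so every
            # "docker compose up" starts from a known-clean database
            services:
              test-db:
                image: postgres:16
                environment:
                  POSTGRES_PASSWORD: test   # throwaway credentials, tests only
                tmpfs:
                  - /var/lib/postgresql/data
                ports:
                  - "5433:5432"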

      • mirekrusin 2 hours ago

        Containers are a great interface with other teams, i.e. black-box their services: don't care how their things are running, just communicate which envs to use to make it work according to the comms spec.

      • chambored 3 hours ago

        Everyone has different thresholds.

    • candiddevmike 3 hours ago

      Docker for local development is only useful for running services like PostgreSQL and Redis, or doing hot reloads using something like vite or air. The development-in-a-box paradigm is really difficult to maintain; I much prefer direnv or nix.

      • c-hendricks 2 hours ago

        Yup, at work we've been doing "dev machine bootstrap script installs version managers (nvm/mise/direnv/etc), projects use those" and have been experimenting with direnv + nix.

    • Topgamer7 3 hours ago

      At least for python, I typically just add another RUN statement instead of changing a file used by a layer further up the stack. That way the change is fast.

      Then when I need to commit, I'll update the requirements file or whatever would cause all the layers to rebuild. And CI can rebuild the whole thing while I move to something else.

      It is a bit of a pain, but the other benefits of containers are probably worth the trade off.
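
      A rough sketch of that layering trick (the package name is just a placeholder):

          # the expensive layer: only invalidated when requirements.txt changes
          COPY requirements.txt .
          RUN pip install -r requirements.txt

          # quick throwaway layer added while iterating; fold it back into
          # requirements.txt (and let CI do the full rebuild) before merging
          RUN pip install some-new-package==1.2.3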

  • jgauth 3 hours ago

    > What if running the linters was as easy as: $ docker compose exec web /scripts/run-linters

    This seems to ignore the fact that I also run linters in my IDE to get immediate feedback as I’m writing code. As far as I know there’s no way to combine these two approaches. Currently I’m just careful to make sure my local ruff version matches the one used in CI.

    It may be possible with VS Code dev containers, but last time I looked at those I was turned off by the complexity.

  • MzHN 36 minutes ago

    It seems I do the opposite of many commenters.

    I do not run Docker in production at all but I also do not develop any serious projects outside of Docker.

    Installed on the host machine are only VSCode, Docker and Git.

    You work on a project by cloning it, opening in VSCode and clicking on "Reopen in container".

    This will spin up generic services like databases and then the actual app container as a VSCode Remote Container, with all the development tooling inside the container.

    Does not matter if tooling changes between projects, any project can be worked on with a single click of "Reopen in container".

    Host machine stays clean.

  • seanwilson an hour ago

    > What if running the linters was as easy as:

    > $ docker compose exec web /scripts/run-linters

    What do people do for making these kinds of commands less verbose and easy to remember?

    We've done things like use a Makefile with the above behind `make lint`. However, chaining together shortcuts like "make format lint test" gets slow because the "docker compose" startup for each one takes time.
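
    A rough sketch of that kind of wrapper (run-linters is the article's script; the test target and its script are hypothetical):

        # Makefile - thin wrappers so nobody types the full compose invocation
        # (recipe lines must be indented with a literal tab)
        .PHONY: lint test

        lint:
            docker compose exec web /scripts/run-linters

        test:
            docker compose exec web /scripts/run-tests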

    If you instead run the Makefile while you have a terminal open inside one of the Docker containers, that can be faster as you can skip the "docker compose" step, but then not every Makefile target will be runnable inside a Docker container (like a target to rebuild the Docker image), so you have to awkwardly jump between terminals that are inside/outside the Docker container for different tasks? Any tricks here?

  • alphapug68 2 hours ago

    I have experimented with a local setup in our team.

    With the new Docker compose watch functionality I think it works well.

    https://docs.docker.com/compose/how-tos/file-watch/

    For me this has negated the need for manual mounting.

    I combine the above with `dotnet watch --non-interactive` in the dockerfile for dotnet and a simple `ng serve` in our Angular apps.

    If new dependencies are added via npm install you can set it so that Docker watch will auto rebuild your container. So it gets around that issue too.
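
    The relevant piece is the develop/watch section of the compose file; roughly something like this (service name and paths are illustrative):

        services:
          web:
            build: .
            develop:
              watch:
                # sync: changed source files are copied into the running container
                - action: sync
                  path: ./src
                  target: /app/src
                # rebuild: a dependency change triggers a full image rebuild
                - action: rebuild
                  path: package.json

    It's driven by `docker compose up --watch` (or `docker compose watch`).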

    I have a .bat file in our repo that runs the Docker compose action to start up all the needed services and has some powershell to wait until the main UI service is up. When it’s up it auto opens the web browser.

    I have a docker container that uses Dozzle (https://dozzle.dev/) for log monitoring across the various services. It can also stop/restart containers if needed.

    I also have a container that can be run to perform a database restore from an external Postgres DB into a local Postgres Docker container.

    I will say that dotnet debugging is clunky. You can attach to the Docker container in Visual Studio but if a hot reload has happened you can’t debug again until the app has restarted. For dotnet if I need to do some intensive debugging I tend to spin it up outside Docker for this reason.

    • hiatus 2 hours ago

      > This negates the need for manual mounting.

      The documentation you linked says it is a complement not a replacement.

      > Compose supports sharing a host directory inside service containers. Watch mode does not replace this functionality but exists as a companion specifically suited to developing in containers.

      • alphapug68 2 hours ago

        I haven’t had the need to mount anything manually for local development. It entirely replaces it for my needs. This is for a stack using some Postgres databases, dotnet, react, node.js and angular.

  • stuaxo 2 hours ago

    It sounds good, but Docker is good for all sorts of other patterns too.

    For instance: Docker is great for local development that never touches prod.

    Docker is great for hybrid local dev, where you run some services in docker but not others.

    If your desktop is Linux and you are building Python-based web services, then running Python in a virtualenv is often much more responsive than having to rebuild some docker thing.

    • johannes1234321 2 hours ago

      You don't have to rebuild docker things that often. One can mount the local source directory into a container; then one has a relatively well-defined (docker containers aren't reproducible builds themselves) runtime environment (python version, global packages, etc.) while editing happens outside the container. Especially useful if one switches between versions etc. regularly.

  • jmathai 3 hours ago

    I’ve been writing software for 20+ years. Sometimes I feel like I’m the only one who hasn’t had the problems Docker solves.

    • stackskipton 3 hours ago

      Depends on the language and deployment method. Anything Windows on .NET Framework? Sure, why use Docker?

      A C++ CSV parser? Yep, Docker generally doesn't make sense, since you just ship the executable and you're done.

      However, Python/Java/.NET where you can't be sure of the runtime? Container. Node with a ton of random packages? Container.

      Need to manage a big fleet of software? Containers enable a ton of container orchestration engines.

  • davedx 3 hours ago

    But wait, how do you write code if the services are all in docker like that?

    What about my strongly typed monorepo?

    (I prefer a variation of this where all artefacts like databases are in docker compose, but my monorepo services run outside docker)

  • lbreakjai an hour ago

    At work, we took the radically different approach of not having such a thing as a local environment. It won't necessarily work for every tech stack, but we mostly use lambdas, RDS, SQS, dynamoDB, kafka, and S3, so it's trivial to spin up and tear down the stack as we go. Essentially, instead of trying to ship the local machine to prod, we bring prod to the local machine.

    It's a breath of fresh air not having to maintain a separate local environment.

  • cornstalks 3 hours ago

    The one thing I feel like is missing in guides like this is key management. I don't like the idea of putting secret keys in my compose.yaml and I would prefer to use something more... controllable? Auditable? The thing is, I don't really know, because this isn't the kind of stuff I work on for $dayjob. But I can't help but feel like there's something missing with key management, and for a noob like me I don't know how to fit it into the larger puzzle.

    • vhiremath4 3 hours ago

      You can inject keys into the running container by passing them as environment variables during the docker run command, ideally supplied via a secrets manager.
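
      A rough sketch, assuming AWS Secrets Manager, a plain-string secret, and a made-up secret name:

          # fetch the secret at deploy/run time and pass it as an env var, so
          # it is never baked into the image or committed in a compose file
          DB_PASSWORD=$(aws secretsmanager get-secret-value \
            --secret-id myapp/db-password \
            --query SecretString \
            --output text)

          docker run -e DB_PASSWORD="$DB_PASSWORD" myapp:latest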

      • cornstalks 3 hours ago

        I understand that at a high level, but the implementation is where I get lost and where I'd love an article like this to tell me how to do it and how to deploy securely vs develop locally. Most of the guides I've seen involving a secrets manager assume you're very comfortable with Docker, but I'm still trying to figure it out and need some hand holding like this article does.

        • d_watt 3 hours ago

          I think this is mostly because that's outside docker's scope of responsibility, and docker compose (for the most part) is only a local dev tool without prod concerns.

          For deploying docker containers to production, and how to manage secrets, you'd need to look to that container orchestrator's recommendations, e.g. K8S secrets. It doesn't make too much sense to put an example of how to use production secrets in a docker guide, because those belong in a K8S/GKE/EKS/DO etc. tutorial.

          Docker's "interface" is how it accepts env variables; it's other parts of the system that need to set those variables.

      • ggregoire 3 hours ago

        You can also pass an entire .env file with the --env-file option.

        • stuaxo 2 hours ago

          I wish there was some secrets manager that would give me a per-project env file somewhere ephemeral like /run (bonus points for it disappearing when the computer is locked).

          Keeping a .env file around is still a vulnerability if a device goes missing.

        • chambored 3 hours ago

          And in the env_file attribute in your compose yaml

    • remram 3 hours ago

      You can mount a file with the secret in it. This is often recommended anyway because environment variables are inherited by linked libraries and subprocesses, making it too easy for some third-party code to leak them.
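
      Compose supports this directly; a sketch (names and paths are made up):

          services:
            web:
              image: myapp
              secrets:
                - db_password   # appears at /run/secrets/db_password in the container

          secrets:
            db_password:
              file: ./secrets/db_password.txt   # stays out of the image and the env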

  • kitd 3 hours ago

        Developer Tooling
    
        This is where I tend to run into the most pushback on this pattern but it's also the 
        part that can greatly reduce headaches. Are you ready? Your immutable image includes 
        everything you need for development: linters, tests, and debugging modules. I will 
        sometimes even include a few useful system tools like netcat or ping, as well as a 
        fancy prompt.
    
        None of these things are necessary for production. They are at best, image bloat, 
        adding anywhere from 100 to 200 MB of useless code to your image that's never used in 
        the wild. Why then, would we want to include it?
    
    Sorry, but this is dangerous advice. This won't pass most serious security audits and to use these tools, you'd likely need to be running as root.

    Much better is to strip your immutable images to the bare minimum and instantiate a debug sidecar, e.g. [1], if you need to peer inside.

    [1] - https://github.com/mhoyer/docker-swiss-army-knife

    • chambored 3 hours ago

      I agree with this. For multi-OS dev teams, I’ve set up separate compose files or Dockerfiles for dev and prod. I kept them as similar as possible while optimizing the images for prod and including the niceties for dev.

  • brendanjbond 3 hours ago

    This is all very good and true, but as usual the devil is in the details. For instance, my company sells Docker images that depend on a very old and recently unmaintained binary. Over the years, I've found issues with that binary that make it very hard to be sure issues are completely reproducible from system to system (or, as the article suggests, from local to production). Sometimes it's as simple as a newer base image updating a core dependency (e.g. Alpine updating musl), but other times it seems like nothing changes but the host machine, and diagnosing kernel-level issues - say, your local macOS LinuxKit kernel versus your production Amazon Linux or Ubuntu, and don't forget x86 emulation! - makes "test what you develop and deploy what you test" occasionally very daunting.

    • jpgleeson 3 hours ago

      These are the sort of issues that Nix <https://nixos.org/> solves quite well. It pins dependencies to specific versions, so the only time dependencies change is when you explicitly change them - and the only packages present in your images are the ones you specifically request, or dependencies of those packages. It also gives you local dev environments using the ~same dependencies by typing `nix develop`.

      Once you get past the bear that is the language, it's a great tool.

      • chambored 3 hours ago

        I found setting up nix shells to be more time consuming than docker setups. Nixpkgs can require additional digging to find the correct dependencies that just work on other distributions. That being said, I’m a huge fan of NixOS, but I haven’t seen it as a replacement for docker for reproducible dev environments yet.

    • yjftsjthsd-h 3 hours ago

      I'll grant that the kernel version+config shifting is a pain point, but I'd expect that containers help with the rest of it (userspace)? Yes, obviously changing the base image is a potential breaking change, but with containers you package up the ancient binary and the base image and any dependencies into a single unit, and then you can test that that whole unit works (including "did that last musl upgrade break the thing?"), and if it passes then you ship the whole image out to your users safe in the knowledge that the application will only be exposed to the libraries you tested it against and no newer versions.

    • stackskipton 3 hours ago

      Sounds like y'all are doing a poor job building the container. It's one thing to rely on the built-in musl/glibc if it's modern software. However, if you are dragging technical debt, all those dependencies should be hard-locked to the proper version.

  • trevor-e 3 hours ago

    As someone new to Docker, the thing that is never answered (including in this post, the irony of the title...) is what an actual dev workflow looks like.

    Let's say I'm working on a web app. I start my container, awesome. Now I make a code change. What do I do? Since the code is copied into the container, do I have to stop, rebuild the image, and start again? Or does it automagically rebuild like most frameworks support? And how about debugging: are there any hoops to jump through connecting a debugger to a container? And what about the filesystem? If I need to inspect the output of something, say a log file in the container, is that easy to do? I've tried this before and got pretty lost trying to access the filesystem.

    None of these questions are obvious to Docker beginners like myself. We get the whole "consistent environment" benefits of Docker, now talk about a practical workflow please. :)

    edit: thanks for the answers!

    • horsawlarway 3 hours ago

      Generally speaking - you can just mount the local directory in the container (this is particularly easy with docker compose) during development.

      https://docs.docker.com/engine/storage/bind-mounts/

      Then there's no need to rebuild or restart your container while developing.

      For debugging, it depends a bit on the tool - anything that expects to interact through a network port is very straightforward (just expose the port locally, and run the tool as usual). If it needs more than that, it can be more complex.
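
      In compose terms that's roughly the following (service name, paths, and port are placeholders):

          services:
            web:
              build: .
              volumes:
                - ./src:/app/src   # host source mounted over the container's copy
              ports:
                - "8000:8000"      # app / debugger port exposed to the host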

    • chc4 3 hours ago

      You can mount volumes in your Docker containers: what I usually do is mount my host source directory over the container's source directory, so that changes are immediately visible inside the container, and then hot reloading in e.g. Flask works exactly like you expect.

    • stackskipton 3 hours ago

      If you are using VSCode, you can run VSCode inside a container. Otherwise, most people run the code environment outside a container.

      Also, read 12 Factor: logging to a file is wrong. Logs go to standard out.

      • trevor-e 3 hours ago

        Thanks for the 12 Factor tip, makes sense.

  • JohnMakin 3 hours ago

    > At it's simplest, stuff like this: if ENVIRONMENT == "prod": do_something_only_production_does() ...shouldn't happen.

    This is a common refrain among people that IMO do not have a lot of experience in big, complex systems, especially ones running a lot of legacy code. Like, ideally, sure, but in reality making this possible almost always involves extra cost, time, and complexity, and what do you gain from that, really? It's a pretty concept, but not at all practical.

    • stuaxo an hour ago

      The Django equivalent is having different settings for prod vs dev, and for good reason: there are things I run in dev, like the Django Debug Toolbar, that I just wouldn't run in prod.

      • JohnMakin an hour ago

        Fintech is laden with stuff like this. Often test environments vary drastically from prod by necessity, or because, due to compliance and regulation, things need to be done very differently in a production environment.

  • revel 3 hours ago

    That last point about the differences between dev, test, and prod should be right at the top. It's rare to find teams that have set themselves up for success (for reasons I do not fully understand).

  • ewuhic 3 hours ago

    >but importantly, the image at each of these stages is exactly the same: reproducible at every turn

    This is wrong. Didn't read any further.

    Regards,

    NixOS adept

    • marliechiller 2 hours ago

      Arise NixOS acolytes, arise!

      All jokes aside, to expand on this for others: you can pin a lot of dependencies, but as soon as you start your dockerfile with a FROM, you're hiding a bunch of dependencies you no longer control behind that command. An example would be:

      FROM tiangolo/uvicorn-gunicorn-fastapi

      If tiangolo decides to update anything in the base image defined here, your build is now different from how it was before, without you really knowing.
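
      One common mitigation (a general practice, not something from the parent comment) is pinning the base image by digest so the build only changes when you deliberately bump it:

          # resolve the tag you use today to an immutable digest...
          docker pull tiangolo/uvicorn-gunicorn-fastapi:latest
          docker inspect --format '{{index .RepoDigests 0}}' tiangolo/uvicorn-gunicorn-fastapi:latest

          # ...then pin that digest in the Dockerfile:
          # FROM tiangolo/uvicorn-gunicorn-fastapi@sha256:<digest from above>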

  • hluska 3 hours ago

    I find this topic very interesting. On one hand, I am sure that every person reading this has run into a ‘bug’ introduced because of a dev/prod mismatch. On the other hand, the only way to get a total match is to either pay an obscene amount of money or roll everything yourself (which will also have cost, much of which can never be recovered).

    As an example, I can build something, deploy it on my own and create a near one to one match. But that means building everything and never using a managed service. If the application interfaces with another tool like Salesforce, do we have multiple instances for every single developer?

    Or, do we roll our own CRM?

    Matching is great but in a managed world it’s very expensive.

  • daft_pink 3 hours ago

    Thank you. This is a great resource!