31 comments

  • KTibow 15 hours ago

    Without any changes, you can already use Codex with a remote or local API by setting base URL and key environment variables.

    • kingo55 14 hours ago

      Does it work for local though? It's my understanding this is still missing.

      • KTibow 14 hours ago

        It does, as long as your favorite LLM inference program can serve a Chat Completions API.
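
        Something like this, for example - just a sketch with the OpenAI Python SDK pointed at Ollama's OpenAI-compatible endpoint (the base URL, dummy key, and model tag are placeholders; the Codex CLI itself would pick up the equivalent values from environment variables):

            from openai import OpenAI

            # Ollama's OpenAI-compatible endpoint; the key is required but ignored.
            client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

            resp = client.chat.completions.create(
                model="phi4:latest",  # whatever your local server has pulled
                messages=[{"role": "user", "content": "Say hello in one word."}],
            )
            print(resp.choices[0].message.content)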

        • codingmoh 13 hours ago

          Thanks for bringing that up - it's exactly why I approached it this way from the start.

          Technically you can use the original Codex CLI with a local LLM, as long as your inference provider implements the OpenAI Chat Completions API, including function calling.

          But based on what I had in mind - the idea that small models can be really useful if optimized for very specific use cases - I figured the current architecture of Codex CLI wasn't the best fit for that. So instead of forking it, I started from scratch.

          Here's the rough thinking behind it:

             1. You still have to manually set up and run your own inference server (e.g., Ollama, LM Studio, vLLM).
             2. You need to ensure that the model you choose works well with Codex's pre-defined prompt setup and configuration.
             3. Prompting patterns for small open-source models (like phi-4-mini) often need to be very different - they don't generalize as well.
             4. The function calling format (or structured output) might not even be supported by your local inference provider.
          
          Codex CLI's implementation and prompts seem tailored for a specific class of hosted, large-scale models (e.g. GPT, Gemini, Grok). But if you want to get good results with small, local models, everything - prompting, reasoning chains, output structure - often needs to be different.

          So I built this with a few assumptions in mind:

             - Write the tool specifically to run _locally_ out of the box, no inference API server required.
             - Use the model directly (currently phi-4-mini via llama-cpp-python) - see the sketch below.
             - Optimize the prompt and execution logic _per model_ to get the best performance.
          
          Instead of forcing small models into a system meant for large, general-purpose APIs, I wanted to explore a local-first, model-specific alternative that's easy to install and extend — and free to run.
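
          To make the "use the model directly" point concrete, here's roughly what that looks like with llama-cpp-python - only a sketch with a made-up GGUF path and prompt, not the exact code in open-codex:

              from llama_cpp import Llama

              # Hypothetical path to a quantized phi-4-mini GGUF; adjust to your setup.
              llm = Llama(model_path="./models/phi-4-mini-q4_k_m.gguf", n_ctx=4096, verbose=False)

              # Small models tend to need short, rigid, model-specific instructions
              # rather than a long general-purpose system prompt.
              messages = [
                  {"role": "system",
                   "content": "You are a shell assistant. Reply with exactly one shell command, nothing else."},
                  {"role": "user", "content": "Find all Python files modified in the last day."},
              ]

              out = llm.create_chat_completion(messages=messages, temperature=0.2, max_tokens=64)
              print(out["choices"][0]["message"]["content"])
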
    • asadm 15 hours ago

      i think this was made before that PR was merged into codex.

      • KTibow 14 hours ago

        Good correction - while the SDK used has supported changing the API through environment variables for a long time, Codex only recently added Chat Completions support.

      • xiphias2 13 hours ago

        Maybe it was part of the reason they accepted the PR. The fork would have happened anyway if they didn't allow other LLMs.

        A bit like how Android came after the iPhone with an open-source implementation.

  • submeta 2 hours ago

    Sounds great! Although I would prefer Claude Code to be open sourced, as it's a tool that works best for vibe coding - albeit expensive when using Anthropic's models via the API. There is an unofficial clone ("Anon Kode"), but it's not legitimate.

  • fcap an hour ago

    Why fork and use open-codex when OpenAI has opened the original up to multiple models? Just trying to understand.

  • underlines 2 hours ago

    Don't forget https://ollama.com/library/deepcoder which ranks really well for its size

  • xyproto 15 hours ago

    This is very convenient and nice! But I could not get it to work with the best small models available for Ollama for programming, like https://ollama.com/MFDoom/deepseek-coder-v2-tool-calling for example.

    • codingmoh 14 hours ago

      Thanks so much!

      Was the model too big to run locally?

      That’s one of the reasons I went with phi-4-mini - surprisingly high quality for its size and speed. It handled multi-step reasoning, math, structured data extraction, and code pretty well, all on modest hardware. Quantized Phi-1.5 / Phi-2 builds even run on a Raspberry Pi, as others have demonstrated.

      • xyproto 4 hours ago

        The models work fine with "ollama run" locally.

        When trying out "phi4" locally with:

        open-codex --provider ollama --full-auto --project-doc README.md --model phi4:latest

        I get this error:

            OpenAI rejected the request. Error details: Status: 400, Code: unknown, Type: api_error,
            Message: 400 registry.ollama.ai/library/phi4:latest does not support tools.
            Please verify your settings and try again.
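
        For reference, the same rejection can be reproduced outside open-codex with a minimal tools request against Ollama's OpenAI-compatible endpoint. The base URL, key, and tool schema below are placeholders; a model whose Ollama template includes tool support should accept the same request.

            from openai import OpenAI

            client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

            # A single illustrative tool definition, similar in shape to what a
            # coding agent would send for shell execution.
            tools = [{
                "type": "function",
                "function": {
                    "name": "run_shell",
                    "description": "Run a shell command",
                    "parameters": {
                        "type": "object",
                        "properties": {"cmd": {"type": "string"}},
                        "required": ["cmd"],
                    },
                },
            }]

            try:
                client.chat.completions.create(
                    model="phi4:latest",
                    messages=[{"role": "user", "content": "List the files here."}],
                    tools=tools,
                )
                print("tools accepted")
            except Exception as e:
                print("tools rejected:", e)  # e.g. the 400 "does not support tools" above
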
    • smcleod 14 hours ago

      That's a really old model now. Even the old Qwen 2.5 Coder 32B model is better than DSv2.

  • vincent0405 7 hours ago

    Cool project! It's awesome to see someone taking on the challenge of a fully local Codex alternative.

  • strangescript 16 hours ago

    curious why you went with Phi as the default model - that seems a bit unusual compared to current trends

    • codingmoh 15 hours ago

      I went with Phi as the default model because, after some testing, I was honestly surprised by how high the quality was relative to its size and speed. The responses felt better on some reasoning tasks, while running on way less hardware.

      What really convinced me, though, was the focus on the kinds of tasks I actually care about: multi-step reasoning, math, structured data extraction, and code understanding. There’s a great Microsoft paper on this, "Textbooks Are All You Need", and solid follow-ups with Phi‑2 and Phi‑3.

    • jasonjmcghee 15 hours ago

      agreed - I thought qwen2.5-coder was kind of the standard non-reasoning small line of coding models right now

      • codingmoh 15 hours ago

        I saw pretty good reasoning quality with phi-4-mini. But alright - I’ll still run some tests with qwen2.5-coder and plan to add support for it next. Would be great to compare them side by side in practical shell tasks. Thanks so much for the pointer!

  • user_4028b09 8 hours ago

    Great work making Codex easily accessible with open-source LLMs – really excited to try it!

  • siva7 15 hours ago

    At least it can't be worse than the original codex using o4-mini.

    • codingmoh 15 hours ago

      fair jab - haha; if we’re gonna go small, might as well go fully local and open. At least with phi-4-mini you don’t need an API key, and you can tweak/replace the model easily

  • ai-christianson 13 hours ago

    > So I rewrote the whole thing from scratch using Python

    So this isn't really codex then?

  • shmoogy 14 hours ago

    Codex merged in to allow multiple providers today - https://github.com/openai/codex/pull/247

    • bravura 12 hours ago

      Sorry, does that mean I can use Anthropic and Gemini models with Codex? And switch during the session?