With all of the talk about "LLMs are non-deterministic," I am building my own tool (at this exact moment, since I seem to have gotten really good at this) which does some of what these OpenClaw/Clawdbots do for people, but instead of having LLMs/agents handle everything, I push as much of it as possible into pure deterministic code. I test something once, like a daily automation, and it repeats exactly the same way at the next scheduled date/time. Also, anytime AI models are used, I first approve/analyze a Docker/Podman/microVM container to make sure it has only EXACTLY what it needs for the job, and then that is saved for the future. I approve it once, and it is there for future use.
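The approve-once-then-reuse idea can be sketched as a small approval cache keyed on a hash of the container spec. This is a minimal sketch, not the author's actual tool: the manifest fields, the `approved_containers.json` cache file, and the function names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path

APPROVED_DB = Path("approved_containers.json")  # hypothetical cache file

def manifest_digest(manifest: dict) -> str:
    """Stable hash of a container spec (image, network, mounts)."""
    canonical = json.dumps(manifest, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def is_approved(manifest: dict) -> bool:
    """True only if this exact spec was already human-approved."""
    if not APPROVED_DB.exists():
        return False
    approved = json.loads(APPROVED_DB.read_text())
    return manifest_digest(manifest) in approved

def approve(manifest: dict) -> None:
    """Record a reviewed spec so identical future runs skip review."""
    approved = json.loads(APPROVED_DB.read_text()) if APPROVED_DB.exists() else {}
    approved[manifest_digest(manifest)] = manifest
    APPROVED_DB.write_text(json.dumps(approved, indent=2))

# A locked-down spec: no network, read-only mount of just the job dir.
spec = {"image": "python:3.12-slim", "network": "none", "mounts": ["/job:ro"]}
if not is_approved(spec):
    # ... a human reviews the spec here, once ...
    approve(spec)
```

Any change to the spec (a new mount, network access) produces a different digest, so it falls out of the cache and forces a fresh review.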
Also, I happen to know a few useful things from a year-plus of working with all of these models and tools daily, usually at least 8, 12, sometimes 15 hours a day (I cannot stop, I love it, and I am good at it). I just noticed someone finally posted what I figured out a year ago: when you do need the help of an LLM, and you want it to be as safe and as deterministic as it can be, use the biggest, smartest models for the "brains," but for ALL the agentic stuff (the tool use, the MCP servers, etc.) use a smaller model. Current best models for that: MiniMax M2.5 and Gemini 3.1 Flash (use Pro for planning, fixing, problem solving, all the difficult stuff, but definitely use the smaller Flash for the agentic tool use). It's weird that I have known this for so long and it has barely caught on yet. I did have some AI guides that went viral on HN months back: wuu73.org/aiguide or /aicp (free tool, well loved). I should write more about all this stuff I've figured out so I am not dumping paragraphs on so many posts over and over like this. If my tool ends up useful I will put it on my site somewhere in the AI tools section; if you put your email in the RSS thing, I will try to post anything useful there and update the RSS.
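The big-brain/small-hands split described above can be sketched as a two-tier router: one model plans, a cheaper one executes each tool step. The model names are the ones from the post; `call_llm` is a hypothetical stand-in for whatever API client you actually use.

```python
# Hypothetical two-tier router. The expensive model does planning and
# hard reasoning; the small model does the high-volume agentic calls.
PLANNER = "gemini-3.1-pro"     # planning, debugging, problem solving
EXECUTOR = "gemini-3.1-flash"  # tool use, MCP calls, file edits

def call_llm(model: str, prompt: str) -> str:
    """Stand-in for a real API client; returns a tagged dummy response."""
    return f"[{model}] response to: {prompt}"

def run_task(task: str) -> list[str]:
    # One expensive call to produce the plan...
    plan = call_llm(PLANNER, f"Break this into concrete tool steps: {task}")
    steps = [line for line in plan.splitlines() if line.strip()]
    # ...then the small, fast model executes every individual step.
    return [call_llm(EXECUTOR, step) for step in steps]

results = run_task("rename all .txt files in ./inbox to include today's date")
```

The point of the split is cost and safety: the executor model sees narrow, concrete instructions rather than the whole problem, so there is less room for it to improvise.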
The real challenge isn't extraction but maintaining context across thousands of threads — most embedding approaches lose the temporal and conversational structure that makes email searchable. This covers it well: https://www.youtube.com/watch?v=wuBOrL3WGwI
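One common way to keep that temporal and conversational structure is to prepend thread metadata to each chunk before embedding it, so the vector encodes position-in-thread and time, not just the body text. A minimal sketch; the field names and format string are assumptions, and the actual embedding call is left out.

```python
from dataclasses import dataclass

@dataclass
class EmailChunk:
    thread_id: str
    sender: str
    sent_at: str   # ISO 8601 timestamp
    position: int  # index of this message within its thread
    body: str

def contextualize(chunk: EmailChunk) -> str:
    """Prepend thread/temporal metadata so the embedding retains
    conversational structure instead of only the raw body text."""
    return (
        f"thread={chunk.thread_id} msg#{chunk.position} "
        f"from={chunk.sender} at={chunk.sent_at}\n{chunk.body}"
    )

c = EmailChunk("t-123", "alice@example.com", "2025-03-01T09:30:00", 2,
               "Sounds good, let's ship Friday.")
text_for_embedding = contextualize(c)  # this string is what gets embedded
```

Storing `thread_id` and `sent_at` as filterable metadata alongside the vector helps too, since you can then scope a search to one thread or a date range instead of relying on the embedding alone.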
... also, instead of just using all this stuff out there, like all the OpenClaw stuff and memory systems, what I am doing (so that I actually understand it) is creating my own from scratch; I have not even read about how that one works. I spin up 10x `copilot -p "idea"` in the new automate mode they just added, in a Docker container, plus 5x Gemini CLI on Pro 3.1 and 5x on Flash 3.1, with GPT 5.4 in there somewhere along with 5.3 Codex, o3, and o4-mini (crappy agentic models, BUT that is okay, because often that means they have some more intelligence that the more agentic models lack). I scan those with my brain, then run all the passing ones through a Best-of-N scenario to compile a best/better version. I have a whole system that works well: certain things will cause all these models to respond differently, while other things will get them to mostly respond the same, which means something useful.
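The agreement signal at the end of that workflow can be sketched as greedy clustering of the N outputs: group near-duplicate answers, and treat the biggest cluster as the consensus. This is a simplified sketch using stdlib string similarity, not the author's actual system; the threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher

def similar(a: str, b: str, threshold: float = 0.7) -> bool:
    """Rough string similarity; 0.7 is an arbitrary cutoff."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

def consensus(outputs: list[str]) -> tuple[str, int]:
    """Greedily cluster N model outputs by similarity and return a
    representative of the largest cluster plus its vote count.
    Agreement across diverse models is the 'means something' signal."""
    clusters: list[list[str]] = []
    for out in outputs:
        for cluster in clusters:
            if similar(out, cluster[0]):
                cluster.append(out)
                break
        else:
            clusters.append([out])  # no match: start a new cluster
    best = max(clusters, key=len)
    return best[0], len(best)

answers = [
    "Use a cron job with flock to prevent overlap.",
    "Use a cron job with flock so runs don't overlap.",
    "Rewrite everything in Rust.",
]
rep, votes = consensus(answers)  # two models agree, one is an outlier
```

For real model outputs you would likely compare embeddings or normalized answers rather than raw strings, but the shape of the idea, outlier answers land in singleton clusters while convergent ones stack up, is the same.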