I took a different approach where I used PreToolUse hooks that screen bash commands before execution (blocking pushes to main, flagging destructive operations). It is less comprehensive than OS-level containment but lighter weight and catches the most common risks without giving up the approval flow entirely.
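For anyone curious, here is a minimal sketch of what such a hook can look like. The patterns and reasons below are illustrative, not my actual deny list, and it assumes the hook runner passes the tool call as JSON (with a `tool_input.command` field) on stdin and treats a non-zero exit as "block":

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: screen Bash commands before execution.

Illustrative only -- the patterns are examples, not a complete deny
list. Assumes the hook receives JSON like
{"tool_name": "Bash", "tool_input": {"command": "..."}} on stdin and
that exiting non-zero blocks the tool call.
"""
import json
import re
import sys

# Commands that should never run without a human in the loop.
BLOCKED = [
    (r"git\s+push\s+(\S+\s+)?(main|master)\b", "push to main/master"),
    (r"\brm\s+-[a-z]*r[a-z]*f", "recursive force delete"),
    (r"\bgit\s+reset\s+--hard\b", "hard reset"),
]

def screen(command: str):
    """Return a human-readable reason to block, or None to allow."""
    for pattern, reason in BLOCKED:
        if re.search(pattern, command):
            return reason
    return None

def main():
    # Entry point when invoked by the hook runner.
    payload = json.load(sys.stdin)
    command = payload.get("tool_input", {}).get("command", "")
    reason = screen(command)
    if reason is not None:
        print(f"blocked: {reason}", file=sys.stderr)
        sys.exit(2)  # non-zero exit -> the tool call is rejected
```

You'd register a script like this as a PreToolUse hook matching the Bash tool in your agent's settings; the exact wiring depends on the agent.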
Honestly, the barrage of permission requests is still a real bottleneck for me, so I am intrigued by what you have built.
Interesting that you formally verified the setup ordering. What did the three bugs look like?
I'd be interested to learn what you think when you try hazmat.
Running TLA+ via coding agents became a productivity hack for me to quickly catch invariant breaks, especially in concurrency-sensitive settings. LLM-based coding agents improved to the point of trivializing the process of encoding and running these models.
For this project, I was really concerned with the correctness of the init/rollback logic, since it deals with users and groups and it's easy to make life uncomfortable if you implement it incorrectly. So I decided to formalize the migration paths for configurations set up with 'hazmat init' and reverted with 'hazmat rollback', which effectively returns the system to the state it was in before running hazmat. I had a workspace concept in 0.1.0 that was removed in 0.2.0, and I wanted to check that the migration doesn't leave the system in a weird state. Then I reused and updated this migration model for 0.2.0 -> 0.3.0 and so on.
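The shape of the property is simple. As a toy Python analogue of that kind of TLA+ model (the step names here are made up, not hazmat's actual internals), it's exhaustive exploration of every point where init can be interrupted, with the invariant that rollback always restores the pre-init state:

```python
"""Toy analogue of a TLA+ migration model (illustrative step names,
not hazmat's real internals). Property checked: no matter where
'hazmat init' is interrupted, 'hazmat rollback' returns the system
to its pre-init state."""

INITIAL = frozenset()  # pristine system: no hazmat artifacts present
INIT_STEPS = ["create_group", "create_user", "write_config"]

def run_init(upto):
    """Apply the first `upto` init steps (simulating an interruption)."""
    state = set(INITIAL)
    for step in INIT_STEPS[:upto]:
        state.add(step)
    return state

def run_rollback(state):
    """Remove every artifact init may have created, in reverse order."""
    for step in reversed(INIT_STEPS):
        state.discard(step)  # discard, not remove: step may never have run
    return state

def check_invariant():
    """Explore all interruption points; return the ones that violate
    the invariant (rollback must restore the initial state)."""
    violations = []
    for upto in range(len(INIT_STEPS) + 1):
        if run_rollback(run_init(upto)) != INITIAL:
            violations.append(upto)
    return violations
```

The real spec does the same thing over a much richer state space (users, groups, config files, and the workspace migration), which is exactly where TLC earns its keep over hand-written tests.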
Also, I found multiple issues with the seatbelt policy generation, a flaw that would allow launching without a firewall, and a nasty cloud-restore bug for S3 snapshots.