Show HN: Spec27 – Spec-driven validation for AI agents

(spec27.ai)

13 points | by njyx 7 hours ago

14 comments

Aniloid2 5 hours ago

Hey, I’m Brian from Research at Spec27. I’ve been working on some of the adversarial robustness techniques in the backend and am currently working on the multi-turn extension. I’d be happy to talk about what I’ve learned and hear any suggestions!

eloycoto 6 hours ago

I really like the judge from here: https://docs.spec27.ai/docs/guides/judges

I didn't see any example of the full flow. Do you have anything that I can see/explore?

njyx 6 hours ago

Thanks for the feedback and question @eloycoto - there's a Loom video here: https://www.loom.com/share/727528de450a48d29a2ac20b279e26fc, and in the system itself, you can grab an example project from the registry.

There are out-of-the-box judges, and you can customize them for each spec if you are testing something specific.

eloycoto 6 hours ago

Ohh, crazy good! I'll try it this weekend and keep you posted. Thanks!

_mikz 7 hours ago

Hey! Michal from the engineering team behind Spec27 here. There were some painful experiences along the way: async in Django, background processing in Python, and scaling agent workflows with a growing codebase. Happy to talk!

njyx 6 hours ago

Also, GitHub CLI budgets exploding :-)

jovanca_ 6 hours ago

Hi! Jovanca from the Spec27 team here. We started building this because agent safety/validation still feels pretty undercooked in practice. Interested in how people here think about it :D

chesh 6 hours ago

I get so mad when chat agents hallucinate in their responses. If this can rebuild trust in the results, I will give Spec27 a try.

njyx 5 hours ago

Assuming you know when they hallucinate?

owlyvision 4 minutes ago

[flagged]

neowalter 5 hours ago

[dead]

potter098 6 hours ago

[flagged]

5 hours ago

[deleted]

OlaDagil 5 hours ago

[dead]