
  • senza1dio 11 hours ago

    I built a tool that generates deterministic SFT + DPO datasets for tool-calling LoRA fine-tuning (no LLM needed)

    I was tired of hand-writing JSONL for my Qwen fine-tunes, so I built DataForge. It's a Python framework that generates structured training data from tool schemas — completely deterministic, no API calls needed.

    What it does:

    You define tool schemas (JSON) + data pools, and it generates:

    - SFT conversations with tool calls
    - DPO preference pairs from contrastive ranking
    - Anti-template explosion detection (Bloom filter + trigram analysis)
    - Quality gates (configurable thresholds, not vibes)
    - Streaming generation, constant RAM — tested up to 100K examples

    Output: OpenAI/ShareGPT/ChatML format, ready for trl or axolotl.

    Two working examples included (restaurant assistant, customer support) — ~600 SFT + 60 DPO each, runnable out of the box.
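    To make the "Bloom filter + trigram analysis" idea concrete, here is a minimal sketch of how that kind of template-explosion check might work. Everything here (`BloomFilter`, `looks_templated`, the 0.8 threshold) is my own illustration of the general technique, not DataForge's actual implementation:

    ```python
    import hashlib

    def trigrams(text):
        """Character trigrams of a whitespace-normalized, lowercased string."""
        t = " ".join(text.lower().split())
        return {t[i:i + 3] for i in range(len(t) - 2)}

    class BloomFilter:
        """Minimal Bloom filter: k salted SHA-1 hashes over a fixed bit array."""
        def __init__(self, size_bits=1 << 20, k=4):
            self.size = size_bits
            self.k = k
            self.bits = bytearray(size_bits // 8)

        def _positions(self, item):
            for salt in range(self.k):
                h = hashlib.sha1(f"{salt}:{item}".encode()).digest()
                yield int.from_bytes(h[:8], "big") % self.size

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] >> (p % 8) & 1
                       for p in self._positions(item))

    def looks_templated(example, bloom, threshold=0.8):
        """Flag an example whose trigrams mostly repeat ones already seen.

        A Bloom filter never gives false negatives, so a high hit rate
        means the surface form is near-identical to earlier examples.
        """
        grams = trigrams(example)
        if not grams:
            return False
        hit_rate = sum(1 for g in grams if g in bloom) / len(grams)
        for g in grams:
            bloom.add(g)
        return hit_rate >= threshold
    ```

    First sight of a string registers its trigrams and passes; a near-verbatim repeat trips the threshold, which is the behavior you want when a generator starts emitting the same template with trivial slot swaps.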

    pip install -e . → dataforge generate --config config.yaml → dataset ready.
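    For a sense of what the config-driven flow looks like, here is a rough sketch of a `config.yaml`. The key names below are guesses for illustration only — check the repo for the real schema:

    ```yaml
    # Hypothetical config sketch — field names are illustrative,
    # not DataForge's actual configuration schema.
    tools: tools/schemas.json     # tool definitions (JSON)
    pools: data/pools.yaml        # value pools for slot filling
    output:
      format: chatml              # e.g. openai | sharegpt | chatml
      sft_examples: 600
      dpo_pairs: 60
    quality:
      min_score: 0.7              # configurable threshold, not vibes
    ```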

    Repo: https://github.com/adoslabsproject-gif/dataforge

    https://nothumanallowed.com/datasets
