18 points | by lastdong 5 hours ago
1 comment
Hmm... at Q4_K_M, stock-style quantization retains ~99–99.8% of BF16 accuracy, and AutoRound pushes that to ~99.4–100.n% (??). The gap is roughly 0.1–0.7 percentage points.
https://github.com/intel/auto-round/blob/main/docs/gguf_alg_...
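For anyone checking the arithmetic, here is a rough sketch of how a "retention" figure and a percentage-point gap like the ones above could be computed. The accuracy values are made up for illustration and are not taken from the linked doc, and `retention_pct` is a hypothetical helper, not part of AutoRound or llama.cpp. It also shows why a figure above 100% is possible: retention is just a ratio, so a quantized model that happens to score slightly higher than the BF16 baseline on a benchmark lands above 100%.

```python
def retention_pct(quantized_acc: float, bf16_acc: float) -> float:
    """Quantized accuracy expressed as a percentage of the BF16 baseline."""
    return 100.0 * quantized_acc / bf16_acc


# Hypothetical benchmark accuracies (illustrative only, not from the linked doc).
bf16_acc = 0.700
stock_q4km_acc = 0.695
autoround_q4km_acc = 0.699

stock_retention = retention_pct(stock_q4km_acc, bf16_acc)
autoround_retention = retention_pct(autoround_q4km_acc, bf16_acc)

# Difference between the two retention figures, in percentage points.
gap_pp = autoround_retention - stock_retention

print(f"stock:     {stock_retention:.2f}% of BF16")
print(f"AutoRound: {autoround_retention:.2f}% of BF16")
print(f"gap:       {gap_pp:.2f} percentage points")

# A quantized score slightly above the baseline gives retention over 100%.
above_baseline = retention_pct(0.701, bf16_acc)
```

With gaps this small, benchmark noise could plausibly account for part of the difference, so the numbers are best read as a range rather than a fixed margin.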