13 points | by fagnerbrack 2 hours ago
2 comments
This is from 2024, which is ancient history. It's not true anymore: models are now trained to resist abliteration by spreading out the refusal encoding.
See https://arxiv.org/abs/2505.19056
That doesn't prevent abliteration. The creator of XTC/DRY is also a chad who makes sure that you can really access the model's full capabilities. Censorship is the devil.
https://github.com/p-e-w/heretic