Story of How I'm Running an Unlimited $6/Month AI Provider on 4x RTX 3090s

7 points | by yolo-auto 2 days ago

3 comments

b--l 2 days ago

`qwen-35b-3a` is really garbage (I know well because it's what I run locally). What quant are you running? Would people really pay $6 a month for it even if unlimited?

That said, nice looking site and wish you the best for getting it off the ground.


yolo-auto a day ago

So on the MI300X we run FP8, and that was the original plan.

We are now doing some weird Q4 quant on the 3090s that is surprisingly good (for Qwen).

Will people pay for it? Yeah, sometimes. Some dude just burned 80 million tokens in 24 hours and posted it to our Discord. I'd say he's getting his 600 pennies' worth.

I think the best part is that no one is watching what you're doing, you don't have to worry, and it's fun.

Also, people are just as divided on liking or hating the name and brand as they are on whether it's a good deal.

hyde0395 20 hours ago

Interesting! I've never used Claude Code to code with a local model before. Is it smooth with a local model?