I think this is old news, but this model does better than Llama 4 Maverick on coding.
LLaMA 4 is pretty underwhelming across the board.
I wonder why we are getting these drops on the weekend. Is the AI race truly that heated?
Judging from my blog, I get much more engagement on the weekends.
> March 28, 2025
I guess a lot of people do their regular 9-5 through the week and play with new stuff on the weekends. But also yes, it truly is that heated.
IIUC engineers in China only get one day off per week. IDK if that's hyperbole or not.
I’ve tried QVQ before; one issue I always stumbled upon was that it entered an endless loop of reasoning. It felt wrong letting it run for too long, since I didn’t want to over-consume Alibaba’s GPUs.
I’ll try QVQ-Max when I get home. It’s kinda fascinating how these models can interpret an image, actually understand what is in it, and reason about it. So cool.
The about page doesn’t shed light on the composition of the core team or their sources of income or funding. Am I overlooking something?
> We are a group of people with diverse talents and interests.
I think the Qwen team is Alibaba's AI arm: https://qwenlm.github.io/about/
Naming things is truly hard even for the smartest people.
I appreciate Sony and AI companies for not naming their wares Cake Super Pro or such.
Just going off the blog post, this seems like a multimodal LLM that uses thinking tokens. That’s pretty cool. Is this the first of its kind?
WHY
So how do I run it locally?
Isn't "thinking" in image mode basically what ChatGPT 4o image generation does?
Not at all. GPT-4o does image output; this model (and the previous Qwen release QvQ - https://simonwillison.net/2024/Dec/24/qvq/) is image input only, with a "reasoning" chain of thought to help analyze the images.
Unfortunately, no open weights this time :(
https://ollama.com/joefamous/QVQ-72B-Preview
Experimental research model with enhanced visual reasoning capabilities.
Supports context length of 128k.
Currently, the model only supports single-round dialogues and image outputs. It does not support video inputs.
Should be capable of images up to 12 MP.
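For anyone wanting to try that earlier preview locally, here's a minimal sketch, assuming the Ollama CLI is installed and the joefamous/QVQ-72B-Preview listing above is still published (note it's a 72B model, so expect to need substantial RAM/VRAM even quantized):

```shell
# Guarded so it degrades gracefully when the Ollama CLI is absent.
if command -v ollama >/dev/null 2>&1; then
  # Pull the community QVQ-72B-Preview weights and start an interactive chat.
  ollama pull joefamous/QVQ-72B-Preview
  ollama run joefamous/QVQ-72B-Preview
else
  echo "ollama CLI not found; see https://ollama.com for install instructions"
fi
```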
>Last December, we launched QVQ-72B-Preview as an exploratory model, but it had many issues.
That's an earlier version, released some months ago; they even acknowledge it.
The version they present in the blog post, which you can run on their chat platform, is not open or available for download.
The wisdom of open weights is hotly debated.
Wisdom? I don't get what you mean by that. What is clear is that open weights benefit society, since we can run the models locally and privately.
Have you sought out alternative positions that you might be missing?
I notice four downvotes so far for stating a fact that a debate exists. My comment above didn't even make a normative claim. For those who study AI risks, there is indeed a _debate_ about the pros and cons of open weights. The question of "what are the implications of open-weight models?" is not an ideological one; it is a logical and empirical one.
I'm not complaining; it is useful to get an aggregated signal. In a sense, I like the downvotes, because it means there are people I might be able to persuade.
So how do I make the case? Remember, I'm not even making an argument for one side or the other! My argument is simply: be curious. If appropriate, admit to yourself e.g. "you know, I haven't actually studied all sides of the issue yet; let me research and write down my thinking..."
Here's my claim: when it comes to AI and society, you gotta get out of your own head. You have to get out of the building. You might even have to get out of Silicon Valley. Go learn about arguments for and against open-weights models. You don't have to agree with them.
Is there a good wisdom benchmark we can run on those weights? /s
[dead]