I used Claude Code to get a second opinion on my MRI

(antoine.fi)

119 points | by engmarketer 2 hours ago

167 comments

  • sxg 2 hours ago

    I'm a radiologist but can't really weigh in without seeing the full 3D MRI dataset. Regarding this point:

    > They performed shockwave therapy on my shoulder even though a recent clinical practice guideline says clinicians should not use or recommend shockwave therapy for rotator-cuff tendinopathy without calcification; I was told during ultrasound that there was no calcification.

    Ultrasound isn't a great way to assess for calcification. It'll find large calcifications but easily miss small ones. Plain radiograph would be more helpful, but the MRI may have revealed it as well. Either way, shockwave therapy isn't harmful in the absence of calcification--it's just not helpful.

    Edit: when a radiology report says something isn't present, there's always an implicit caveat that the finding isn't present within the context of the modality and images obtained. So an ultrasound report can state there are no calcifications while a plain radiograph can report the presence of calcifications without being inconsistent. Obviously very confusing to patients and people unfamiliar with medical jargon, but clarifying this in reports would make them sound even more qualified, "hedgey", and annoying to read than they already are.

    • rafterydj 2 hours ago

      I feel like I'm going nuts.

      There are other commenters saying this is a good practice they've also done for other injuries. You are saying you are an actual radiologist and immediately clock the problems with its advice.

      I have seen this pattern over and over again. Anytime someone is an actual expert at anything, AI output appears insufficient or incomplete or outright misleading. Only when you do not know what the AI is being asked to do are you likely to find the output helpful.

      This is itself alarming to me, but no one else seems to find this to be quite damning for the AI services being offered, preferring instead to be wowed by the convenience and speed at which they can be delivered unreviewed and unproven information.

      • appplication an hour ago

        This is the root of AI psychosis. There’s a lot to unpack here, and I won’t go too deep because you can’t really have a discussion with affected folks because their fundamental basis is not evidence, it’s belief.

        It is weirdly religious in a way, because if you were to present contrary evidence (e.g. experts in a field weighing in about how plausible sounding responses are bunk), you would only be told you don’t believe enough in the long term potential and capabilities.

        Don’t get me wrong, I think we all agree capabilities will eventually improve (and farther-future capabilities could reasonably surpass experts), but it really is unclear if the current transformer architectures with their probabilistic/hallucinatory outputs will plateau before they surpass current experts’ abilities in all promised fields.

        • sublinear 14 minutes ago

          Human expertise is also improving all the time and not limited to just connecting dots. When AI seems to surpass a particular human, it's just because the human lacks broader knowledge and fails to investigate further.

          An expert already knows they don't know everything. That was never the point. Critical thinking cannot be delegated to AI any more than it can be delegated to a book. There is nothing new going on here.

        • lazide 37 minutes ago

          I don’t think they will improve, there is too much incentive to poison the datasets going forward.

          A lot of the models up to this point have benefitted - like Google did - from an essentially ‘pre SEO’ internet.

          Now the same tools are being used to generate nigh infinite good sounding bullshit, which poisons the dataset in all sorts of hard to detect ways.

          To add insult to injury, the human experts are also not as naive, and have many incentives to poison their own input in subtle ways too.

          • brokencode 11 minutes ago

            I seriously doubt that data set poisoning will be a real limiter in model performance.

            For one, if your website/book is poisoned, who is going to trust it for anything at all, much less for training models?

            For two, all the major AI labs hire or contract for subject matter experts to create curated data sets, evaluate model performance, etc.

            Unless they hire malicious experts, this will provide a growing, high quality data set that should drown out any poisoned pretraining data.

            • microgpt 3 minutes ago

              Pretty easy to display one thing to verified browsers (just latest few user-agents from the 10ish different mainstream browsers on the 3 main OSes) and another to anything else.

              Yes AI scrapers can easily spoof user-agent, but they fall out of date as the browser updates.

              Bit harder to catch them in tarpits and then serve nonsense to whoever triggered the tarpit.

          • rvnx 30 minutes ago

            Human doctors use LLMs to diagnose too

            OpenEvidence claims

                "More than 40% of U.S. physicians use it daily, and it handled around 20 million clinical consultations per month. Over 100 million Americans were treated by a doctor using it in 2025."
            
            https://www.cnbc.com/2026/01/21/openevidence-chatgpt-for-doc...
            • something98 17 minutes ago

              This is a very misleading statement; most of those physicians are using LLMs to transcribe notes from visits and/or for billing purposes (e.g., proper billing codes).

              • brokencode 9 minutes ago

                OpenEvidence is specifically meant to help clinicians make evidence-based decisions in the diagnosis and treatment of patients, not note transcription.

            • sarchertech 5 minutes ago

              Ignoring the fact that this number comes from a company press release, it doesn’t say anything about the number of doctors using it to diagnose, just that they use it.

              If a physician uses Google to search for a dosage chart for some drug they rarely prescribe, you wouldn’t say they are using Google to diagnose the patient.

      • qnleigh 27 minutes ago

        Totally agree. I'm a scientist, and like most scientists I have some specialized skills that most of my colleagues don't. AI has empowered them to learn and build things that they might have otherwise needed me for. But there have been quite a few cases where it led them very far down a wrong path. This has started happening way more often in the last few months.*

        We've known since the beginning that AIs confidently say incorrect things. But now that they can speak confidently about very complex topics, and mostly say correct things, we are letting our guard down and lots of subtle falsehoods are slipping through.

        *In one case, I was able to put things back on track because the AI suggested my colleague talk to me; somehow it figured out we were co-workers.

        • bitlad 22 minutes ago

          >very far down the wrong path.

          Absolutely agree. Have seen this first hand

      • mattgreenrocks 2 minutes ago

        You're not. This site was also bullish on using LLMs as therapists, which defeats the very point of them, and reflects a lack of knowledge on what exactly therapists do for people.

        More on topic: if the article's author arrived at a definitively negative result would this have shown up on HN?

      • sbarre 2 hours ago

        > Anytime someone is an actual expert at anything, AI output appears insufficient or incomplete or outright misleading

        Yes, this is exactly so. AI is able to confidently sound plausible enough to convince laypersons or anyone who isn't very familiar with the subject matter, which is a big part of the mass-appeal "magic" of ChatGPT and other similar tools. It's like having a know-it-all friend (who also makes shit up to bridge their own knowledge gaps).

        In many non-advanced non-specialized situations, AI is right enough to be at best useful or at worst not harmful (usually landing in the middle somewhere).

        But speaking for myself, in areas where I consider myself quite proficient, I can very easily spot the subtle inconsistencies and naive conclusions that AI responses provide, and I have to guide/steer/correct it a lot to get good results when the subject matter is complex enough.

      • meowface 31 minutes ago

        I may be missing something, but I think it's unclear that the parent poster here is necessarily actually contradicting anything the AI said. It may depend on the exact information the OP wrote to Claude and GPT. The full transcripts would be needed. (Though there is definitely a separate point that a doctor would generally better know all the right questions to ask, while current LLMs may be making certain assumptions.)

        The LLM may have, from its "perspective", implicitly thought the OP was telling it that he had strong reason to believe there was no calcification and was not considering the bigger picture of possibly receiving an incomplete/poor assessment from the medical staff. In fact, the issue here may be the LLM overly trusting doctors vs. trusting its own expertise.

      • sxg an hour ago

        I see your argument, but it's not exactly news that an expert found a flaw in a popular tool. You could say the same about Wikipedia--experts have tons of issues with it, but Wikipedia still provides value to non-experts. The most likely alternative to Wikipedia for non-experts is simply not trying to learn anything new.

        Similarly with LLMs, you can't just write them off entirely because they sometimes provide misleading or incorrect advice. The positive utility maximizing view is to learn when you need to call in an expert. I recently moved into a new house and have used Claude extensively to figure out basic things (e.g., adjusting the garage door height, how to mount a TV). However, when the HVAC suddenly stopped working, I gave Claude a shot for an hour and tried some non-destructive fixes, but then realized I had to call in an HVAC expert.

        • frereubu 23 minutes ago

          Slightly OT Nitpick: in regard to experts and Wikipedia, when doing a neuroscience-adjacent MSc, experts in the field actually directed me to Wikipedia as an excellent source for high-level neuroanatomy, including recent research, so I'm not sure your blanket description about experts and Wikipedia is correct.

        • ohyes an hour ago

          The free alternative to Wikipedia is the library, not “don’t learn anything new ever”.

          I find Claude is surprisingly similar to a confident but incorrect coworker, with the benefit that Claude will reevaluate when I correct it.

          • sxg 36 minutes ago

            I used the phrase "most likely alternative" intentionally. The library is where people should go to get answers in a world without Wikipedia, but the vast majority of people won't. So in practice, most non-experts either learn from Wikipedia or don't try to learn anything at all.

          • bflesch 41 minutes ago

            Claude will do everything to retain you as a user, because that's one of their most important metrics.

      • ffsm8 16 minutes ago

        > This is itself alarming to me, but no one else seems to find this to be quite damning for the AI services being offered, preferring instead to be wowed by the convenience and speed at which they can be delivered unreviewed and unproven information.

        This point is being raised in literally all discussions about llms for the whole last year, if not longer.

        What it omits is the fact that these people getting suckered into the ai psychosis are using non-specialized models without an agentic loop while knowing nothing about the topic they're using the ai for.

        That's down to the fact that this tech hasn't really been integrated yet and people are using them widely (and) irresponsibly, but it's not necessarily something you should blame LLMs for - the cause is likely more down to the model providers’ marketing and our collective tendency to like self affirmation / thinking we ourselves know best.

      • nlawalker 2 hours ago

        > no one else seems to find this to be quite damning for the AI services being offered, preferring instead to be wowed by the convenience and speed at which they can be delivered unreviewed and unproven information

        "Be wowed by the convenience and speed", or merely "take advantage of the mere availability"? What most people find to be damning about expert advice is that they simply can't get it anywhere, at any cost that they can afford.

        • whatever1 2 hours ago

          So if you want to do a surgery but you don’t see any surgeons around you ask a grocery butcher to have his way?

          • sxg an hour ago

            In certain circumstances, the answer is yes. If an airplane's pilots are incapacitated, do you simply give up and crash the plane because there are no other pilots on board? Or would you rather have someone on the ground try to coach a passenger into at least attempting to land the plane?

            • frereubu 27 minutes ago

              That's an extreme edge case, which I don't think is in the context of the concerns in this thread.

              • sxg 13 minutes ago

                The specific case doesn't matter--it's meant to make you think about the general question throughout this thread: when an expert isn't available, should non-experts use AI (or other tools) to help themselves? Sometimes the answer is yes because the potential benefits outweigh the potential harms (if any harms exist). But sometimes the answer is no because misleading/incorrect advice can cause a net harm.

            • close04 25 minutes ago

              A passenger crashing the plane while trying to avoid a certain crash doesn’t make things any worse. An incompetent doctor trying to save you from certain death can make things so much worse. It’s all about weighing the best/worst outcome compared to where you are now.

            • ChrisMarshallNY an hour ago

              As long as that passenger didn’t have the fish.

            • jancsika 27 minutes ago

              You can choose a) a calm, level-headed passenger who knows they aren't a pilot, or b) a calm, level-headed passenger who almost has their pilot's license but has a medical condition that prevents them from admitting when they lack certain knowledge.

              Who do you choose to be coached by an expert on the ground?

              • rvnx 25 minutes ago

                No thank you, I will ask Claude and then ask ChatGPT to challenge me, and do a couple of rounds like that.

                The first: Has no clue about anything and therefore no useful knowledge and cannot challenge me

                The second one: Is proven to willfully give wrong information and makes even basic mistakes.

                The LLMs will do their best, even if imperfect, since they summarize what appeared in books.

                I prefer to be grounded in what Airbus / Boeing manuals or pilot training books say than in two far more unreliable sources.

          • EA-3167 an hour ago

            People, especially in medical crises, are desperate for answers that they often can't get because their clinicians don't know. The illusion of an all-knowing guru who sounds like their doctor and tells them ANYTHING is extremely alluring. Waiting to hear back from a doctor about test results (which these days probably showed up on your online account the moment they were completed) can be agonizing.

            Ok for pain in your shoulder it might not, but how about a woman with a lump in her breast waiting for the mammogram interpretation? How about someone trying to understand disturbing lab results? People are also often pushed these days to move through visits with doctors at a breakneck speed, but the AI will "hear you out" all day.

            Part of this is a problem with the AI, part of it a problem with our healthcare systems, and part of it is simply human nature. If you think that OpenAI, Anthropic, Google and the rest weren't aware of this going in you must have very little faith in the intelligence of their members. It's not hard to imagine the future of LLMs should involve a hell of a lot of liability on the companies running them, but for now it's the Wild West.

            • bilsbie an hour ago

              > but how about

              Whatever scenario you come up with my answer is the same.

              As an adult I’d like to be able to choose what tools I use to learn about my condition regardless of how well it works or even if it’s likely to mislead me.

              There’s risk in every aspect of life and we can’t baby proof everything.

              • baconmania 36 minutes ago

                >choose what tools I use to learn about my condition regardless of how well it works or even if it’s likely to mislead me.

                Even if it "works" so poorly that you're not actually learning about your condition?

              • EA-3167 32 minutes ago

                If it's helping you learn about your condition then sure I agree. The issue here is that's not really the case, it's giving you the illusion that you're learning about your condition while feeding you hallucinations and half-truths at best. A recent look at medical advice from these things showed they're no better than a coin flip.

                So if you MUST have answers that are at most random guesses, I'd suggest saving a few bucks and asking a coin before flipping it.

      • highfrequency 2 hours ago

        Seems natural enough. There will always be complexity and nuance that is missed by an AI model or person - the world is just super detailed. The more expertise you have the more you will be aware of that nuance. That doesn't mean the model or person is not useful as a starting point.

      • je42 31 minutes ago

        The question is how far off AI is compared to the professionals that we have access to. The world's best experts are not accessible to most of us. :(

      • kryogen1c an hour ago

        On the flip side of this problem, the medical standard of care lags novel best practices, other human failures like corruption and competing priorities notwithstanding.

        For example, we had to advocate for certain practices during the birth of our first child that became routine during our second several years later.

        So, neither side is guaranteed correct, doctor or citizen researcher (which did not include LLMs in my case, for the record). The truest answer is also the most useless one, applicable to all fields: it depends.

        The real question is: if you embrace being a layman, whom do you trust more: LLMs/the internet or experts, like doctors? I think the answer is pretty clearly experts.

      • jstummbillig an hour ago

        No, not anytime someone is an actual expert at anything, AI output appears insufficient. That is why experts in various fields use AI.

        Then to say "Aha, but all of that is AI psychosis" obviously makes no sense: Why would we trust experts when they offer critique but not when they say "this is helpful"?

        Overall: People are not insane. AI makes mistakes and, often, fails completely. AI also helps them do things better, quicker, increasingly so. The jaggedness of AI is confusing and real.

        • torben-friis 38 minutes ago

          How many times have you seen an expert go "yeah these results are good consistently enough for a non expert to trust them without expert assistance"?

          There is a huge difference between having a chance of a good result, which can be useful for experts able to filter out the bullshit, and consistent success. I would generate code as a helper, I would never allow a guy from marketing to merge unreviewed AI code.

          • hectdev 25 minutes ago

            That's what I would like to call job security. When you know how to read what is wrong, you can easily catch the mistakes and correct them. AI gets you there faster by doing a lot of things right and you correct the mistakes.

        • lazide 32 minutes ago

          I’ve never seen an expert use AI in their field beyond the initial ‘oh interesting’ stage.

      • tomaskafka 39 minutes ago

        Yes. The PM’s “with AI I know enough to be dangerous, haha” means “I’m actually dangerous and I don’t realize”

      • beering an hour ago

        TFA doesn’t actually state where the bit about shockwave therapy came from and it wasn’t the main point of the article. The concern was about being given useless therapies. The homeopathic analgesic is concerning, at least to me.

        I.e. nothing this radiologist said was related to the LLM’s advice.

      • Hikikomori 21 minutes ago

        It's like reading news articles. Seems reasonable until you read an article about something you know, then you see how wrong they can be.

      • suttontom 36 minutes ago

        Your instinct is correct, and in a lot of cases it's true. However, I've heard from enough doctors by now (a cardiologist, psychiatrist, and epidemiologist/former physician) that they use medical LLMs and find them extremely helpful, mostly as a way to either bring up knowledge they'd forgotten about or as a way to learn something new and then verify it. I'm extremely skeptical about LLMs in general and the connection to Gell-Mann Amnesia is apt, but I wouldn't necessarily write them off completely like that. There are experts using the models that find them genuinely helpful in their field.

        • GTP 28 minutes ago

          Probably this is the point, and it's a point that has been brought up a lot of times in the past, maybe less in recent times: you need to know the things you're applying an LLM to. In this way, you can keep the good outputs while having the expertise to discard the bad ones.

      • parineum 2 hours ago

        > I have seen this pattern over and over again. Anytime someone is an actual expert at anything, AI output appears insufficient or incomplete or outright misleading.

        AI isn't even the first instance of this phenomenon, news articles are like this as well.

        https://en.wiktionary.org/wiki/Gell-Mann_Amnesia_effect

      • newsclues 2 hours ago

        LLM is not necessarily an expert system. Once there are expert systems for law, healthcare, accounting, governance…

        https://en.wikipedia.org/wiki/Expert_system

      • meindnoch 39 minutes ago

        We're past the point of Gell-Mann amnesia. This is full blown Gell-Mann psychosis.

      • grayhatter 22 minutes ago

        > This is itself alarming to me, but no one else seems to find this to be quite damning for the AI services being offered, preferring instead to be wowed by the convenience and speed at which they can be delivered unreviewed and unproven information.

        Welcome to the club? This new awareness you've found over the true quality of LLM based GenAI output has been what "all the haters" have been mad about for-ever. That the output of LLMs is clearly defective, and they have merely found a cute trick towards making humans think they're less defective than they are actually measured to be.

        And the corresponding anger and frustration to push the risks of genai output out onto others, while also aggressively pushing it as a feature you should be using already. You're behind don't you know, and whatever other lie I have to tell to trick you into enough FOMO to pay me 200USD/mo so I can sell FOSS back to you.

        An LLM can only output the mean next likely token, and then add a bunch of extra noise on top of that so it feels interesting and not repetitive. None of this is new, the problem is, 50% of humans are below the mean, but have no idea. So when an LLM tells them some lie: well, it sounds so helpful! It's impossible for someone who sounds this helpful to lie to me, liars never sound confident! It must be PERFECT! I'm gonna tell everyone how perfect it is. so the bottom 0-33% think LLMs are fantastic tools that make nearly 0 mistakes in comparison to the bottom 33%. 33-66%-ish aren't sure, some times it's great, but it will make that random mistake sometimes, but I can catch most (or all of them depending on ego). and the 66%+ are angry about how many people are getting tricked by something so obviously low quality, or are lucky enough to not have to care.

      • silisili an hour ago

        This is natural and even logically expected. It's just Gell-Mann amnesia in action. The world has more people spouting on things than it has people knowledgeable in said things.

        Apply that to the Internet at large, and realize where LLMs got their training. They're basically ConfidentlyIncorrect personified.

    • foobarian an hour ago

      Huh, I'm reading and looking up these words you guys are saying and it is starting to look exactly like the symptoms I have been having with my own right shoulder! I feel like a giant gaping rabbit hole just opened up next to my desk.

    • engeljohnb 43 minutes ago

      > I'm a radiologist

      Any comment that doesn't start with this or similar qualification should be taken with a grain of salt (yes, including this one).

      Medical imaging is one of those things everyone thinks is simple because they don't know what they don't know. I'm a cardiac sonographer, and I have to assume radiologists hear at least as many eye-rolling takes on AI coming for their job as I do.

      • lostlogin 29 minutes ago

        Ahh, AI is coming for your job.

        Full sarcasm, is there one that’s more immune?

    • tiahura 2 hours ago

      Why isn’t diagnostic ultrasound used in orthopedics? They inspect fetus hearts and other organs everyday, why not shoulders? Seems much cheaper and faster.

      • sxg 2 hours ago

        They do. Ultrasound in orthopedics is a relatively newer field, and there aren't quite as many sonography techs and radiologists experienced in reading these studies, which is likely why you don't see it offered more widely.

        Edit: I should mention that ultrasound is basically unusable for evaluating bones. Sound waves can't penetrate bone, and so you end up just seeing a huge black void. That's a huge orthopedics use case that ultrasound just can't benefit. However, ultrasound is fantastic for evaluating muscles, ligaments, tendons, and other superficial soft tissues.

      • scrollop 28 minutes ago

        We order ultrasounds all the time for shoulders (for like soft tissue issues; for trauma, you'd start with an xray). For other joints, such as the knee, MRIs are a better choice (unless there has been substantial trauma, in which case xray initially or further), though more expensive, unless you're excluding a Baker's cyst, in which case an ultrasound is fine.

        Since MRIs are more expensive, private doctors might order them instead of ultrasounds.

        (I'm a doctor)

      • bflesch 36 minutes ago

        It's a manual, non-standardized process without a standardized output. Image quality depends both on user skills (how deeply they press the sensor on the skin) and the machine they have. Unlike CT/MRI the examination results cannot be easily shared and compared between patients for studies.

  • Gareth321 a few seconds ago

    I have had terrible experiences with medical professionals. Especially the experienced/senior/specialists. First, they just don't have the time to do thorough research into my medical history. Second, they are often arrogant and resistant to any kind of critical questions. They have an apparently unwavering belief that they are correct. In fairness, they probably usually are, but they are not infallible, and they are at their weakest when it comes to the edge cases.

    AI is completely without ego, and can process all my medical records in minutes. In truth, even today, I would rather have an AI analyse my records.

  • AceJohnny2 34 minutes ago

    > There's something incredibly peaceful about being in the hands of an expert you trust. [...] AI can absolutely shatter that feeling in an uncomfortable way [...] but I don't know if I can fully trust AI either.

    This really is key. We know we can't trust the AI, but at the same time we're also more comfortable asking the AI for clarifications or confronting it. Not having a time-bound appointment or paying by the hour helps a lot. But even then, more information doesn't necessarily help!

    I once brought my 11-year-old car, a Civic with 150k miles, to multiple garages. I figured I'd play the "second opinion" game to correlate what the garages recommended to decide on what needed to be done...

    I got 3 completely unrelated recommendations, including one that I knew was invalid. I felt worse off than when I started!

    The solution to uncertain information isn't more information, which the AI can certainly provide, it's better information, and AI cannot currently provide that.

    • Aurornis 6 minutes ago

      I have multiple LLM subscriptions at any given time, plus an array of local models.

      When I ask a question outside of my domain of expertise I like to ask all of the LLMs I have access to. I also create separate sessions and ask the same question multiple ways.

      It’s revealing to see how many different and contradictory answers I get, most of which are presented confidently.

      The last time I ran a medical question through Claude I couldn’t even get consistent answers between sessions.

      It’s also scary how easily you can lead each LLM to the answer you have in mind. When I would start asking questions about different options that other LLMs had presented, each session would drift toward that explanation.

    • ed_elliott_asc 7 minutes ago

      The soothing sound of ChatGPT telling us how right and clever we are…how could it possibly hallucinate, certainly not 5.5

    • 010101010101 14 minutes ago

      > The solution to uncertain information isn't more information, which the AI can certainly provide, it's better information, and AI cannot currently provide that.

      I'd argue that AI _can_ currently provide that, but that it can't do it _reliably_, and that to non-experts it's impossible to differentiate, which makes it all the more dangerous.

  • hennell an hour ago

    Personally my favourite feature of the new ai world is not when I use it directly but when one of my managers uses it to try to fix a problem, then issues me their findings and I have to defend my process to someone who understands neither my process, their suggested solution nor often the problem they're solving in the first place.

    • cube00 an hour ago

      It gets worse when they try to challenge your solutions by feeding them back into the LLM and sending its output on to you. Arguing with an LLM is exhausting; arguing by proxy with a human parroting its responses is excruciating.

      On the plus side when they do this they can't flood your calendar with those "quick chat" meetings because they know they won't be able to hold a conversation on the issue beyond the first minute.

    • willsmith72 an hour ago

      True, but this was a problem long before AI (read this article, met this guy at a conference who told me x, my boss said blah)

      AI probably exacerbates it but crappy managers exist regardless

      • nitwit005 an hour ago

        Before maybe you had to deal with someone hiring sketchy consultants once in a while, but now the managers have a limitless well of dubious answers to draw on at any time.

        • darkwater 28 minutes ago

          But now you have a new tool in the upmanagement toolbox: subtly tell them to implement their idea in prod with Claude Code, and see it for themselves.

  • dwa3592 6 minutes ago

    Was it 2016 when Geoff Hinton said that radiology was a dead career?

    Well, we now have the best model of our time (trillions of $$$ of investments) telling us something completely different (and wrong) from a human expert. I would really like someone to call out Dario, Sam, and Elon on these things and hear their explanations, but alas, a man can only dream.

  • throwforfeds 40 minutes ago

    I've seen a lot of friends and family members almost immediately get offered surgery for shoulder pain. It's just often the default for people that do surgeries for a living.

    I also had a pretty painful shoulder issue at one point, where the pain just wasn't subsiding for months. I tried massages and acupuncture as I didn't want to do surgery, but it wasn't helping at all. The thing that fixed it for me was just really focusing on doing pull-ups. I couldn't do them at all when I started, so I began with dead hangs and scapular pull-ups, eventually progressing to regular pull-ups, and then training with a "grease-the-groove" method once I could get a few per set. I stopped the training schedule once I was getting in around 17 pull-ups per set, and now just do 6 sets of about 7-8 pullups 3x per week spaced throughout the day. I'll also do some shoulder mobility drills [1].

    Whenever I get lazy about keeping up with them, discomfort inevitably starts arising again, but it goes away once I get back to strengthening.

    [1] https://www.youtube.com/watch?v=vP8YmmRMz6I

    • alistairSH 31 minutes ago

      On the flip side, when I had rotator cuff issues, the surgeon recommended months of physiotherapy before resorting to the knife. And it worked. And by weight training regularly with a focus on correct shoulder movement, the pain stays away.

      It really seems like if you, as a patient, go looking for a quick fix, that’s what you’ll be offered. And if you educate yourself a bit and then go looking for the best fix for you, you usually get that.

  • nostrebored 44 minutes ago

    I don’t understand the negative reactions. Medical care as it exists requires the doctor and patient to have their brains switched on. I’ve almost never had a problem where a doctor provides me with a diagnosis and I go about my day. Most of the times that I have, I’ve been confident about the problem and known what I needed. The doctor was a barrier to accessing care.

    Dr. GPT is a good brainstorming tool. It helps synthesize information in a way that primary texts don’t. But it does force you to say “that doesn’t make sense”.

    I do think that people saying “doctors don’t know the state of the art” have a weaker case. If you think about it in terms of token density during pretraining and how post training datasets are constructed, I think it would take us a very long time to adapt to any fundamental shifts. If we have forgotten how to cure scurvy, how many journal articles would it take before we adapt to a discovery?

  • jeswin 2 hours ago

    I would not trust AI on images. But I once had ChatGPT tell me that an MRI report was very likely to be incorrect based on the text, and offered a different diagnosis. Since it was semi insisting, I visited another doctor who made me do a retest. Long story short, ChatGPT was correct.

    Again, this is just one single person's experience. So not worth much.

    • nostrebored an hour ago

      I think that much of the visual gap is because what to attend to in images is less structured. Anecdotally, small Qwen finetunes (i.e. less than 10B) take task accuracy from sub 30% on FMs to 90%. We have sold some of these for outcome-based back office tasks.

      I think we’ll see a lot of specialized VLMs that provide real value.

  • rasmus1610 35 minutes ago

    As a radiologist I have found Claude and ChatGPT to be absolutely terrible at MRI and I would not trust them one bit. They have their merits if you need to research stuff that is more text based, but radiological images are just something that they cannot interpret well enough (yet).

    • lostlogin 25 minutes ago

      AI makes up for its poor reporting by enhancing the images.

      Current Siemens MR software ‘Deep Resolve’ makes up the signal (adding about 50%), then makes up every second pixel, and then, for 3D sequences, makes up every second slice. It’s knocking about 59% of the time off each sequence. And it’s really really good. I’m an MR tech.

    • pickleRick243 18 minutes ago

      It's like people who expect ChatGPT to be really good at chess because chess engines with super-human performance have been around for decades, so obviously the latest frontier LLM that took billions to train should find the task trivial.

      Actually, I'm curious what ChatGPT 5.5's Elo is. I wouldn't be too surprised if it's 2000+ just from its basic understanding of chess principles from all the content it has digested.

  • hectdev 21 minutes ago

    My only issue with this is that the restriction of "Do not look at any data outside of our working folder" prevents the tool from doing what it does best. I would have given it access to PubMed to pull the latest research on the subject and validate.

    I wouldn't consider Claude itself to be the tool that does a job like this, but the tool that pulls in the best data and gives a supported suggestion. And then go through a number of iterations on where it failed to hone its assessment.

  • ricardobayes 2 hours ago

    That might be doctors' new nightmare: people who second-guess everything with AI. Previously it was "google your symptoms".

    • mettamage 2 hours ago

      Well I live in the nightmare that is the Dutch healthcare system [1]. There are many things that they will fix but they didn’t fix my sleep. A friend fixed my sleep. He is a doctor and prescribed me the right thing. The thing is, he shouldn’t have had to intervene. Without him I could have ended up poor and destitute as my sleep was wrecking me.

      And yea, I already did all the standard things. CBT for insomnia helped somewhat. My insurance didn’t fully cover it either, unless I was willing to wait for 8 to 12 months.

      And I recently met someone with slow moving metastatic cancer. Thanks to LLMs they will most likely live another 3 to 5 years extra since the Dutch conventional mainline treatment hasn’t been taken yet. But it is German doctors that helped them and Belgian doctors that pointed out in a second opinion that a lot more can be done.

      LLMs have a part to play. The false positives are awful, but I have seen care that averages 5 out of 10 when things become too complicated.

      Except for trauma treatment. The Dutch healthcare system is amazing once they diagnose classic PTSD.

      So it’s definitely not all bad but the trust I had when I was younger has been eroded quite a bit and LLMs can meaningfully step in, in my case at least.

      [1] I know there are worse systems. But from what I have heard there are clearly better systems nowadays. It has slipped a lot

      • simianwords an hour ago

        Hey what did you do to fix your sleep? Help us all and maybe an llm will index your diagnosis (hi ChatGPT)

        • mettamage 19 minutes ago

          For me what helped is taking 7.5 mg of mirtazapine. At higher levels it's an anti-depressant but at lower levels it's an anti-histamine. It gets me drowsy. Together with 0.3 mg melatonin it knocks me out. I only take it 3 times per week max to not have habituation kick in.

          So 3 days out of 7 days I have guaranteed good sleep. The other 4 days are a toss up. But an average of 5 days of good sleep is much better than 3.5 days out of 7 days.
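
          The averages above can be checked with a small sketch. It assumes a "toss up" night means a 50% chance of good sleep, which the comment implies but does not state; the function name and parameters are illustrative only.

```python
# Expected number of good nights per week.
# Assumption: a "toss-up" night has a 0.5 chance of being a good night.
def expected_good_nights(guaranteed_nights, toss_up_probability=0.5, nights_per_week=7):
    remaining_nights = nights_per_week - guaranteed_nights
    return guaranteed_nights + remaining_nights * toss_up_probability

with_medication = expected_good_nights(3)     # 3 guaranteed nights + 4 toss-ups
without_medication = expected_good_nights(0)  # all 7 nights are toss-ups

print(f"with medication: {with_medication} good nights per week")
print(f"without medication: {without_medication} good nights per week")
```

          Under that assumption the two figures come out at 5 and 3.5 nights, matching the numbers quoted.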

    • raincole 6 minutes ago

      People should've googled their symptoms and especially the prescriptions they got. It has always been a good practice. If[0] AI proves to be the new google then people should ask AI too.

      [0]: IF.

    • bilsbie an hour ago

      It’s funny every profession deals with customers making their own guesses at diagnosis.

      I told my mechanic the flim flam is broken but he said it was the rim ram. He fixed it and we all went on with our lives.

      But doctors insist on this God-like status so it’s a “nightmare” when patients try to help themselves.

    • weatherlite 2 hours ago

      Nightmare because they're always right and the A.I second guessing is always wrong, or because they just don't like to be second guessed?

      • tuvix 2 hours ago

        There’s more than two options here. It was already difficult to deal with self diagnosis for doctors, now we have a machine that outputs recommendations, and does it with confidence whether it’s correct or not.

        The same issues that were present with search-engine self diagnosis are still present with LLMs. If you provide Google with an incomplete list of symptoms and can’t interpret the information you find correctly, you will likely get an incorrect diagnosis. The same is true for LLM output.

        • rvnx an hour ago

          There are quite a few disclaimers everywhere that soften confidence: "always ask a medical specialist", "I'm not a doctor", "this could have been this or that but really not sure", etc.

      • drw85 2 hours ago

        Nightmare because the AI is just generating a random text that fits the question.

        • Legend2440 2 hours ago

          This is not a fair assessment of what AI is doing.

          Studies have found that newer reasoning AIs are about as good at diagnosing illness from a written description of symptoms as doctors are.

          Granted, it cannot actually examine a patient, so we're not replacing doctors anytime soon. But your view is obsolete.

          https://www.science.org/doi/10.1126/science.adz4433

          • Retric an hour ago

            They are using the “gold standard for the evaluation of expert medical computing systems” not a proxy for what a doctor actually does when diagnosing someone.

            It may have some utility after diagnosis, but doesn’t demonstrate utility for patients.

          • snackerblues 41 minutes ago

            It will also tell you you're God and/or a toaster. If you're gonna let benchmarks convince you to listen to an LLM on matters of health it's your funeral, just don't get anyone else killed with you please.

        • betaby 2 hours ago

          I feel the same when visiting a doctor in Canada. In the 2 minutes I have with them in my one appointment per year, I hear a standard text.

        • d1sxeyes 35 minutes ago

          Not quite. An LLM generates text that would likely follow. The sky is… “blue”. A patient in pain with a bone protruding from their shin has a… “broken leg”.

          The more training data, the more questions it can answer with a reasonable degree of probability of accuracy.

          Throwing away a potentially useful analysis just because it’s probabilistic seems a bit like throwing the baby out with the bath water.

        • poszlem 2 hours ago

          This is a very peculiar use of the word "random".

      • vimda 2 hours ago

        Nightmare because users approach LLMs with the false confidence that they're always right, and present LLM outputs as fact to Doctors who have to waste time explaining that it's wrong most of the time. It hurts more than it helps.

      • mixologic 2 hours ago

        It's a nightmare because it erodes trust. Doctors are not "always right" which is why "always get a second opinion" is codified in culture.

        But AI's problem is that it's completely full of shit, sometimes, and the people most qualified to evaluate whether it's full of shit are the doctors, not the patients, but just as in OP's original article, patients are left feeling like their second opinion from AI might be more trustworthy than their doctor's opinion.

        • simianwords an hour ago

          The notion that only doctors can verify is false! Doctors are better at verification but normal people can also verify. This is just empirically true.

          Examples of things normal people can verify

          - procedural errors that Claude can capture like some blatantly high dosage (grams instead of milligrams)

          - outdated treatment plan, maybe there’s a credible new treatment plan that’s been used for years but the doctors were not updated

          - literally being injected with homeopathic drugs (takes no smart person to flag this)

          Let’s stop talking as if doctors have a divine right here. And let’s accept some agency.

    • js2 an hour ago

      The NYT did this profile a while back: "Ben Riley was already writing about the risks of chatbots when his dad started trusting A.I. over his doctor."

      The dad was a retired neuroscientist who delayed cancer treatment against medical advice because he was certain he had been misdiagnosed based on his own research that he did with the help of A.I.

      https://www.nytimes.com/2026/04/13/well/ai-chatbots-cancer.h...

      There's a comment on the article from Ben Riley:

      > I am very grateful to Teddy Rosenbluth for sharing my father's story with the world, her kindness and curiousity proved to be restorative in ways I didn't anticipate.

      > The two words that everyone used to describe my dad: "intelligent" and "kind," and he was indeed both of those things. The sad irony here is that it was his human intelligence, combined with these strange new tools that purport to be a form of 'artificial' intelligence, that led to his ill-advised decision to forego the treatment he needed for his CLL. A doctor has already commented on this story with the observation that AI "confidently asserts erroneous conclusions," and we simply have no idea how often this is happening or the magnitude of the harm that results.

      > Not a day goes by that I don't feel the pang of my father's absence. He might still be here if not for AI. I try not to think about that, but sometimes I can't help myself.

      • rvnx 33 minutes ago

        The context is very important: decades of a poorly-diagnosed chronic illness had left him deeply distrustful of the medical system.

        This is the real root issue.

        At 75 years old, he was stubborn. Is that reasonable? Yes, perfectly. Could he have been right since the beginning? Certainly. Did he deny evidence? Yes.

        Zero doubt that he was intelligent, everything points in that direction, but that doesn't make a person less stubborn, because accepting the evidence also means accepting that you were wrong if you initially positioned yourself as adversarial instead of cooperative.

        He would have read Wikipedia, scientific papers, etc, even without AI.

        He did not want to be convinced. It works both ways:

        https://www.foxnews.com/health/woman-says-chatgpt-saved-her-...

        or

        https://www.today.com/health/mom-chatgpt-diagnosis-pain-rcna...

        Nonetheless, someone very smart, just didn't want to move from his position.

      • ieie3366 an hour ago

        GPT-4o, which is what that article is most likely about, was an older low-param-count slop model which was known for abusing emojis and sycophancy. It does not really have any relevance to the latest Claude frontier models.

        Your comment is akin to saying "Karen from facebook who is a human pushed essential oils and ivermectin as a cure to cancer. Now doctor Y is suggesting chemo. Both are humans, humans cannot be trusted!"

    • nosioptar 2 hours ago

      I asked a clanker about symptoms I was having. (I'm not an idiot, I was already on my way to hospital, clanker was just to take my mind off symptoms during the drive.)

      The clanker said I'd be fine, I just needed some rest and OTC meds.

      The medical staff immediately turfed me to surgery because the same set of symptoms I told the clanker were enough to concern them that I needed emergency surgery.

      Had I listened to the clanker, I'd be dead because I did need emergency surgery. (Hell, I almost kicked the bucket because I waited for someone to wake up to give me a lift because my insurance probably doesn't cover an ambulance ride.)

      • throw310822 2 hours ago

        Very curious what made you run to the emergency first thing in the morning that an LLM understood as "just normal, take some OTC meds and wait".

    • ilovecake1984 2 hours ago

      Indeed. I don’t even get what OP thinks they are getting out of this other than doubt.

    • gruntled-worker 2 hours ago

      This is obviously going to happen. But sub-par and sloppy doctors are a thing too. Medicine has been using semi-intelligent systems for years that were nevertheless found to improve outcomes.

      We need studies that quantify error rates from each source type, then we need to account for the fact that the artificial type will keep improving.

    • consp 2 hours ago

      It can help you understand the choices made, by asking questions, and thus reassure you, but it requires something most people lack: understanding that you are likely wrong, since you are just collecting information without understanding it.

      Pretty much like most managers these days, so I understand the frustration of the GPs.

    • gib444 23 minutes ago

      It's so much worse than some Google results: people see LLMs as a trusted friend who never talks back and never questions you, who is excellent at convincingly communicating their bs, reeling you in with "tell me more so I can really lock this down", continuing to fool you

      A con artist, a fraud

    • SeriousM 2 hours ago

      And say it's true because the AI said so.

    • rvnx an hour ago

      No, this flow is actually very good.

      Like any domain, when you have questions or need a solution, you do research first, then you ask a specialist.

      If you explain the symptoms and context well, you can get proper advice and then decide on the next path:

          Case A) It looks benign and the advice / information that you collected seems reasonable, then you go your way.
      
          Case B) You need a second opinion from a specialist because the subject is too complex, or there are medications for which you need approval.
      
      Once you have challenged LLMs, and read about the topics over and over then you genuinely become really good at understanding it (especially if you triangulate over LLMs and ask them to challenge, you start to have genuine questions). No matter if the answer is right or wrong, you have elements. Maybe you missed the point, but you come prepared.

      At home you have the time to assess the options, pros and cons of each approach, the possible questions to ask and then challenge the doctor.

      Shared decision-making is an actual evidence-based model of care, and patients who arrive understanding their condition and carrying specific questions tend to get better attention and better outcomes.

      Some doctors get annoyed, because they have a big ego and choose to be patronizing, but it is exactly their job to answer such questions.

          With LLMs, it's quite good, you get nuanced and rather useful answers.
      
          Before LLMs, no matter the topic you searched for, the answer was the same: "you have cancer / an [obviously deadly] rare disease"
      
      The other problem, in many places:

          • The doctors are not affordable
          • They are too busy for you (< 15 minutes)
          • You may need to wait months to get an appointment
          • They are not good (country-side is an example, and sometimes even country-level)
      + you can have all of these factors together.

      So, you have something deeply bothering you, your only appointment is in 4 months. It would be insane not to take the time to explore different solutions and not to come informed about the topic.

      If you express your prompt properly and do not rely on imagery, you can absolutely get top-tier advice.

  • linsomniac an hour ago

    ~2 years ago I used ChatGPT "deep research" to investigate a chronic sinus infection I'd been fighting for ~3 years. After seeing 3 GPs and 3 visits with an ENT, I fed all the observations I had into the AI. In particular, I couldn't get the ENT to explain why he visually saw, via a scope, evidence of allergic reaction in my sinuses, but then later concluded, after an allergy test, that it couldn't be treated via allergy medication. I asked this question a few times and he just never answered.

    ChatGPT surfaced an NIH study that concluded that 20% of people have allergic reactions that are isolated to a body location, and which shoulder "skin prick" testing may not reveal. I asked him about that and he said "that's not how allergies work". Full stop. He was unwilling to even look at the study.

    He prescribed a CPAP and regular nebulizer treatments. Side story: the CPAP place sent me an SMS message that I couldn't tell apart from a phishing attempt, and when I reached out to inquire who they were they never replied.

    So I decided: Let me just try taking a second-gen allergy tablet every day and see what happens.

    My sinus infections have gone away. Previously I was getting a major sinus infection at least quarterly. Maybe he's right that allergies don't work that way, but allergy tablets have absolutely solved my problem. Which I'm thankful for because I tried a CPAP for a solid month a few years ago and I just could not get used to it, and was sleeping like crap.

    • nostrebored 41 minutes ago

      Daily allergy tablets are associated with huge increases in early onset Alzheimer’s. Glad you found something that works, but might be good to get some of the allergen injections :)

      • cenamus 32 minutes ago

        Where are you getting that from?

        All I can find is about 1st gen antihistamines (i.e. Benadryl, which I doubt many people take daily, because of the drowsiness).

        Even for those, evidence seems to be mixed at best. "Huge increases" seems like hyperbole.

      • tnchr 33 minutes ago

        I believe it depends on which ones, the older gen or certain classes of antihistamines

      • darkwater 24 minutes ago

        Wait, what?? Now I'm getting into panic mode because I regularly take antihistamine tablets/pills (the newer ones, based on ebastine, because they don't make me feel sleepy).

      • meindnoch 23 minutes ago

        Misinformation.

        Only first-generation antihistamines with anticholinergic effects are associated with cognitive decline in elderly patients.

  • dazhbog an hour ago

    You should always be getting a second or third opinion from real doctors for matters like surgeries, radiology, etc.

    One doctor diagnosis + LLM is gonna throw you off. You need more datapoints.

    • ChrisMarshallNY an hour ago

      In the US, this is standard advice. I note that the OP is in Germany. Maybe they do things differently, there.

  • TSiege 2 hours ago

    Always worth a share for this scenario. It's not clear if LLMs are capable of doing actual analysis on medical imaging. For details see this article https://futurism.com/artificial-intelligence/frontier-models...

    > As detailed in a new, yet-to-be-peer-reviewed paper, a team of researchers at Stanford University found that frontier AI models readily generated “detailed image descriptions and elaborate reasoning traces, including pathology-biased clinical findings, for images never provided.”

    > In other words, the AI models happily came up with answers to questions about a supposedly accompanying image — even if the researchers never even showed it an image.

    > As opposed to hallucinations, which involve AI models arbitrarily filling in the gaps within a logical framework, the team coined a new term for the phenomenon: “mirage reasoning.”

    > The effect “involves constructing a false epistemic frame, i.e., describing a multi-modal input never provided by the user and basing the rest of the conversation on that, therefore changing the context of the task at hand,” the researchers wrote in their paper.

    > The damning findings suggest AI models cheat by diving into the data they were given — and coming up with the rest based on probability, even if it’s almost entirely conjecture.

    • kierangill an hour ago

      I work at a telemedicine company. We’ve benchmarked a few frontier LLMs on public medical imaging datasets. One test included high-quality and high-consensus otoscopic images. We didn’t anticipate the models to do well on something so niche, but what concerned us was how poorly calibrated the models were.

      I know you can’t trust an LLM’s self-assessed “confidence” of a prediction, but I’ve found that confidence can at least be directionally correct for some tasks. For our benchmarks, however, confidence was poorly correlated. What’s worse is that binary classification models (“Do you see $diagnosis in this photo?”) highly influenced the LLM to confidently predict $diagnosis.

      I’m concerned for those using LLMs for diagnostics, and getting confidently led to the wrong conclusion.

      • nostrebored an hour ago

        But the binary classification models can be made ternary easily. RL on congruence plus penalty for misdiagnosis is easy to set up and gives great results.

        What I’ve seen be the true bottleneck is people not setting up the structured data. But making a tiny reasoning model with OPSD -> GRPO is totally doable with a bit of money.

    • appplication 2 hours ago

      It makes a lot of sense if you understand how these models work, but this was a cool read anyway, and studies like this are important for curbing the unfortunate fever dream some folks seem to be collectively having about LLM omnipotence.

    • seanmcdirmid 2 hours ago

      I don’t understand how this is a different result than giving any LLM a task that is not completely grounded? I’ve observed this in coding tasks, if I forget to include a file referred to in the spec, the LLM will just hallucinate a version of it and my results suck. If I give it the file (and really, all the information I claimed it had access to), the task works fine. I fixed this in my pipeline with a prompt that does an extensive grounding analysis to determine if the assets I’m giving it are complete with respect to the spec (and that the spec is grounded as well, ie it doesn’t refer to something that is undefined).

      I wonder if the above problem can be fixed similarly? Just ask the LLM to do a conservative grounding analysis before jumping to the main task?
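
      A much simpler, deterministic cousin of that grounding check can be sketched as follows. This is not the prompt-based analysis described above: it only verifies that every file name a spec mentions was actually supplied. The backtick-quoted file-name convention and the function name `missing_references` are assumptions made for illustration.

```python
import re

# Assumption: the spec refers to files as backtick-quoted names with an extension,
# e.g. `config.yaml`. Real specs may need a different pattern.
FILE_REFERENCE = re.compile(r"`([^`\s]+\.[A-Za-z0-9]+)`")

def missing_references(spec_text, provided_files):
    """Return the files the spec mentions that were not supplied, sorted by name."""
    referenced = set(FILE_REFERENCE.findall(spec_text))
    return sorted(referenced - set(provided_files))

spec = "Parse `config.yaml`, then update `src/main.py` accordingly."
provided = ["src/main.py"]

missing = missing_references(spec, provided)
if missing:
    print("Refusing to start, ungrounded references:", missing)
else:
    print("All referenced files were provided.")
```

      Refusing to start when anything is missing means the model is never asked to work from a file it would otherwise have to invent.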

      • pickleRick243 13 minutes ago

        It's not different; there's a line of research and reasoning where people who don't use LLMs regularly point out issues that have been known (and more or less solved) for more than a year now (which is an eternity in the LLM space).

    • tracerbulletx 2 hours ago

      The only thing that matters is the success rate when they are actually provided an image.

    • consensus1 2 hours ago

      But why should I care? If you demonstrated that a model can perform more accurate diagnoses than a doctor, but also it had this strange behavior when no image was presented, why should that deter me from using the model?

      • swiftcoder an hour ago

        Because you don’t have any way of telling if it actually used the image presented, or based its conclusions on a different image it made up.

        • simianwords an hour ago

          Really? You know you could just ask it.

  • lucfranken an hour ago

    Why wouldn’t you as a doctor by standard run the images through a certified compliant LLM? The actual cost won’t be the issue, and then you can see if you get any new ideas from it. See if it’s just wrong or whether it spotted a little detail you missed.

    The LLM doesn’t need to be leading or whatever, but then you can have a conversation with the patient. If their ChatGPT report differs, that can be analyzed as well.

    It feels like the time constraint of the 15m doctor sessions is the thing. But if prepared immediately after the scan then why not?

    There is always time needed to factor in new developments and innovations and that’s fine. Just blindly moving work from human to LLM is wrong. But constantly learning and testing with all the incoming AI tools won’t be a waste. There will be more and more tools in those processes outside of human judgement, so it is better to improve the workflows now to be able to test and plug in new models and systems when they are ready.

    • KaiserPro an hour ago

      > standard run the images through a certified compliant LLM?

      Because they don't exist, yet.

      In the UK, MRIs and other imaging systems need two opinions. There has been a move to allow the first opinion to be ML based.

      The _problem_ is that you are basically doing grey smudge analysis, and that's fucking hard.

    • foobarian an hour ago

      I've been starting to think of LLM as a great tool for "lead generation," borrowing a term from sales. Most of the things it comes up with don't pan out, but in many cases it's things we wouldn't have thought of, or at least not as quickly. This is especially in the context of web service or SAAS outages.

    • yread 35 minutes ago

      Because they might bias you. And because you have your own brain, training and experience

  • eqvinox an hour ago

    > My hope is that in a couple of model generations, we'll trust AI to review MRIs the way we trust it to proofread our emails.

    https://www.nature.com/articles/d41586-026-01947-1

    I've started asking my doctors whether they use AI, and if they say yes look for another one.

    • rmbyrro 43 minutes ago

      That study seems to overlook confounding factors and rush to a questionable conclusion.

      A very plausible explanation for the adenoma detection rate to have gone down is simply that its prevalence went down among the population in the second three-month period.

      This was not a randomized trial. Concluding that "AI usage degrades physicians' skills" is questionable at the very least.

    • throwatdem12311 an hour ago

      I don’t even trust AI to proofread my emails.

  • LogicFailsMe an hour ago

    I did the same exercise here with medical reports and CT scans for a friend's cancer diagnosis and I got ahead of the oncologists predicting they were about to be cured. Spoilers: yep, cancer free now.

    And well, yes, I have the appropriate life science degrees to navigate clinical trial reports and research publications, and that was likely indispensable for steering Claude Code where it went, the radiologist's caution is merited here. But it's just not amateur hour for me to do this, it's 2 decades of academic research in my rearview mirror.

  • darepublic 44 minutes ago

    I would like it if we could have a site where you submit your MRI and doctor commenters anonymously post their opinions. In general I want a forum where, when people come with questions for which there are varying opinions, we don't just have people leave their 2c and then jet. The thread persists, duplicated ideas get merged, erroneous statements get purged and gradually we refine shining truth.

    • lostlogin 15 minutes ago

      I’m wondering how many radiologists want to work all day, then come home and work.

      Many can get paid fee-for-service for after hours work, so would probably prefer that.

  • Aeolun 2 hours ago

    I would not use Claude to get a second opinion on anything that’s an image.

    • rmbyrro 39 minutes ago

      I agree with you for some kinds of images, but not all.

      LLMs are the best PDF-to-markdown converters, in my experience. I have a CLI that converts PDF to PNG, then run a background agent to "read" each PNG and write it down as markdown; it works flawlessly even for complex math formulas, it can "translate" complex charts, graphs, and tables into words.

      It's slow and arguably expensive compared to traditional OCR, but very effective and precise.

    • maxall4 2 hours ago

      Especially an MRI, which is a 3D medium, something current LLMs are very bad at.

      • lostlogin 7 minutes ago

        > MRI which is a 3D medium

        The finer detail (which you may already know) is more complicated.

        MR does ‘2D’ scans, which are a slice, then a gap of non-imaged tissue (typically 10% of the slice thickness), then another slice. Each slice is an image with a number of pixels, say 320. Each pixel in the slice is small in plane, e.g. 0.5mm, but very thick because the slice is thick, which is required for MRI signal. The pixels are 3mm thick in the shoulder scan done here.

        ‘3D’ scans don’t have a gap between slices, and are often isotropic, meaning the same resolution in all directions. The voxel (a pixel with depth) would be something like 1mm x 1mm x 1mm.

        3D scans are slow, prone to movement artifact and never as pretty in plane as a good 2D. You can reformat them to look ok in any plane.
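
        As a rough sketch of the arithmetic above (the function names are hypothetical, the 3mm figure is assumed to be slice thickness, and real protocols vary), the two voxel shapes compare like this:

```python
# Illustrative numbers only, taken from the comment above.
# Assumption: "3mm" is the slice thickness and "0.5mm" the in-plane pixel size.

def voxel_volume_mm3(x_mm, y_mm, thickness_mm):
    """Volume of a single voxel in cubic millimetres."""
    return x_mm * y_mm * thickness_mm


def slice_spacing_mm(thickness_mm, gap_fraction):
    """Centre-to-centre distance between slices, including the unimaged gap."""
    return thickness_mm * (1 + gap_fraction)


def anisotropy(in_plane_mm, thickness_mm):
    """How many times thicker the voxel is than it is wide (1.0 = isotropic)."""
    return thickness_mm / in_plane_mm


# '2D' acquisition: 0.5mm pixels, 3mm slices, gap of 10% of the slice thickness
vol_2d = voxel_volume_mm3(0.5, 0.5, 3.0)
spacing_2d = slice_spacing_mm(3.0, 0.10)
aniso_2d = anisotropy(0.5, 3.0)

# '3D' isotropic acquisition: 1mm x 1mm x 1mm voxels, no gap
vol_3d = voxel_volume_mm3(1.0, 1.0, 1.0)
aniso_3d = anisotropy(1.0, 1.0)

print(f"2D voxel: {vol_2d:.2f} mm^3, {aniso_2d:.0f}x thicker than wide, "
      f"slices every {spacing_2d:.1f} mm")
print(f"3D voxel: {vol_3d:.2f} mm^3, {aniso_3d:.0f}x thicker than wide")
```

        With these numbers the 2D voxel is six times thicker than it is wide, which is why a 2D series looks sharp in its acquired plane but poor when resliced, while the isotropic 3D voxel reformats equally in any plane.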

      • amluto 2 hours ago

        I know little about radiology, but MRI is a 3D medium. I would not be at all surprised if one could slice an MRI the wrong way to produce a 2D image that fails to show a feature that exists in the source data.

    • yolo3000 2 hours ago

      I used it on an ankle fracture xray, it was quite useful to make sense of things. But not like a 2nd opinion.

    • behnamoh 2 hours ago

      What's wrong with Claude? I've asked it to analyze images and even Opus 4 would perfectly nail it.

      • nostrebored an hour ago

        Claude is the worst FM at image understanding. Prior to gpt-5.4 the only usable models were Gemini and Qwen.

      • throwrioawfo 2 hours ago

        Sure, it can see obvious stuff in images, but as far as I'm aware it is not designed for (or tested on) performing the kind of microscopic analysis that radiology involves.

  • jochem9 2 hours ago

    Right now the article reads as "AI can play doctor if you give MRI scans".

    If the author would actually go for a second opinion (maybe bring along the AI to let it explain its findings), then the article could read as "AI did MRI analysis and proved my doctor wrong" (or: "AI did MRI analysis and failed").

  • intoXbox 2 hours ago

    Radiologists very often have to weigh up different theories and guidelines based on the symptoms. The certainty of their diagnosis is their added value, or if they don’t know they will tell you why.

    An AI telling you it could be X or Y because theory ABC… is the academic answer and a luxury clinicians don’t have. AI doesn’t give you what you want. I don’t see any added value in using generic AI models for this

  • skybrian 2 hours ago

    Getting an actual second opinion seems like the next step?

  • VladVladikoff an hour ago

    Hey OP, my wife had a subscap tear and went through with surgery. Recovery was ROUGH; she couldn’t use that arm at all for almost two months. It’s amazing how much this can cripple a person. We don’t realize how much we use both our hands in our daily lives until one is gone, even for basic stuff like cooking, bathing, etc. If you can avoid surgery you should. Try doing the Buckburger 12 (spelling?) shoulder physiotherapy regimen. You’ll need to even if you get surgery, but this can help with tendinopathy. Also try to identify what is causing the repetitive stress and cut back on that activity.

    • busymom0 an hour ago

      I do powerlifting, and a couple of years ago I developed bicep tendinitis in my right arm. Even a tiny bit of weight on it with the palm facing up would cause crazy pain. It was funny how I went from lifting heavy weights to not even being able to carry a plate of food, press soap dispensers, or give a spot to someone at the gym.

      Even a tiny injury can severely cripple us.

  • davikr 33 minutes ago

    You can try sending basic chest radiographs to GPT and it'll fail at interpretation. I'd be wary of premature conclusions.

  • mootothemax an hour ago

    Can any LLM give you the rough pixel coordinates of an item it identifies in an image?

    I found that while Claude, GPT etc could describe an image, there was no way to link the description back to specific pixels in the image itself. Not even to a bounding box or segment.

  • terzioglubaris 44 minutes ago

    Hey, glad you did that. I did the exact same thing last week, but the radiologist's interpretation and Claude's interpretation were pretty much the same! You want my doctor's number? lol

  • cityofdelusion 16 minutes ago

    It's very interesting how people trust LLMs in domains they know little about.

    Instead, it is my experience with LLMs in a domain that I know very well that makes me skeptical of their performance across the board. I find issues in code review multiple times a day with their output, and they are explicitly and extensively trained on this use-case, unlike with the MRI data. Sometimes I veer into other domains I have decent knowledge about (construction, carpentry, landscaping) and LLMs disappoint me there as well.

    I suppose Gell-Mann amnesia is a universal human quirk and not restricted to just the news.

  • mistic92 2 hours ago

    I have used Gemini 3.1 Pro through CLI to analyze my DICOM images. It gave me the same diagnosis as the radiologists. But it was just an interesting test.

  • fabioz 2 hours ago

    I wouldn't trust anything from Claude here image-wise (maybe it's reasonable for a 2nd opinion on the report itself and the treatment), but also, in cases where there is something serious, go to at least 2 different doctors, and if they have different opinions go to a 3rd for a deciding vote, besides doing your own research (it's not that uncommon for hard cases to be badly diagnosed).

  • lutusp 10 minutes ago

    > There's something incredibly peaceful about being in the hands of an expert you trust. You don't have to worry anymore and can let them guide you through the process.

    > AI can absolutely shatter that feeling in an uncomfortable way ...

    I see this as a field report in a time of fundamental transition, from a world without AI, to one that accommodates/incorporates AI. For this to happen, AI will need to become more trustworthy. As for the U.S. medical system, it can't get much worse.

    I recently had a similar experience (meaning walking a fence between old and new methods), where I was told I could get an appointment with a human medical practitioner in nine months. So, to resolve my anxiety I consulted AI and got an instant diagnosis, one that was later confirmed by the inaccessible medics.

    Being a born skeptic I wasn't going to act on AI's diagnosis, I just wanted to know what was going on, resolve some uncertainty. Another advantage: an AI chatbot doesn't say, "Wait, you're on Medicare? Hmm. See you in nine months."

    Don't take this as an endorsement of AI's diagnostic abilities -- it's way too soon for that. In my case it was a slam dunk, about a condition I knew nothing about.

  • quacked 40 minutes ago

    The thing that annoys me about AI discourse is that AI is a mathematical technique of rapidly increasing efficacy, and yet everyone personifies it. It would help if every time someone said "AI" they supplemented "a mathematical method where extensions onto a very large corpus of information are statistically simulated".

    It's not true that "AI makes mistakes" or "ChatGPT is sycophantic". It's just that sometimes the simulated extensions to the training material are accurate, and sometimes they're not.

    • hawkice 36 minutes ago

      I think this draws too strong a line between the matrix-math core and the harness that uses it. Those harnesses undoubtedly were built with purpose and the systems fail to achieve that goal. Common usage says the DMV can make mistakes, like any system, despite the DMV itself not being a person (and it is common to allege large organizations make mistakes even when no specific individual is making an identifiable mistake). This isn't person-language; it's systems/purpose-language.

      • quacked 33 minutes ago

        I understand and somewhat agree with your point, and might have phrased my comment differently. I think my main point is that experts aren't always going to beat "a dynamically simulated extension onto the training material". Often they will, maybe even usually, but sometimes they won't, and I feel like the people in this thread insisting that the experts will always know better are thinking about a competition between experts and a crazy robot instead of a competition between experts and math.

  • neilv 2 hours ago

    This could be a starting point for consulting a different human expert for a second opinion (e.g., specific questions to ask about), but I wouldn't put much trust in Claude alone on this.

    IME, on an almost daily basis, claude.ai and Claude Code are confidently wrong about something, and use polished language to assert nonsense.[*]

    If it's doing that on something easy, like factual knowledge available in text on the Internet, or programming code that can be inspected easily and follows well-known rules, and I can tell, because I understand those things... then there's no way I'm going to assume that Claude doesn't also BS when it comes to someone else's field. Especially not a field that requires some of the smartest people to go through a decade of training, just to get started in the field.

    [*] And if I confront Claude with its mistakes, eventually it apologizes, and acts as if it's learned something, again mimicking word patterns it's heard real people use and mean, without meaning any of it. I wonder whether the AI user experience would be better, if LLM-ish interfaces weren't implicitly created in the image of fake-it-till-you-make-it overconfident performative sociopathic techbros.

  • late2part 2 hours ago

    If you have 2 clocks you have none.

    • mcapodici 39 minutes ago

      Or you have an interval?

  • simianwords 2 hours ago

    Everyone is talking about how doctors know better or have some context that is not shown here.

    But are you all forgetting that they literally injected a homeopathic drug into the author?

    Between that and Claude sometimes hallucinating, it’s probably worth encouraging patients to always get a second opinion.

  • Kapura 36 minutes ago

    I asked a bird about my father's potential prostate cancer. It gave extremely good advice.