Claude Science

(claude.com)

224 points | by lebovic 3 hours ago

82 comments

  • lebovic 2 hours ago

    I built one of the connected tools included in this launch (the Biomni HPC [1]), and I have spent an inordinate amount of my life working on this problem. (I also worked at Anthropic, but not on this product.)

    As other comments have pointed out, this is for data science – but it's capable of more than making plots and writing papers [2]. It has integrations with many databases and computational tools, including a researcher's institutional cluster.

    That alone is valuable. I founded a startup after struggling with this problem at a bio startup; integrating these tools and databases is hard and time consuming. If the only outcome of this product is that great APIs are built for LLMs, it will be a massive positive impact. Many databases used in computational genomics are still only accessible through FTP!

    LLMs are particularly good at navigating these tools and databases. It's often very specialized, but straightforward, work that benefits from in-context skills. Seeing an early glimpse of my former customers – bioinformaticians – using LLMs to solve this problem is what led me to join Anthropic in 2024.

    Also, this pattern isn't fundamentally constrained to data science: you can also integrate with a wet lab or a CRO for some kinds of science. This is what I'm spending my time on now.

    This type of science doesn't solve everything, but it's useful in some niches. For example, progress on many rare diseases is bottlenecked by researcher attention rather than a fundamental breakthrough.

    [1] https://x.com/phylo_bio/article/2029233694775624096

    [2] In comparison, OpenAI's science product – Prism – was effectively a LaTeX editor they acquired with Crixet.

    • aabhay an hour ago

      Can you speak to what makes this different from simply including or configuring various agent skills? Or is it simply the combination of lots of helpful defaults that makes this product useful?

    • SubiculumCode 38 minutes ago

      Connecting AI directly to the data sources (instead of just asking it to provide code that I run locally for myself) can get quite complicated in terms of meeting institutional policy, applicable law, data access-storage requirements (e.g. NIH data repositories), and can require legal agreements between institutions and the AI provider.

      I cannot touch it. At least not yet.

    • Melatonic an hour ago

      Sounds like the perfect use case for some kind of framework where you have a local LLM (that can run on lower spec hardware) collaborating with the main LLM to optimise latency and all the other niche and legacy use cases ?

  • gjuggler 34 minutes ago

    The most interesting thing here is that Claude Science runs a local server and a web-based UI that connects to that server from your browser. This is very different from Claude Code and Cowork, where the UI is more tightly coupled to the host machine (which makes things like computer use possible).

    I think I recognize the strategy: most pharma environments connected to interesting data are tightly locked down, to the point where you can't just connect your Macbook to the source data.

    Similarly, access to large genomic biobank datasets like UK Biobank or NIH's All of Us program is granted only through a Trusted Research Environment (TRE), a remote data analysis platform usually quite restricted on internet access, etc. You can't easily run desktop apps, but these environments do usually support running JupyterLab or VS Code, tunneling the user interface through to the end user. (Source: I previously ran the team that built the All of Us TRE.)

    Claude Science looks a lot more like something one could imagine spinning up in one of those highly-constrained data environments (with the "server" running within the TRE and the UI proxied to the end user's browser) than the does-everything Claude mega-app. That will be critical for traction within pharma R&D environments.

    I will say that for moderately-computational scientists, who are daily driving RStudio, JupyterLab, or maybe VS Code, Claude Science will be quite an unfamiliar shaped product. I'll be curious to see whether something like this gains adoption (1) in place of, (2) alongside, or (3) eventually wrapping around the more traditional data science workbench tools out there.

  • PotatoFarmsKing an hour ago

    Before LLMs the tech groups I followed were ripping with discussions about this and that topic, what to use and when; I believe these discussions sparked the creation of many frameworks and tools out of "this seems like a good idea, wouldn't hurt to implement it". Unfortunately it all revolves around LLMs nowadays and how to make some LLM work some way or another; we don't even discuss the very topics the groups were created to discuss. I fear science is soon to taste the same thing - discussions about LLMs taking place instead of the actual topics that would be discussed otherwise.

  • minimaxir 3 hours ago

    When I saw "Science" I didn't think they meant Data Science, which is what the UIs full of pandas code and plots imply. Even if the focus is on the sciences, I suspect that's the less valuable part of the announcement particularly with the implication of Jupyter Notebook 2.0.

    Image-understanding for data viz is a use case that has been ignored, and modern LLMs are getting better at proper EDA. But, uh, I may need to update my resume.

    • ritzaco 3 hours ago

      A lot of the soft and hard sciences use hacky matplotlib code to produce results and visualisation, without being necessarily data science

      From the bits I've seen, I'd take claude-generated code any time over that written by maths, physics, biology, linguistics people. Even though I've seen Claude make some super-big mistakes while doing data analysis I'd guess it's already more reliable than most academics trying to code.

      • __MatrixMan__ 24 minutes ago

        Conveniently, you can use published results as tests of equivalence, provide the ugly code as context, and regenerate it to your liking. I think the odds of such a regeneration introducing a bug that's within the usage domain but that dodges the golden tests are quite low... so long as you resist the urge to add features along the way.
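A minimal sketch of the golden-test idea described above, assuming the published figures can be written down as a small dictionary. The numbers, the `legacy_stats` and `regenerated_stats` functions, and the tolerance are all hypothetical stand-ins, not anything taken from a real paper or from Claude Science.

```python
import math

# Published values act as the "golden" reference.
# These numbers are hypothetical and exist only for illustration.
GOLDEN = {"mean": 2.5, "variance": 1.25}


def legacy_stats(xs):
    """Stand-in for the original, hard-to-read analysis code."""
    n = 0
    s = 0.0
    for x in xs:
        n += 1
        s += x
    m = s / n
    v = 0.0
    for x in xs:
        v += (x - m) ** 2
    return {"mean": m, "variance": v / n}


def regenerated_stats(xs):
    """Stand-in for the regenerated, cleaned-up rewrite."""
    mean = sum(xs) / len(xs)
    variance = sum((x - mean) ** 2 for x in xs) / len(xs)
    return {"mean": mean, "variance": variance}


def matches_golden(result, golden, rel_tol=1e-9):
    """True if result has the same keys as golden and every value agrees within rel_tol."""
    return result.keys() == golden.keys() and all(
        math.isclose(result[k], golden[k], rel_tol=rel_tol) for k in golden
    )


DATA = [1.0, 2.0, 3.0, 4.0]

if __name__ == "__main__":
    print("legacy matches golden:", matches_golden(legacy_stats(DATA), GOLDEN))
    print("rewrite matches golden:", matches_golden(regenerated_stats(DATA), GOLDEN))
```

The check only covers inputs that the published results exercise, which is why adding features during the rewrite weakens the guarantee.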

      • beardedwizard 2 hours ago

        This 100000x over. Nothing is worse than trying to productionize code coming from academics like this.

    • __MatrixMan__ 3 hours ago

      My take based on the video is that they're thinking more about bioinformatics, which might technically fall under the "data science" umbrella depending how you define your terms, but which is not described that way in common usage.

      It's the content that determines the sort of science, not the toolchain.

      • winwang 2 hours ago

        Honestly quite excited to see what can happen here, I think biology has generally had a lack of data science expertise.

    • quijoteuniv 2 hours ago

      All of these new things are starting to look like soviet space program propaganda. Is there something really new?

  • Recursing 2 hours ago

    This seems to have unblocked Claude Desktop for Linux ( https://code.claude.com/docs/en/desktop-linux )

    • loufe 2 hours ago

      unfortunately no arch based distro support. I'm curious why it's not packaged as a flatpak.

      • arendtio 2 hours ago

        Well, for Arch Linux, there was the unofficial version from the official binary in the AUR already... (Not sure what you mean by 'no arch based distro support').

        • loufe an hour ago

          First party support would be nice since this is not a high-trust period for the AUR, but fair point, I'll probably use it, thank you!

      • Recursing 2 hours ago

        Many deb packages are easily repackaged for arch by the community

  • qwerty_clicks an hour ago

    Should be called Claude-bio-big-bucks.

    What about earth science, physics, engineering? The connectors and skills are all just biology and pharma. Boo

  • Sol- 3 hours ago

    So it's like Claude Cowork for Science, i.e. for less tech-savvy users? I would imagine scientists with some coding background might just prefer to use Claude Code normally and integrate it with their stack of choice, but perhaps the comfort and ease of use of Claude Science still wins out.

  • raphman 3 hours ago

    tl;dr: Use this if you don't like doing science or doing things well. It hallucinates references.

    Seems to be based on https://github.com/swaruplab/operon as evidenced by the authorization dialog and https://x.com/testingcatalog/status/2037684573161783373 .

    Mostly targeted at life sciences - e.g. integration for FDA, PubMed, genomics databases but no ACM / IEEE as far as I can tell.

    Edit: arXiv search seems to be supported - but not Google Scholar etc. So, this tool is of little use for most researchers outside life sciences.

    Edit 2: Quick walkthrough: the AppImage starts a browser window with an onboarding wizard and a chat interface. It suggests a few things one might do at the start of a research project - e.g. do a quick literature review. When I chose that option, it wrote Python scripts that used MCP calls to do arXiv searches. It stayed seemingly stuck there for a few minutes, not returning anything. Then:

    > The free-text search returned too much noise

    Claude decided to choose a certain paper as a starting point for further research. Shortly afterwards:

    > That DOI resolved to the wrong paper. Let me find the correct anchor papers by title/author search directly.

    Then it meandered a few more minutes doing research and creating a citation graph (that it did not show to me).

    > I have a complete picture. Let me verify the key DOIs resolve and then write the review.

    Then:

    > The lint flags em-dash overuse. Let me reduce them, then save.

    Then: a nice but verbose literature overview of my chosen topic

    <blink>BUT it includes at least one hallucinated reference!</blink>

    P.S.: What does this mean?

      [reviewer] verifier_mode=default-on downgraded to off: pro subscription tier, autoReviewer withheld (frame=f2a81cb2)

    • Retr0id 2 hours ago

      > The lint flags em-dash overuse

      An explicit text desloppification pass (i.e. LLM-use obfuscation) seems like outright scientific fraud.

      • sansseriff 2 hours ago

        It sure is! But ironically, because of the intention behind the obfuscation. Not the fact that AI was used in a research paper.

        I have no issues with AI use in science. If claude can explain my research better than me, then have at it. But I do NOT want to read a passage thinking it was written by a human when it wasn't. Science has no idea yet how such disclosures should work: what should be done by humans as a matter of principle, and what can't or shouldn't be done by humans.

        • dleeftink 2 hours ago

          Some authors may even choose to leave syntactical errors as a tell for those self-authored passages; long-term, some interesting language drifts may come of it.

      • Der_Einzige an hour ago

        We send our regards: https://arxiv.org/abs/2510.15061 (ICLR 2026)

    • sampo 3 hours ago

      Biosciences mostly don't use arXiv, they have their own https://www.biorxiv.org/ but its usage is not as common as arXiv is in e.g. physics.

  • immmmmm an hour ago

    When I was doing my phd, around 2 decades ago, I was often going to the library’s compactus to fish for a Phys Rev from the 80s. Back then papers were sparse and expensive. But the quality!

    The Higgs boson is 3 papers, 6 authors and 6 pages in total!

    At the end of my phd, 30++ pages slop papers were the norm.

    Nowadays, well..

    The paper by Higgs was one page. The guy probably published less than a hundred pages in his career.

    One reason that made me abandon a career was the disgust caused by the publishing frenzy.

    And now tokens..

    • trollbridge an hour ago

      There is an obscure topic where I have read basically every single dissertation, study, etc on that topic (or even just articles that mention it). It is very noticeable how much briefer older publications were.

      It would be impossible to do that today. I guess I could have an LLM just summarise all the papers…

      • Daishiman an hour ago

        What's the reason for this? Publish-or-perish? Papers have to be more thorough? Extra junk tacked on for the sake of showing lengthier papers?

  • jszymborski an hour ago

    Any other researchers paranoid of using LLMs for fear of them using your data and front running your publications/work?

    Or incorporating it in training data and then spitting it out to a competing lab?

    • malux85 39 minutes ago

      Pay for enterprise or use one of the guaranteed no data retention models (e.g. Bedrock)

  • cowpig 27 minutes ago

    I've always found that what science is really lacking is closed, proprietary ecosystems trying to build for-profit moats around research.

    Thank our lords at Anthropic for stepping into this void

  • theplumber an hour ago

    They forgot to include an example of prompt error on “cancer” with Fable in that “nice” video.

  • stanford_labrat 3 hours ago

    impressive to me, but sadly i feel a little misleading since this is only the data-science part of life sciences.

    every few weeks though i test claude and chatgpt on their scientific reasoning and it has definitely improved over time. in my experience without specific instruction on what is known/unknown they typically are lagging behind the leading edge of the field (dev bio/pluripotency in my case). probably because scientific research articles are not open-source so they can't crawl them.

    claude has definitely outperformed chatgpt in this regard however, its scientific reasoning is impressive.

  • JoshGlazebrook 3 hours ago

    The fact that we are coming up on a month of Fable being unavailable with essentially zero actual signal from Anthropic around when it may be back is crazy to me. Yet still we have these random new products coming out?

    • striking 3 hours ago

      https://xcancel.com/AnthropicAI/status/2070665903440871779

      > Anthropic @AnthropicAI Jun 27, 2026 · 12:29 AM UTC

      > Since June 12, we’ve been working closely with the US government to restore access to Claude Mythos 5 and Fable 5. Today, the government notified us that Mythos 5, our strongest cybersecurity model, can be redeployed to a set of US organizations that operate and defend critical infrastructure.

      > We’re restoring access for these organizations quickly, and we’re continuing to work with the government to expand access to Mythos 5 and make Fable 5 available for general use again.

    • ianm218 3 hours ago

      I mean the company has like 3k employees or more right? Lots of them are just working on more applied AI use cases that don't require frontier AI just the right integrations and structure etc.

      Opus 4.8/ GPT 5.6 level models with the right workflows/ data/ access are still good enough to do huge amounts of economically valuable work.

  • khurs 3 hours ago

    Big Pharma = Big Budgets.

    So targeting them with a tailored product is understandable.

    • asdff an hour ago

      pharma is currently in a tailspin and not really spending money. they'd rather outsource everything possible to china or india right now.

  • domrdy 3 hours ago

    It has Sonnet 5 as a usable model. Interesting.

  • cmiles8 3 hours ago

    Science isn’t suffering from a lack of papers. It’s suffering from a lack of good papers. Making it easier to just pump out paper-mill publications is about the last thing science needs right now.

    • dgfl 2 hours ago

      My hope is that the flood of AI articles pushes the academic publication system to its highly-anticipated breaking point.

      The most absurd part is that everyone in academia knows that publish or perish is tremendously damaging to real research. Yet we’re all hostage of this system that we created in the name of “merit” and “efficiency”.

      We need a different system to identify and reward talented hard-working people. Back in the day it all relied on actual interpersonal interaction and subjective judgment, but there were also much fewer researchers worldwide.

      • dag100 2 hours ago

        > My hope is that the flood of AI articles pushes the academic publication system to its highly-anticipated breaking point.

        This will just make research inaccessible to most researchers. There is no incentive to limit publishing, at all, other than at the highest echelons. Publish or perish will just become worse. Look at what is happening to programming and extrapolate that to research work.

        And all for what? Just to keep up this facade of society until most of society can be excised, whether artificially or naturally through lack of reproduction.

      • breezybottom an hour ago

        Oh it's getting there. I've turned down several referee requests this year because the paper looks like AI slop. A lot of it seems to come from China.

    • godzillabrennus 3 hours ago

      Scientific research is suffering from a reproducibility crisis. Not a publication crisis. LLMs aren't going to solve reproducibility issues.

      • CJefferson 2 hours ago

        They are going to make it a thousand times worse.

        It wasn't perfect before, but it at least took some time to fake a paper. The problem is now people can produce a very plausible looking completely fake paper in minutes. Peer review is in the process of completely collapsing, in fact I think it's already basically done.

        The only way this might fix things is if we require that all papers be completely reproducible (that doesn't help in subjects like biology of course. They can still provide all the experimental data in the rawest format possible which doesn't break any laws).

      • FeteCommuniste 3 hours ago

        The two feed into each other. "Publish or perish" ups the incentive to pump out shaky papers to pad resumes. LLMs make it easier to churn them out.

      • xpct 2 hours ago

        I'm actually quite excited for when (if) the models get good enough to start replicating compsci papers. I'd love it if there was a system which calculated a reproducibility score per-lab or per-researcher, which I could look up alongside their citation count.

        I want to see who did the hard work properly, and who focused on publishing with concealed details.
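A rough sketch of how such a per-lab score might be computed, assuming replication attempts are already recorded as (lab, outcome) pairs. The function name, the record format, and the lab labels are hypothetical; no such system exists in the thread or in the product.

```python
from collections import defaultdict


def reproducibility_scores(attempts):
    """Map each lab to the fraction of its papers that were successfully reproduced.

    attempts: iterable of (lab, reproduced) pairs, where reproduced is a bool.
    Labs with no recorded attempts are simply absent from the result.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for lab, reproduced in attempts:
        totals[lab] += 1
        if reproduced:
            successes[lab] += 1
    return {lab: successes[lab] / totals[lab] for lab in totals}


if __name__ == "__main__":
    # Hypothetical replication log.
    log = [("lab_a", True), ("lab_a", False), ("lab_b", True)]
    for lab, score in sorted(reproducibility_scores(log).items()):
        print(lab, score)
```

A plain success fraction ignores how many attempts back each score, so a real system would probably also report the attempt count alongside it.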

      • virissimo 2 hours ago

        It seems to me that LLMs could massively improve reproducibility issues if journals would require that the papers be reproducible by model X using a standardized prompt in < N minutes, etc...

      • nok22kon 2 hours ago

        it's suffering from having 1 million researchers, when there aren't 1 million important easy problems to solve, yet you must publish something

      • rolph 3 hours ago

        it could also be said that scientific interpretation is suffering from a framework crisis. the scientific convention of experiment, is the test of an hypothesis, as a logical construct.

        repetition of materials and methods toward reproducibility, holds far less weight than multiple variants of process designed to test a common hypothesis resulting in agreement.[null, or failure to null]

      • messh 3 hours ago

        They're gonna worsen it

        • ianm218 3 hours ago

          Isn't this just blanket cynicism?

          In the long run, conceivably we could use AI to hold papers to a much higher standard, audit all the data and code that is associated etc.

          • xpct 2 hours ago

            > audit all the data and code that is associated

            For a while now there has been very little incentive for providing these alongside the paper, and I don't see why exactly 'AI' would change this. I could even see how making it vague to be harder to test with LLMs could be profitable for citation hackers.

            • ianm218 2 minutes ago

              You can imagine using AI agents to tag papers that don’t have code or similar work attached and just filtering them out.

              The Chinese open source community, for example, has created a lot of incentive to make research reproducible. The most reproducible works from e.g. DeepSeek get widely cited and adopted.

              I don’t think we can just say “AI” and it’s fixed but with deliberate effort there’s reason to be optimistic.
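A small sketch of the tag-and-filter step suggested above, assuming an agent has already extracted a code link (or found none) for each paper. The `Paper` record, its `code_url` field, and the example URL are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paper:
    title: str
    # Link to code or data accompanying the paper; None if nothing was found.
    code_url: Optional[str] = None


def with_artifacts(papers: List[Paper]) -> List[Paper]:
    """Keep only papers that ship a non-empty code or data link."""
    return [p for p in papers if p.code_url]


if __name__ == "__main__":
    # Hypothetical entries for illustration.
    papers = [
        Paper("Paper with code", "https://example.org/repo"),
        Paper("Paper without code"),
    ]
    for p in with_artifacts(papers):
        print(p.title)
```

The hard part in practice is the extraction step that fills in `code_url`, which this sketch takes as given.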

          • dag100 2 hours ago

            Unless reviewing becomes more profitable than publishing, anything that makes both easier will drive one up far more than the other. And it is difficult to conceive of something that would make reviewing much easier without making publishing much easier.

            • ianm218 4 minutes ago

              Just as a counterpoint ML and AI research has become much more reproducible over time. I feel like this is relevant because ML / AI researchers are huge power users of AI tools.

              Between 2016 and 2021 the share of ML/robotics/AI research that is reproducible (i.e. containing code and similar instructions to reproduce) doubled [1].

              The major US labs have gone largely closed source (i.e. they no longer publish frontier research) but the Chinese ecosystem has incredibly reproducible code.

              This is field dependent obviously but I think it at least gives reason to be optimistic.

              Yes people will churn out fake slop research, but it feels like that can be categorized and then ignored.

              [1] https://arxiv.org/pdf/2308.10008

      • mobeets 3 hours ago

        Why not both? Scientific review times are up, it’s harder to find reviewers, and many reviews are AI generated anyway. Auto-generated research publications will arguably make the replication crisis worse, because there will be more slop to clog up the review system, and these papers will presumably be just as (if not more) not reproducible than human written science.

      • cma 3 hours ago

        In some fields like comp sci, when code isn't given but the paper describes the approach, LLMs do help with the reproducibility crisis: you can ask it to reproduce the result through reimplementation by reading the paper.

        If it fails you may have to double check it did properly reimplement it, but if it succeeds you do get a reproduction.

  • jvanderbot 3 hours ago

    Thought I'd give it a whirl - crashed immediately.

    I was tickled they had a "Download for linux" button prominently shown, but nothing yet.

  • nickandbro 3 hours ago

    So I guess they released this instead of Sonnet 5?

  • fastaguy88 an hour ago

    Download for mac. Find out I need a different subscription. Cannot quit program (must force quit).

    Perhaps I need AI to use it.

  • imdsm 3 hours ago

    Weird that it runs as a local webserver rather than as an app

  • trallnag 2 hours ago

    "Pre-configured for your domain [...] cheminformatics" as in something like ChEMBL?

  • brcmthrowaway an hour ago

    DoA

  • tripleee 3 hours ago

    maxed out on coding improvements so now they're trying to expand to other markets

    • cma 3 hours ago

      Why have they talked about this for a long time? They predicted date of code maxing out, and did so not from fitting a sigmoid or something but they predicted it would max out right during a steep part of the slope?

  • cute_boi an hour ago

    whats up with all these samosa? Samosa Manuscript, Samosa Benchmarking?

  • dmezzetti an hour ago

    Why does HN let OpenAI and Anthropic basically advertise but it throws down the gauntlet at a small developer like myself when we do "self promotion"?

    Top 3 posts as of this moment are all about Claude.

  • ChrisArchitect 3 hours ago
  • game_the0ry 3 hours ago

    Disappointing that science came after cowork. Shows how their priorities are for profitability first and help humanity second.

    • uejfiweun 3 hours ago

      Now this... this is a hot take. How exactly do you expect these companies to "help humanity" if they're bleeding money?

  • CamperBob2 2 hours ago

    Claude: "Not that science"

  • bozdemir 3 hours ago

    Another overrated packaged workspace to drain more usage... No thank you.

  • Retr0id 2 hours ago

    > every step from data wrangling to *publication*

    Do they have no shame?

    Edit: seems like no https://news.ycombinator.com/item?id=48736814

  • calldacopsidgaf 3 hours ago

    this is a great application for the sycophantic, non-deterministic lying machine!

    • thrill 2 hours ago

      It's called Claude Science, not Claude Politician.

  • bigyabai 3 hours ago

    How about no?

    AI brand identity has made the unfortunate pivot to "how much do you trust us" which is going to be a real race to the bottom. I don't want LLMs managing nuclear reactors or replacing junior lab technicians. I don't trust any of these LLMs to do the bare minimum, regardless of how good it is for your brand.

    It's gross watching these stunts unfold. Next ChatGPT will fly a passenger jet, which Claude will one-up with an agentic surgery, which OpenAI will respond to by putting a humanoid robot on the moon. If this is what 21st century market competition looks like, we are all fucked.

    • torginus 3 hours ago

      Meanwhile in the real world, these Math Olympiad AIs can't even take your fast food order correctly.