Amazon to invest another $4B in Anthropic

(cnbc.com)

666 points | by swyx 3 months ago

382 comments

Curious if anyone knows the logistics of these cloud provider/AI company deals. In this case, it seems like the terms of the deal mean that Anthropic ends up spending most of the investment on AWS to pay for training.

Does Anthropic basically get at-cost pricing on AWS? If Amazon has any margin on their pricing, it seems like this $4B investment ends up costing them a lot less, and this is a nice way to turn a capex investment into AWS revenue.
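
A rough way to put numbers on that question, as a sketch only: neither the share of the investment that gets spent on AWS nor AWS's margin on that spend is stated anywhere in this thread, so `share_spent_on_cloud` and `cloud_margin` below are made-up placeholders.

```python
def net_cash_outlay(investment: float,
                    share_spent_on_cloud: float,
                    cloud_margin: float) -> float:
    """Cash the investor is really out after the money cycles back as cloud revenue.

    investment           -- cash invested (e.g. 4e9)
    share_spent_on_cloud -- fraction of that cash spent on the investor's cloud (placeholder)
    cloud_margin         -- investor's margin on that cloud revenue (placeholder)
    """
    revenue_back = investment * share_spent_on_cloud
    cost_to_serve = revenue_back * (1.0 - cloud_margin)
    # Cash goes out as investment, comes back as revenue, and some of it
    # is consumed actually providing the service.
    return investment - revenue_back + cost_to_serve


# Placeholder inputs: 80% of the cash spent on AWS at a 30% margin.
outlay = net_cash_outlay(4e9, share_spent_on_cloud=0.8, cloud_margin=0.3)
print(f"net outlay: ${outlay / 1e9:.2f}B")
```

With at-cost pricing (`cloud_margin=0`) the outlay is the full $4B. With the placeholder 30% margin it comes to about $3.04B, before counting whatever the equity is worth.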

tyre 3 months ago

Yes exactly.

This was the brilliance of the original MSFT investment into OpenAI. It was an investment in Azure scaling its AI training infra, but roundabout through a massive customer (exactly what you’d want as a design partner) and getting equity.

I’m sure Anthropic negotiated a great deal on their largest cost center, while Amazon gets a huge customer to build out their system with.

wcunning 3 months ago

That’s honestly one of the hardest things in engineering — identifying not just a customer to drive requirements, but a knowledgeable customer who can drive good requirements that work for a broader user base and can drive further expansion. Anthropic seems ideal for that, plus they act as a service/API provider on AWS.

antupis 3 months ago

Yeah, working with a knowledgeable customer is like magic.

dzonga 3 months ago

Or simply one of the best corporate buyouts that's not subject to regulatory scrutiny. Microsoft owns 49% of OpenAI - will get profits till whenever. All without being subject to regulatory approval. And they get to improve Azure.

rty32 3 months ago

A caveat -- FTC is currently looking into the deal between Microsoft and OpenAI.

salad-tycoon 3 months ago

A lot could change in the future, and what’s the worst they could do? A billion-dollar fine? Be bold, do it, and ask for forgiveness later.

fny 3 months ago

And Amazon can always build their own LLM product down the line. Building out data centers feels like a far more difficult problem.

diggan 3 months ago

> Building out data centers feels like a far more difficult problem.

Is it really? I'm thinking it might be more time-and-money-intensive than building an "LLM product" (guess you really meant models?). In terms of experience, though, we (humanity) have decades of experience building data centers and only a few years (at most) with anything LLM.

eddyzh 3 months ago

You might be interested in the Nobel prizes this year for (LLM relevant) science starting 70 years ago.

Two years ago was only when it went mainstream in a chat format.

Also, you (or the average user) have been typing with the help of a (smaller) language model for about 9 years on mobile.

bittermandel 3 months ago

I'd say building datacenters is a commodity these days. There are countless actors in this field who are thriving.

ralgozino 3 months ago

Absolutely, you can even buy a prebuilt datacenter from companies like Huawei or Schneider, get it shipped, plug in power and network, and be online.

whatshisface 3 months ago

This explanation makes no sense, I could be AWS' biggest customer if they wanted to pay me for it. Something a little closer could be that the big tech companies wanted to acquire outside LLMs, not quite realizing that spending $1B on training only puts you $1B ahead.

raverbashing 3 months ago

Yes, but Amazon is not making extra money with you being their biggest customer.

With Anthropic, yes.

whatshisface 3 months ago

Anthropic is getting $4B in investment in a year where their revenue was about $850M. Even if Amazon had bought them outright for that much, they would not be ahead. The fact that everybody keeps repeating the claim that Amazon is "making money" makes this appear like some kind of scam.

mbesto 3 months ago

This is not how it works.

First, revenue is irrelevant.

Second, the investment isn't a loan that they need to repay. They are getting equity.

Third, Anthropic is exclusively using AWS to train its models. Which, yes, means if AWS gives them $4B and it costs them $500M/year to pay for AWS services then after 8 years, the cash is a wash. However this ignores the second point.

Fourth, there is brand association for someone who wanted to run their own single tenant instance of Claude whereby you would say "well they train Claude on AWS, so that must be the best place to run it for our <insert Enterprise org>" similar to OpenAI on Azure.

Fifth, raising money is a signaling exercise to larger markets who want to know "will this company exist in 5 years?"

Sixth, AWS doesn't have its own LLM (relative to Meta, MS, etc.). The market will associate Claude with Amazon now.
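
A quick sketch of the arithmetic in the third point. The $4B and $500M/year figures are the illustrative ones used in that point, not disclosed deal terms, and `years_until_cash_returns` is just a name chosen for this example.

```python
def years_until_cash_returns(investment: float, annual_cloud_spend: float) -> float:
    """Years of cloud spend needed before the invested cash has flowed back as revenue."""
    if annual_cloud_spend <= 0:
        raise ValueError("annual_cloud_spend must be positive")
    return investment / annual_cloud_spend


print(years_until_cash_returns(4e9, 500e6))  # 8.0
```

This only tracks cash going out and coming back. It ignores the equity stake, which is the second point.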

warkdarrior 3 months ago

> Sixth, AWS doesn't have its own LLM (relative to Meta, MS, etc.). The market will associate Claude with Amazon now.

Amazon/AWS has their line of Titan LLMs: https://aws.amazon.com/bedrock/titan/

mbesto 3 months ago

Fair. I wasn't aware of that, for the same reason that if you search Titan vs Claude on HN, you'll find way more mentions of Claude:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

I think it's fair to say this is also a hedging strategy then.

whatshisface 3 months ago

The difference between things you'd say like "it's true that..." and "the market will associate..." basically is the definition of a scam.

Spooky23 3 months ago

It’s not a scam at all. Amazon doesn’t have an AI story. So they invest in Anthropic, get a lot of that money back as revenue that seeds demand.

Their customers now have an incentive to do AI in AWS. That drives more revenue for AWS.

donavanm 3 months ago

> Amazon doesn’t have an AI story.

A quibble: AWS _does_ have an AI story (which I was originally dismissive of): Bedrock as a common interface and platform to access your model of choice, plus niceties for fine-tuning/embeddings/customization etc. Unlike, say, Azure, they're not betting on _one_ implementation. They're betting that competition/results between models will trend towards parity with limited fundamental differentiation. It's a bet on enterprises wanting the _functionality_ more generally and being able to ramp up that usage via AWS spend.

WRT Titan, my view is that it's 1) production R&D to stay “in the game” and 2) a path towards commoditization and lower structural costs, which companies will need if these capabilities are going to stick/have ROI in low-cost transactions.

Spooky23 3 months ago

Sure they do, but it doesn’t have a ton of traction relative to the size of AWS.

mbesto 3 months ago

Ummm okay? A scam implies someone is getting hurt (financially, emotionally, etc.). Who's getting scammed here?

whatshisface 3 months ago

The big tech companies are spending enormous amounts for part ownership in startups whose only assets are knowledge that exists in the public domain, systems that the companies could have engineered themselves, and model weights trained with the buyer's own capital. The people who will get hurt are public investors who are having their investment used to make a few startup people really rich.

prewett 3 months ago

> whose only assets are knowledge

Knowledge is quite the useful asset, and not easily obtained. People obtain knowledge by studying for years and years, and even then, one might obtain information rather than knowledge, or have some incorrect knowledge. The AI companies have engineered a system that (by your argument) distills knowledge from artifacts (books, blogs, etc.) that contain statements, filler, opinions, facts, misleading arguments, incorrect arguments, as well as knowledge and perhaps even wisdom. Apparently this takes hundreds of millions of dollars (at least) to do for one model. But, assuming they actually have distilled out knowledge, that would be valuable.

Although, since the barrier to entry is pretty low, they should not expect sustained high profits. (The barrier is costly, but so is the barrier to entry to new airlines--a few planes cost as much as an AI model--yet new airlines start up regularly and nobody really makes much profit. Hence, I conclude that requiring a large amount of money is not necessarily a barrier to entry.)

(Also, I argue that they have not actually distilled out knowledge, they have merely created a system that is about as good at word association as the average human. This is not knowledge, although it may have its own uses.)

kelnos 3 months ago

If they could build it themselves, why haven't they? Say what you want about Amazon, but I find it hard to believe that Anthropic bamboozled them into believing they can't build their own AI when they could do it cheaper.

PittleyDunkin 3 months ago

If the scam only hurts investors, I'd say it's likely a net benefit to humanity.

throwup238 3 months ago

Last I checked, AWS reserve pricing for one year of an 8x H100 pod costs more than just buying the pod yourself (with tens of thousands left over per server for the NVIDIA enterprise license and to hire people to manage them). On demand pricing is even worse.

This is essentially money that they would have spent to build out their cloud anyway, except now they also get equity in Anthropic. Whether or not Anthropic survives, AWS gets to keep all of those expensive GPUs and sell them to other customers so their medium/long term opportunity cost is small. Even if the deal includes cheaper rates the hardware still amortizes over 2-3 years, and cloud providers are running plenty of 5+ year old GPUs so there's lots of money to be made in the long tail (as long as ML demand keeps up).

They're not making money yet because there's the $4 billion opportunity cost, but even if their equity in Anthropic drops to zero, they're probably still going to make a profit on the deal. If the equity is worth something, they'll make significantly more money than they could have renting servers. Throw financial engineering on top of that, and they may come out far ahead regardless of what happens to Anthropic: Schedule K capital equipment amortizations are treated differently from investments and AFAICT they can double dip since Anthropic is going to spend most of it on cloud (IANAL). That's likely why this seems to be a cash investment instead of in-kind credits.

I think that’s what people mean when they say Amazon is making money off the deal. It’s not an all or nothing VC investment that requires a 2-3x exit to be profitable because the money just goes back to AWS’s balance sheet.

AgentOrange1234 3 months ago

Yes and it’s also interesting that they mention using Trainium to do the training. I don’t know how much spend that is, but it seems really interesting. Like, if you’re AWS, and you imagine competing in the long run with NVIDIA for AI chips, you need to fund all that silicon development.

axpy906 3 months ago

They mentioned that in the last investment too. That seems like marketing to me as no one is doing bleeding edge research outside of the NVIDIA CUDA run ecosystem.

wepple 3 months ago

Wow, I had no idea Anthropic was doing $850m revenue.

I know they have high costs, but as a startup that’s some phenomenal income and validation that they’re not pure speculation like most startups are

Edit: founded in 2021 and with 1000 employees. That’s just wild growth.

phillipcarter 3 months ago

This is a way to keep the money printer called AWS Bedrock going and going and going. Don't underestimate the behemoth enterprises in the AWS rolodex who are all but assured to use that service for the next 5+ years at high volume.

vineyardmike 3 months ago

These sorts of investments usually also contain licensing deals.

Amazon probably gets Anthropic models they can resell “for free”. The 850M revenue is Anthropic’s, but there is incremental additional revenue to AWS’s hosted model services. AWS was already doing lots of things with Anthropic models, and this may alter the terms more in Amazon’s favor.

Are they actually making money? I don’t know, investments aren’t usually profitable on day one. Is this an opportunity for more AWS revenue in the future? Probably.

celestialcheese 3 months ago

And access to use Anthropic's models internally, where you have some guarantees and oversight that your corp and customer data aren't leaking where you don't want them to.

surgical_fire 3 months ago

It appears to be a scam because it sort of is.

AI needs to be propped up because the big tech cloud providers it depends on need AI to be a thing to justify their valuations. Tech is going through a bit of a slump where all the things being hyped a few years ago sort of died down (crypto? VR? Voice assistants? Metaverse?). Nobody gets very hyped about any of those nowadays. I am probably forgetting a couple of hyped things that fizzled out over the years.

Case in point, as much as I despise Apple, they are not all-in on the AI bandwagon because it does nothing for them.

vineyardmike 3 months ago

Go look at earnings reports for big tech companies. AI is definitely driving incremental revenue.

Apple is definitely on the AI bandwagon, they just have a different business model and they’re very disciplined. Apple tends not to increase research and investment costs faster than revenue growth. You’ll also notice rumors that they’re lowering their self driving car and VR research goals.

surgical_fire 3 months ago

> Go look at earnings reports for big tech companies. AI is definitely driving incremental revenue.

Yes. Which proves my point.

vineyardmike 3 months ago

Google Cloud revenue is up 35% thanks to AI products [1,4,5]. Azure sales are up by a similar amount (but only 12% was AI products) [2]. AWS is up too [3].

I’m so glad your point was that it’s not a scam, and there are billions of dollars in real sales occurring at a variety of companies. It’s amazing what publicly traded companies disclose if we only bother to read it. I’m glad we’re all not in the contrarian bubble where we have to hate anything with hype.

1. https://technologymagazine.com/articles/how-ai-surged-google...

2. https://siliconangle.com/2024/10/30/microsofts-ai-bet-pays-o...

3. https://www.ciodive.com/news/AWS-cloud-revenue-growth-AI-dem...

4. https://www.reuters.com/technology/google-parent-alphabet-be...

5. https://fortune.com/2024/10/29/google-q3-earnings-alphabet-s...

surgical_fire 3 months ago

> In so glad your point was that it’s not a scam

Except it sort of is. It needs AI to be hyped and propped up, so that all those silly companies spending in GCP can continue to do so for a wee bit longer.

dartos 3 months ago

I don’t know if that makes it a scam.

I think you’re putting the cart before the horse.

Big cloud providers will push anything that would make them money. That’s just what marketing is.

AI was exciting long before big cloud providers even existed. Once it was clear that a product could be made, they started marketing it and selling the compute needed.

What’s the scam?

jacobsimon 3 months ago

I think the implication of the top comment is that cloud providers are buying revenue. When we say that cloud provider revenue is "up due to AI", a large part of that growth may be their own money coming back to them through these investments. Nvidia has been doing the same thing, by loaning data centers money to buy their chips. Essentially these companies are loaning each other huge sums of money and representing the resulting income as revenue rather than loan repayments.

To be clear, this is not to say that AI itself is a scam, but that the finance departments are kind of misrepresenting the revenue on their balance sheets, and that may be securities fraud.

surgical_fire 3 months ago

Crypto was exciting too. And metaverse. And VR. And voice assistants. Et cetera and so forth.

All those things would change the world, and nothing would ever be the same, and would disrupt everything. Except they wouldn't and they didn't.

The scam is that those companies don't want to be seen as mature companies, they need to justify valuations of growth companies, forever. So something must always go into the hype pyre.

By all means, I hope the scam goes on for longer, as it indirectly benefits me too. But I don't have it in my heart to be a hypocrite. I will call a pig a pig.

dartos 3 months ago

I think AI isn’t the same as crypto or metaverse.

The LLMs and image generation models have obvious utility. They’re not AGI or anything wild like that, but they are legitimately useful, unlike crypto.

VR didn’t fail, it just wasn’t viral. Current VR platforms are still young. The internet commercially failed in 2001, but look at it now.

Crypto the industry, imo, is a big pyramid scheme. The technology has some interesting properties, but the industry is scammy for sure.

Metaverse wasn’t even an industry, it was a buzzword for MMOs during a time when everyone was locked at home. Not really interesting.

I don’t think it’s wise to lump every market boom together. Not everything is a scam.

fakedang 3 months ago

People are losing jobs because of AI. Like it or not, as imperfect as AI may be, AI is having a real-world disruptive impact, however negative it may be. Customer service teams and call centers are already being affected by AI and, if they aren't being smart about it, are being rendered obsolete.

A lot of folks here seem to look at AI through examples of YC companies apparently. Step back and look instead at the kind of projects technology consultancies are taking up - they are real-world examples of AI applications, many of which don't even involve LLMs but other aspects such as TTS/STT, image generation, transcription, video editing, etc. Way too many freelancers have begun complaining about how their pipelines have been zilch in the past two years.

surgical_fire 3 months ago

That was, perhaps, the only good retort made so far. Yes, call centers and customer service are being affected, although it is unclear to me if the cost-benefit will make sense when AI stops being heavily subsidized - I may be wrong, but my impression is that AI companies bleed money not only with training, but in running the models, and the actual price of those services, for it to make sense, will need to be substantially higher than it is right now.

MVissers 3 months ago

Price dropping is just a matter of time. Compute gets cheaper and the models get better. We’ve seen a 100x drop in price for the same capabilities in ~2 years.

Don’t forget about writers and designers losing jobs as well. If you’re not at the absolute top and don’t use AI, AI will replace you.

dartos 3 months ago

There are also a lot of macroeconomic changes making hiring contractors (or anyone, really) a less attractive option at least in the US.

herval 3 months ago

> Case in point, as much as I despise Apple, they are not all-in the AI bandwagon because it does nothing for them.

not sure if you've been paying attention, but AI is literally _the only thing_ Apple talks about these days. They literally released _an entire generation of devices_ where the only new thing is "Apple Intelligence"

staticman2 3 months ago

Is Apple investing in AI as much as Google, Meta, Microsoft, and xAI? If not they are not "all in".

adgjlsfhk1 3 months ago

They are investing differently. Apple has a much more captive audience than the others, and as such is focused on AI services that can be run on device. So they aren't doing the bleeding-edge foundation model research, but are instead putting a ton of money into productionizing smaller models that can be run without giant cloud compute.

herval 3 months ago

They don’t disclose it, but I’d imagine so. They also admit to being a couple of years late, so they’re accelerating (as per their last earnings call)

herval 3 months ago

Trivia: not sure if you’re aware, but there’s billion dollar companies in all these spaces you claim “nobody cares about”. Every single stock broker in the US trades crypto now. Omniverse earns Nvidia a ton of money, Apple earned a billion dollars with a clunky v1 and Meta is selling more and more Quests every half.

asadotzler 3 months ago

Apple has spent over $10B on AVP and made back less than 10% of that with no signs of improvement in the next year or two and continued big spending on dev and content.

Meta has spent over $50B on Quest and the Metaverse with fewer than 10M MAU to show for it.

If you think those are successes, I'll go out and get several bridges to sell you. Meet me here tomorrow with cash.

surgical_fire 3 months ago

Yeah. It was not really the world changer it was claimed to be during the hype cycle.

A billion-dollar valuation for a company in a given space is not as impressive as you think it is. Do I need to mention some high-profile companies with stellar valuations that are sort of a joke now? We can work together on this ;)

eitally 3 months ago

I am not privy to specific details, but in general there is a difference between investment and partnership. If it's literally an investment, it can either be in cash or in kind, where in kind can be like what MSFT did for OpenAI, essentially giving them unlimited-ish ($10b) Azure credits for training ... but there was quid pro quo where MSFT in turn agreed to embed/extend OpenAI in Azure services.

If it's a partnership investment, there may be both money & in-kind components, but the money won't be in the context of fractional ownership. Rather it would be partner development funds of various flavors, which are usually tied to consumption commits as well as GTM targets.

Sometimes in reading press releases or third party articles it's difficult to determine exactly what kind of relationship the ISV has with the CSP.

B4CKlash 3 months ago

There's also another angle. During the call with Lex last week, Dario seemed to imply that future models would run on amazon chips from Annapurna Labs (Amazon's 2015 fabless purchase). Amazon is all about the flywheel + picks and shovels and I, personally, see this as the endgame. Create demand for your hardware to reduce the per unit cost and speed up the dev cycle. Add the AWS interplay and it's a money printing machine.

shawndrost 3 months ago

You can find the text of the original OpenAI/MSFT deal here: https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-ema...

scosman 3 months ago

Also: they need top-tier models for their Bedrock business. They are one of only a few providers for Claude 3.5 - it’s not open and Anthropic doesn’t let many folks run it.

Google has Gemini (and Claude), MSFT has OpenAI. Amazon needs this to stay relevant.

chatmasta 3 months ago

Supermicro is currently under DOJ investigation for similar schemes to this. The legality of it probably depends on the accounting, and how revenue is recognized, etc.

It certainly looks sketchy. But I’m sure there’s a way to do it legitimately if their accountants and lawyers are careful about it…

DAGdug 3 months ago

This assumes they have no constraint when it comes to supply, and therefore no opportunity cost.

dustingetz 3 months ago

I believe they get to book any investment of cloud credits as revenue. Here’s a good thread explaining the grift: https://news.ycombinator.com/item?id=39456140 Basically you’re investing your own money in yourself, which mostly nets out, but you get to keep the equity (and then all the fool investors FOMO in on fake self-dealing valuations).

yard2010 3 months ago

...isn't it tax fraud with extra steps? Asking seriously.

aiinnyc 3 months ago

One hand washes the other.

paulddraper 3 months ago

Correct.

Same with Microsoft.

peppertree 3 months ago

Anthropic should double down on the strategy of being the better code generator. No I don't need an AI agent to call the restaurant for me. Win the developers over and the rest will follow.

rtsil 3 months ago

> Win the developers over and the rest will follow.

Will they really? Anecdotal evidence, but nobody I know in real life knows about Claude (other than it's an ordinary first name). And they all use or at least know about ChatGPT. None of them are software engineers of course. But the corporate deciders aren't software engineers either.

999900000999 3 months ago

Normal people aren't paying for LLMs.

If they ever do, Apple and Google will offer it as a service built into your phone.

For example, you could say, "OK Google, call that restaurant my girlfriend and I had our first date at 5 years ago and set up something nice so I can propose." And I guess Google Gemini (or whatever it's called at this point) will hire a band, some photographers, and maybe even a therapist just in case it doesn't work out.

All of this will be done seamlessly.

But I don't imagine any normal person will pay $20 or $30 a month for a standalone service doing this. As is, it's going to be really hard to compete against GitHub Copilot; they effectively block others from scraping GitHub.

dymk 3 months ago

But why hire a therapist when Gemini is there to talk to?

Re: Github Copilot: IME it's already behind. I finally gave Cursor a try after seeing it brought up so often, and its suggestions and refactors are leagues ahead of what Copilot can do.

hiatus 3 months ago

> But why hire a therapist when Gemini is there to talk to?

Well for one, there's no doctor patient confidentiality.

maeil 3 months ago

It is behind, but I think that's intentional. They can simply wait and see which of the competing VSCode AI forks/extensions gains the most traction and then acquire them or just imitate and improve. Very little reason to push the boundaries for them right now.

RamblingCTO 3 months ago

Because the most important part of therapy for a lot of things is the human connection, not so much the knowledge. Therapy is important; the US system is just stupid.

sumedh 3 months ago

> As is it's going to be really hard to compete against GitHub Copilot they effectively block others from scrapping GitHub.

Hire 1000 people in India to do it then?

Cumpiler69 3 months ago

AI = Actually Indian

maeil 3 months ago

> Normal people aren't paying for LLMs.

I know relatively "normal" people with no interest in software who pay for ChatGPT.

wokwokwok 3 months ago

Most. Most normal people.

Sure I know people who pay for it too; but I know a lot of people who like free things and don’t or can’t pay for subscriptions.

Do you think most people have a spare $30 to spend every month on something they already get for free?

At the moment? I don’t.

hiatus 3 months ago

The parent did not say "most normal people".

wokwokwok 3 months ago

The parent said “normal people”; that’s the majority, by definition. That’s what “most people” is.

datavirtue 3 months ago

Yeah, it's table stakes.

peppertree 3 months ago

Consumers don't have to consciously choose Claude, just like most people don't know about Linux. But if they use an Android phone or ever use any web services they are using Linux.

findjashua 3 months ago

Every single person I know who pays for an LLM is a developer who pays for Claude because of coding ability

hackernewds 3 months ago

Most people you know probably also voted for the Democratic candidate. Selection bias especially on HN is strong.

findjashua 3 months ago

wrong

EVa5I7bHFq9mnYK 3 months ago

I pay for both Claude and ChatGPT. ChatGPT codes better, especially the slow version.

hobofan 3 months ago

Every single business I know that pays for LLMs (on the order of tens of thousands of individual ChatGPT subscriptions) pays for whatever the top model is in their general cloud of choice with next to no elasticity. E.g., a company already committed to Azure will use the Azure OpenAI models and a customer already committed to AWS will use Claude.

staticman2 3 months ago

Most people I know in real life have certainly heard of ChatGPT but don't pay for it.

I think someone enthusiastic enough to pay for the subscription is more likely to be willing to try a rival service, but that's not most people.

Usually when these services are ready to grow they offer a month or more free to try, at least that's what Google has been doing with their Gemini bundle.

hiq 3 months ago

I'm actually baffled by the number of people I've met who pay for such services, when I can't tell the difference between the models available within one service, or between one service or the other (at least not consistently).

I do use them everyday, but there's no way I'd pay $20/month for something like that as long as I can easily jump from one to the other. There's no guarantee that my premium account on $X is or will remain better than a free account on $Y, so committing to anything seems pointless.

I do wonder though: several services started adding "memories" (chunks of information retained from previous interactions), making future interactions more relevant. Some users are very careful about what they feed recommendation algorithms to ensure they keep enjoying the content they get (another behavior I was surprised by), so maybe they also value this personalization enough to focus on one specific LLM service.

diego_sandoval 3 months ago

The amount of free chats you get per day is way too limiting for anyone who uses LLMs as an important tool in their day job.

20 USD a month to make me between 1.5x and 4x more productive in one of the main tasks of my job really is a bargain, considering that 20 USD is a very small fraction of my salary.

If I didn't pay, I'd be forced to wait, or create many accounts and constantly switch between them, or be constantly copy-pasting code from one service to the other.

And when it comes to coding, I've found Claude 3.5 Sonnet better than ChatGPT.

mplewis 3 months ago

[flagged]

a1j9o94 3 months ago

If you aren't using LLMs for most knowledge work you're probably wasting time somewhere.

gloflo 3 months ago

If you are using LLMs and call the result knowledge you're probably not acting ethically.

sebastiennight 3 months ago

Yesterday I needed to take an unstructured document with about 1,200 timestamps and subtract 1 second 550 ms from each of those.

I could have written code for it, but Claude output a perfectly valid HTML page I could locally paste my document in, which gave me the accurate output I needed.

This is knowledge work.

Today I had another document, about the length of a small book, where H3 and H4 titles were mistakenly provided in the wrong language. I needed those 159 titles to be changed while preserving the rest of the document, with a very specific maximum word count per title. Claude did this with a single natural language prompt. (though I had to tell it to "go on" every couple hundred lines)

This is also knowledge work. Knowledge work is not generating new knowledge, just like manual work isn't about generating new hands.
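
For comparison, a minimal sketch of what the hand-written version of that timestamp shift might look like. The timestamp format (HH:MM:SS followed by a comma or dot and three millisecond digits, as in subtitle files) is an assumption, since the comment does not say what the document looked like, and `shift_timestamps` is a hypothetical helper, not the page Claude produced.

```python
import re
from datetime import timedelta

# Assumed format: HH:MM:SS, then ',' or '.', then three millisecond digits.
TIMESTAMP = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})([.,])(\d{3})\b")
OFFSET = timedelta(seconds=1, milliseconds=550)


def shift_timestamps(text: str, offset: timedelta = OFFSET) -> str:
    """Subtract `offset` from every timestamp found in free-form text."""

    def shift(match):
        h, m, s, sep, ms = match.groups()
        t = timedelta(hours=int(h), minutes=int(m),
                      seconds=int(s), milliseconds=int(ms))
        t = max(t - offset, timedelta(0))  # clamp instead of going negative
        total_ms = t // timedelta(milliseconds=1)
        hours, rem = divmod(total_ms, 3_600_000)
        minutes, rem = divmod(rem, 60_000)
        seconds, millis = divmod(rem, 1000)
        # Keep whichever separator the original timestamp used.
        return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"

    return TIMESTAMP.sub(shift, text)


print(shift_timestamps("Intro starts at 00:00:02,000 and ends at 01:00:00.500."))
```

Everything outside the matched timestamps passes through untouched, which is the property that matters for an unstructured document.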

bambax 3 months ago

When used via OpenRouter (or the like?) the costs are ridiculously low and you have immediate access to 200+ models that you can compare seamlessly.

datavirtue 3 months ago

Chat assistants are table stakes. No individuals will be paying for these.

Search for bing, get to work.

ramraj07 3 months ago

OP and the people who reply to you are perfect examples of engineers being clueless about how the rest of the world operates. I know engineers who don’t know Claude, and I know many, many regular folk who pay for ChatGPT (basically anyone who’s smart and has money pays for it). And yet the engineers think they understand the world when in reality they just understand how they themselves work best.

ToDougie 3 months ago

I use Claude Pro paid version every day, but not for coding. I used to be a software engineer, but no longer. I tried OpenAI in the past, but I did not enjoy it. I do not like Sam Altman.

My use cases: Generating a business plan, podcast content, marketing strategies, sales scripts, financial analyses, canned responses, and project plans. I also use it for general brainstorming, legal document review, and so many other things. It really feels like a super-assistant.

Claude has been spectacular about 98% of the time. Every so often it will refuse to perform an action - most recently it was helping me research LLC and trademark registrations, combined with social media handles (and some deviations) and web URL availability. It would generate spectacular reports that would have taken me hours to research, in minutes. And then Claude decided that it couldn't do that sort of thing, until it could the next day. Very strange.

I have given Gemini (free), OpenAI (free and Paid), Copilot (free), Perplexity (free) a shot, and I keep coming back to Claude. Actually, Copilot was a pretty decent experience, but felt the guardrails too often. I do like that Microsoft gives access to Dall-E image generation at no cost (or maybe it is "free" with my O365 account?). That has been helpful in creating simple logo concepts and wireframes.

I run into AI with Atlassian on the daily, but it sucks. Their Confluence AI tool is absolute garbage and needs to be put down. I've tried AI tools that Wix, Squarespace, and Mira provide. Those were all semi-decent experiences. And I just paid for X Premium so I can give Grok a shot. My friend really likes it, but I don't love the idea of having to open an ultra-distracting app to access it.

I'm hoping some day to be like the wizards on here who connect AI to all sorts of "things" in their workflows. Maybe I need to learn how to use something like Zapier? If I have to use OpenAI with Zapier, I will.

If you read this far, thanks.

[-]

Deegy 3 months ago

I also prefer Claude after trying the same options as you.

That said I can't yet confidently speak to exactly why I prefer Claude. Sometimes I do think the responses are better than any model on ChatGPT. Other times I am very impressed with chatGPT's responses. I haven't done a lot of testing on each with identical prompt sequences.

One thing I can say with certainty is that Claude's UI blows chatGPT's out of the water. Much more pleasant to use and I really like Projects and Artifacts. It might be this alone that has me biased towards Claude. It makes me think that UI and additional functionality are going to play a much larger role in determining the ultimate winner of the LLM wars than current discussions give them credit for.

datavirtue 3 months ago

I have been flogging the hell out of copilot for equities research and to teach me about finance topics. I just bark orders and it pumps out an analysis. This is usually so much work, even if you have a service like finviz, Fidelity or another paid service.

Thirty seconds to compare 10yrs of 10ks. Good times.

teaearlgraycold 3 months ago

They’ll use whatever LLM is integrated into the back end of their apps. And the developers have the most sway over that.

croes 3 months ago

Maybe the software engineers should talk to the deciders then.

bambax 3 months ago

In my experience*, for coding, Sonnet is miles above any model by OpenAI, as well as Gemini. They're all far from perfect, but Sonnet actually "gets" what you're asking, and tries to help when it fails, while the others wander around and often produce dismal code.

* Said experience is mostly via OpenRouter, so it may not reflect the absolute latest developments of the models. But there at least, the difference is huge.

fullstackwife 3 months ago

I also don't understand the idea of voice mode, or agent-controlled computer use. Maybe it is cool to see as a tech demo, but all I really want is good quality at a reasonable price for the LLM service

[-]

lxgr 3 months ago

I think voice mode makes significantly more sense when you consider people commuting by car by themselves every day.

Personally I don't (and I'd never talk to an LLM on public transit or in the office), but almost every time I do drive somewhere, I find myself wishing for a smarter voice-controlled assistant that would allow me to achieve some goal or just look up some trivia without ever having to look at a screen (phone or otherwise).

[-]

MrsPeaches 3 months ago

This is the direction I am building my personal LLM based scripts. I don’t really know any python but Claude has written python scripts that e.g. write a document iteratively using LLMs. Next step will be to use voice and autogpt to do things that I would rather dictate to someone. E.g. find email from x => write reply => edit => send

Much more directed/almost micro managing but it’s still quicker than me clicking around (in theory).

Edit: I’m interested to explore how much better voice is as an input (vs writing as an input)

To me, reading outputs is much more effective than listening to outputs.

fullstackwife 3 months ago

this is noble reasoning: using cell phone while driving is a bad idea, high five!

but isn't voice mode reminiscent of the "faster horses"?

wenc 3 months ago

Voice mode can be useful when you're reading a (typically non-fiction) book and need to ask the LLM to check something.

It's essentially a hands-free assistant.

YZF 3 months ago

Developers, developers, developers!

More seriously: I think there are a ton of potential applications. I'm not sure that developers that use AI tools are more likely to build other AI products - maybe.

[-]

yard2010 3 months ago

Reference for the memberberries: https://youtu.be/Vhh_GeBPOhs

zamderax 3 months ago

No they should not do this. They are trying to create generalized artificial intelligence not a specific one. Let the cursor, zed, codeium or some smaller company focus on that.

[-]

rty32 3 months ago

I wonder at OpenAI, Anthropic etc, how many people actually believe in "creating generalized artificial intelligence"

[-]

socksy 3 months ago

N.B. that the ordering matters here — Generalized Artificial Intelligence is not the same thing as Artificial General Intelligence

paxys 3 months ago

Which use case do you think benefits more regular customers around the world?

[-]

hehehheh 3 months ago

Which use case generates more revenue? (Genuine question. It could be the restaurants but how to monetize)

ianmcgowan 3 months ago

I mean, look at Linux and Firefox!

[-]

gopalv 3 months ago

> look at Linux and Firefox!

AI models are more like a programming language or CPU architecture.

OpenAI is Intel and Anthropic is AMD.

peppertree 3 months ago

Pretty sure most frontend developers use Chrome since it has better dev tools. And yes everyone uses Linux most just don't know it.

ripped_britches 3 months ago

Legendary comment, bravo

cainxinth 3 months ago

They certainly need the money. The Pro service has been running in limited mode all week due to being over capacity. It defaults to “concise” mode during high capacity but Pro users can select to put it back into “Full Response.” But I can tell the quality drops even when you do that, and it fails and brings up error messages more commonly. They don’t have enough compute to go around.

[-]

jmathai 3 months ago

I’ve been using the API for a few weeks and routinely get 529 overloaded messages. I wasn’t sure if that’s always been the case but it certainly makes it unsuitable for production workloads because it will last hours at a time.

Hopefully they can add the capacity needed because it’s a lot better than GPT-4o for my intended use case.

[-]

rmbyrro 3 months ago

Sonnet is better than 4o for virtually all use cases.

The only reason I still use OpenAI's API and chatbot service is o1-preview. o1 is like magic. Everything Sonnet and 4o do poorly, o1 solves like a piece of cake. Architecting, bug fixing, planning, refactoring, o1 has never let me down on any 'hard' task.

A nice combo is to have o1 guide Sonnet. I ask o1 to come up with a solution and explanation, then simply feed its response into Sonnet to execute. That running on Aider really feels like futuristic stuff.

[-]

gcanko 3 months ago

Exactly my experience as well. Like Sonnet can help me in 90% of the cases but there are some specific edge cases where it struggles that o1 can solve in an instant. I kinda hate it because of having to pay for both of them.

[-]

andresgottlieb 3 months ago

You should check out Librechat. You can connect different models to it and, instead of paying for both subscriptions, just buy credits for each API.

[-]

cruffle_duffle 3 months ago

> just buy credits for each API

I’ve always considered doing that but do you come out ahead cost wise?

[-]

esperent 3 months ago

I've been using Claude 3.5 over API for about 4 months on $100 of credit. I use it fairly extensively, on mobile and my laptop, and I expected to run out of credit ages ago. However, I am careful to keep chats fairly short as it's long chats that eat up the credit.

So I'd say it depends. For my use case it's about even but the API provides better functionality.

joseda-hg 3 months ago

How does the cost compare?

rjh29 3 months ago

I use tabnine, it lets you switch models.

hirvi74 3 months ago

I alluded to this in another comment, but I have found 4o to be better than Sonnet in Swift, Obj-C, and AppleScript. In my experience, Claude is worse than useless with those three languages when compared to GPT. Everything else, I'd say the differences haven't been too extreme. Though, o1-preview absolutely smokes both in my experience too, but it isn't hard for me to hit its rate limit either.

[-]

versteegen 3 months ago

Interesting. I haven't compared with 4o or GPT4, but I found DeepSeek 2.5 seems to be better than Claude 3.5 Sonnet (new) at Julia. Although I've seen both Claude and DeepSeek make the exact same sequence of errors (when asked about a certain bug and then given the same reply to their identical mistakes) that shows they don't fully understand the syntax for passing keyword arguments to Julia functions... wow. It was not some kind of tricky case or relevant to the bug. Must have the same bad training data. Oops, that's a diversion. Actually they're both great in general.

[-]

hirvi74 3 months ago

I can see what you mean by LLMs making the same mistakes. I had that experience with both GPT and Claude, as well.

However, I found that GPT was better able to correct its mistakes while Claude essentially just doubles down and keeps regurgitating permutations of the same mistakes.

I can't tell you how many times I have had Claude spit out something like, "Use the Foobar.ToString() method to convert the value to a string." To which I reply, something like, "Foobar does not have a method 'ToString()'."

Then Claude will say something like, "You are right to point out that Foobar does not have a .ToString method! Try Foobar.ConvertToString()"

At that point, my frustration levels start to rapidly increase. Have you had experiences like that with Claude or DeepSeek? The main difference with GPT is that GPT tends to find me the right answer after a bit of back-and-forth (or at least point me in a better direction).

rafaelmn 3 months ago

Having used o1 and Claude through Copilot in VSC - Claude is more accurate and faster. A good example is the "fix test" feature is almost always wrong with o1, Claude is 50/50 I'd say - enough to try. Tried on Typescript/node and Python/Django codebases.

None of them are smart enough to figure out integration test failures with edge cases.

AlexAndScripts 3 months ago

Amazon Bedrock supports Claude 3.5, and you can use inference profiles to split it across multiple regions. It's also the same price.

For my use case I use a hybrid of the two, simulating standard rate limits and doing backoff on 529s. It's pretty reliable that way.

Just beware that the European AWS regions have been overloaded for about a month. I had to switch to the American ones.
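A minimal sketch of the "backoff on 529s" part, for anyone wanting to try the same thing. `send` is a hypothetical placeholder for whatever function performs one request and returns `(status_code, body)`; it is not a real SDK call, and the delays are arbitrary defaults rather than anything recommended by Anthropic or AWS.

```python
import random
import time
from typing import Callable, Tuple

OVERLOADED = 529  # the "overloaded" status mentioned in this thread


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry send() while it reports 529, waiting longer after each failure.

    Any other status is returned to the caller as-is; handling other
    errors is out of scope for this sketch.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != OVERLOADED:
            return body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Exponential backoff with full jitter, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, delay))
    raise RuntimeError(f"still overloaded after {max_attempts} attempts")
```

The jitter is there so that many clients hitting the same overloaded endpoint don't all retry at the same instant. Falling back to a second provider would go where the `RuntimeError` is raised.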

shmatt 3 months ago

in the beginning i was agitated by Concise and would move it back manually. But then I actually tried it, I asked for SQL and it gave me back SQL and 1-2 sentences at most

Regular mode gives SQL and entire paragraphs before and after it. Not even helpful paragraphs, just rambling about nothing and suggesting what my next prompt should be

Now I love concise mode, it doesn't skimp on the meat, just the fluff. Now my problem is, concise only shows up during load. Right now I can't choose it even if i wanted to

[-]

cruffle_duffle 3 months ago

Totally agree. I wish there was a similar option on ChatGPT. These things are seemingly trained to absolutely love blathering on.

And all that blathering eats into their precious context window with tons of repetition and little new information.

[-]

therein 3 months ago

Oh you are asking for a 2 line change? Here is the whole file we have been working on with a preamble and closing remarks, enjoy checking to see if I actually made the change I am referring to in my closing remarks and my condolences if our files have diverged.

[-]

cruffle_duffle 3 months ago

You know the craziest thing I’ve seen ChatGPT do is claim to have made a change to my terraform code acting all “ohh here is some changes to reflect all the things you commented on” and all it did was change the comments.

It’s very bizarre when it rewrites the exact same code a second or third time and for some reason decides to change the comments. The comments will have the same meaning but will be slightly different wording. I think this behavior is an interesting window into how large language models work. For whatever reason, despite unchanging repetition, the context window changed just enough it output a statistically similar comment at that juncture. Like all the rest of the code it wrote out was statistically pointing the exact same way but there was just enough variance in how to write the comment it went down a different path in its neural network. And then when it was done with that path it went right back down the “straight line” for the code part.

Pretty wild, these things are.

[-]

pertymcpert 3 months ago

I don't think the context window has to change for that to happen. The LLMs don't just pick the most likely next token, it's sampled from the distribution of possible tokens so on repeat runs you can get different results.
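A toy illustration of that difference, as a sketch only: the three "comment wordings" and their scores are made up, and real models sample over a vocabulary of many thousands of tokens, one token at a time.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Always pick the single most likely option."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample(logits, temperature=1.0, rng=random):
    """Draw one option at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical near-tied candidates for the same code comment.
wordings = ["# update config", "# updated config", "# apply config changes"]
scores = [2.0, 1.8, 1.5]

print("greedy :", wordings[greedy(scores)])
print("sampled:", [wordings[sample(scores)] for _ in range(5)])
```

Greedy picking gives the same wording on every run, while sampling can land on any of the near-tied wordings. Where one option dominates (as it tends to for the code itself), sampling almost always picks it too, which fits the "same code, different comments" observation.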

dimitri-vs 3 months ago

Probably an overcorrection from when people were complaining very vocally about ChatGPT being "lazy" and not providing all the code. FWIW I've seen Claude do the same thing: when asked to debug something it obviously did not know how to fix, it would just repeatedly refactor the same sections of code and make changes to comments.

[-]

cruffle_duffle 3 months ago

I feel like “all the code” and “only the changes” needs to be an actual per chat option. Sometimes you want the changes sometimes you want all the code and it is annoying because it always seems to decide it’s gonna do the opposite of what you wanted… meaning another correction and thus wasted tokens and context. And even worse it pollutes your scroll back with noise.

nmfisher 3 months ago

Agree, concise mode is much better for code. I don’t need you to restate the request or summarize what you did. Just give me the damn code.

[-]

johnisgood 3 months ago

An alternative to Concise mode would be to add that (or those) sentence(s) yourself. I personally tell it to not give me the code at all at times, and at other times I want the code only, and so forth.

You could add these sentences as project instructions, for example, too.

el_benhameen 3 months ago

Interesting. I also find it frustrating to be rate limited/have responses fail when I’m paying for the product, but I’ve actually found that the “concise” mode answers have less fluff and make for faster back and forth. I’ve once or twice looked for the concise mode selector when the load wasn’t high.

[-]

rvz 3 months ago

All that money and talk of "scale" and yet not only is it slow, it costs billions a year to run at normal load and is struggling at high load.

This is essentially Google-level load and they can't do it.

johnisgood 3 months ago

Agreed, I was surprised by it after I first subscribed to Pro and had a not-that-long chat with it.

moffkalast 3 months ago

Their shitty UI is also not doing them any infrastructure favors, during load it'll straight up write 90% of an answer, and then suddenly cancel and delete the whole thing, so you have to start over and waste time generating the entire answer again instead of just continuing for a few more sentences. It's like a DDOS attack where everyone gets preempted and immediately starts refreshing.

[-]

wis 3 months ago

Yes! It's infuriating when Claude stops generating mid response and deletes the whole thread/conversation. Not only do you lose what it has generated so far, which would've been at least somewhat useful, but you also lose the prompt you wrote, which could've taken you some effort to write.

cma 3 months ago

> But I can tell the quality drops even when you do that

Dario said in a recent interview that they never switch to a lower quality model in terms of something with different parameters during times of load. But he left room for interpretation on whether that means they could still use quantization or sparsity. And then additionally, his answer wasn't clear enough to know whether or not they use a lower depth of beam search or other cheaper sampling techniques.

He said the only time you might get a different model itself is when they are A-B testing just before a new announced release.

And I think he clarified this all applied to the webui and not just the API.

(edit: I'm rate limited on hn, here's the source in reply to the below https://www.youtube.com/watch?v=ugvHCXCOmm4&t=42m19s )

[-]

dr_dshiv 3 months ago

Rate limited on hn! Share more please

[-]

cma 3 months ago

https://news.ycombinator.com/item?id=34129956

avarun 3 months ago

Source?

nowahe 3 months ago

I've had it refuse to generate a long text response (I was trying to condense 300 KB of documentation to 20-30 KB to be able to put it in the project's context), and every time I asked it replied "How should I structure the results?", "Shall I go ahead with writing the artifacts now?", etc.

It wasn't even during the over-capacity event I don't think, and I'm a pro user.

[-]

Filligree 3 months ago

Hate to be that guy, but did you tell it up front not to ask? And, of course, in a long-running conversation it's important not to leave such questions in the context.

[-]

nowahe 3 months ago

The weird thing is that when I tried to tell it to distill it to a much smaller message it had no problem outputting it without any followup questions. But when I edited my message to ask it to generate a larger response, then I got stuck in the loop of it asking if I was really sure or telling me that `I apologize, but I noticed this request would result in a very large response.`

It strikes me as odd, because I've had quite a few times where it would generate me a response over multiple messages (since it was hitting its max message length) without any second-guessing or issue.

neya 3 months ago

I am a paying customer with credits and the API endpoints rate-limited me to the point where it's actually unusable as a coding assistant. I use a VS Code extension and it just bailed out in the middle of a migration. I had to revert everything it changed and that was not a pleasant experience, sadly.

[-]

square_usual 3 months ago

When working with AI coding tools commit early, commit often becomes essential advice. I like that aider makes every change its own commit. I can always manicure the commit history later, I'd rather not lose anything when the AI can make destructive changes to code.

[-]

webstrand 3 months ago

I can recommend https://github.com/tkellogg/dura for making auto-commits without polluting main branch history, if your tool doesn't support it natively

teaearlgraycold 3 months ago

Why not just continue the migration manually?

htrp 3 months ago

Control your own inference endpoints.

[-]

its_down_again 3 months ago

Could you explain more on how to do this? e.g. if I am using the Claude API in my service, how would you suggest I go about setting up and controlling my own inference endpoint?

[-]

handfuloflight 3 months ago

You can't. He means by using the open source models.

datavirtue 3 months ago

Run a local LLM tuned for coding on LM Studio. It has a server and provides endpoints.

datavirtue 3 months ago

You aren't running against a local LLM?

[-]

TeMPOraL 3 months ago

That's like asking if they aren't paying the neighborhood drunk with wine bottles for doing house remodeling, instead of hiring a renovation crew.

[-]

rybosome 3 months ago

That’s funny, but open weight, local models are pretty usable depending on the task.

[-]

TeMPOraL 3 months ago

You're right, but that's also subject to compute costs and time value of money. The calculus is different for companies trying to exploit language models in some way, and different for individuals like me who have to feed the family before splurging for a new GPU, or setting up servers in the cloud, when I can get better value by paying OpenAI or Claude a few dollars and use their SOTA models until those dollars run out.

FWIW, I am a strong supporter of local models, and play with them often. It's just that for practical use, the models I can run locally (RTX 4070 TI) mostly suck, and the models I could run in the cloud don't seem worth the effort (and cost).

[-]

alwayslikethis 3 months ago

For the money for a 4070ti, you could have bought a 3090, which although less efficient, can run bigger models like Qwen2.5 32b coder. Apparently it performs quite well for code

rjh29 3 months ago

I guess the cost model doesn't work because you're buying gpu that you use about 0.1% of the day

neumann 3 months ago

That's what my grandma did in the village in Hungary. But with schnapps. And the drunk was also the professional renovation crew.

rty32 3 months ago

Not everyone has a 4090 or M4 Max at home.

0xDEAFBEAD 3 months ago

More evidence that people should use wrappers like OpenRouter and litellm by default? (Makes it easy to change your choice of LLMs, if one is experiencing problems)

llm_trw 3 months ago

Neither does OAI. Their service has been struggling for more than a week now. I guess everyone is scrambling after the new qwen models dropped and matched the current state of the art with open weights.

sbuttgereit 3 months ago

Hmmm... I wonder if this is why some of the results I've gotten over the past few days have been pretty bad. It's easy to dismiss poor results as LLM quality variance from prompt to prompt vs. something like this where the quality is actively degraded without notification. I can't say this is in fact what I'm experiencing, but it was noticeable enough I'm going to check.

[-]

jmathai 3 months ago

Never occurred to me that the response changes based on load. I’ve definitely noticed it seems smarter at times. Makes evaluating results nearly impossible.

[-]

kridsdale1 3 months ago

My human responses degrade when I’m heavily loaded and low on resources, too.

[-]

TeMPOraL 3 months ago

Unrelated. Inference doesn't run in sync with the wall clock; it takes whatever it takes. The issue is more like telling a room of support workers they are free to half-ass the work if there's too many calls, so they don't reject any until even half-assing doesn't lighten the load enough.

Seattle3503 3 months ago

This is one reason closed models suck. You can't tell if the bad responses are due to something you are doing, or if the company you are paying to generate the responses is cutting corners and looking for efficiencies, eg by reducing the number of bits. It is a black box.

[-]

mirsadm 3 months ago

To be fair even if you did know it would still behave the same way.

[-]

TeMPOraL 3 months ago

Still, knowing is what makes the difference between gaslighting and merely subpar/inconsistent service.

baxtr 3 months ago

Recently I started wondering about the quality of ChatGPT. A couple of instances I was like: "hmm, I’m not impressed at all by this answer, I better google it myself!"

Maybe it’s the same effect over there as well.

[-]

dave84 3 months ago

Recently I asked 4o to ‘try again’ when it failed to respond fully, it started telling me about some song called Try Again. It seems to lose context a lot in the conversations now.

55555 3 months ago

Same experience here.

3 months ago

[deleted]

demaga 3 months ago

I think Claude is actually superior to ChatGPT and needs more recognition. So good news, I guess

[-]

internet101010 3 months ago

Yep. I start most technical prompts with 4o and Claude side-by-side in LibreChat and more often than not end up moving forward with Claude.

r0fl 3 months ago

I agree it’s better for coding but it hits limits or seems very slow, even on a paid subscription, a lot more often than ChatGPT

jatins 3 months ago

Anthropic gets a lot of its business via AWS Bedrock so it's fair to say that Amazon probably has reasonable insight into how the Claude usage is growing that makes them confident in this investment

[-]

paxys 3 months ago

They are also confident in the investment because they know that all the money is going to come right back to them in the short term (via AWS spending) whether or not Anthropic actually survives in the long term.

[-]

VirusNewbie 3 months ago

But anthropic is currently on GCP.

[-]

paxys 3 months ago

Nope they have supported AWS deployments for a long time, and now even more of the spend will be on AWS.

> Anthropic has raised an additional $4 billion from Amazon, and has agreed to train its flagship generative AI models primarily on Amazon Web Services (AWS), Amazon’s cloud computing division.

[-]

VirusNewbie 3 months ago

Yes, they are currently on GCP. What you wrote said they will train their flagship generative AI model primarily on AWS.

[-]

3 months ago

[deleted]

nuz 3 months ago

Wouldn't be hard to code it to easily swap between GCP and AWS ahead of time knowing things like this could happen

swyx 3 months ago

> gets a lot of its business via AWS Bedrock

can you quantify? any numbers, even guesstimates?

[-]

mediaman 3 months ago

One source [1] puts it at 60-75% of revenue as third-party API, most of which is AWS.

[1]https://www.tanayj.com/p/openai-and-anthropic-revenue-breakd...

apwell23 3 months ago

> Anthropic gets a lot of its business via AWS Bedrock

How do you know this

ramesh31 3 months ago

Anthropic will be the winner here, zero doubts in my mind. They have leapfrogged head and shoulders above OpenAI over the last year. Who'd have thought a business predicated entirely on keeping the ~1000 people on earth qualified to work on this stuff happy would go downhill once they failed at that.

fariszr 3 months ago

This makes sense in the grand scheme of things. Anthropic used to be in the Google camp, but DeepMind seems to have picked up speed lately, with new “Experimental” Gemini Models beating everyone, while AWS doesn't have anything on the cutting edge of AI.

Hopefully this helps Anthropic to fix their abysmal rate limits.

[-]

n2d4 3 months ago

> Anthropic used to be in the Google camp

I don't think Anthropic took any allegiances here. Amazon already invested $4B last year (Google invested $2B).

[-]

fariszr 3 months ago

AFAIK they used Gcloud to run their models.

submeta 3 months ago

I had to switch from Pro to Teams plan and pay 150 USD for 5 accounts because the Pro plan has gotten unusable. It will allow me to ask a dozen or so questions and then will block me for hours because of „high capacity.“ I don’t need five accounts, one for 40 USD would be totally fine if it would allow me to work uninterrupted for a couple of hours.

All in all Claude is magic. It feels like having ten assistants at my fingertip. And for that even 100 USD is worth paying.

[-]

modriano 3 months ago

I just start new chats whenever the chat gets long (in terms of number of tokens). It's kind of a pain to have to form a prompt that encapsulates enough context, but it has prevented me from hitting the Pro limit. Also, I include more questions and detail in each prompt.

Why does that work? Claude includes the entire chat with each new prompt you submit [0], and the limit is based on the number of tokens you've submitted. After not too many prompts, there can be 10k+ tokens in the chat (which are all submitted in each new prompt, quickly advancing towards the limit).

(I also have a chatGPT sub and I use that for many questions, especially now that it includes web search capabilities)

[0] https://support.anthropic.com/en/articles/8324991-about-clau...
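To put rough numbers on why that works, here is a back-of-the-envelope sketch. It simplifies by treating each turn as adding a fixed number of tokens to the chat and by counting the whole chat as submitted on every turn; the 500-token turn size is invented for illustration, not taken from Anthropic's docs.

```python
def cumulative_tokens_submitted(turn_sizes):
    """Total tokens sent over a chat when each prompt resends all earlier turns."""
    history = 0
    total = 0
    for size in turn_sizes:
        history += size   # the chat grows by this turn
        total += history  # and the whole chat is submitted again
    return total


# One 20-turn chat vs. the same 20 turns split across four fresh chats.
one_long_chat = cumulative_tokens_submitted([500] * 20)
four_short_chats = 4 * cumulative_tokens_submitted([500] * 5)
print(one_long_chat, four_short_chats)
```

Under these assumptions the single long chat submits 105,000 tokens and the four short chats 30,000 in total, because the cost of a chat grows roughly with the square of its length. The saving shrinks if every fresh chat needs a long context-setting prompt.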

[-]

greenie_beans 3 months ago

> It's kind of a pain to have to form a prompt that encapsulates enough context, but it has prevented me from hitting the Pro limit. Also, I include more questions and detail in each prompt.

i get it to provide a prompt to start the new chat. i sometimes wish there was a button for it bc it's such a big part of my workflow

[-]

greenie_beans 3 months ago

also, do any data engineers know how context works on the backend? seems like you could get an llm to summarize a long context and that would shorten it? also seems like i don't know what i'm talking about.

could the manual ux that i've come up with happen behind the scenes?

esperent 3 months ago

Why don't you use the API with LibreChat instead?

[-]

submeta 3 months ago

Can I replicate the „Projects“ feature where I upload files and text to give context? And will it allow me to follow up on previous chats?

yoyohello13 3 months ago

Claude is absolutely incredible. And I don’t trust openAI or Microsoft so it’s nice to have an alternative.

[-]

cauthon 3 months ago

Amazon famously more trustworthy

[-]

stingraycharles 3 months ago

Google also invested $2B into Anthropic. Seems like both Google and Amazon are providing credits for their cloud, also as a hedge against Microsoft / OpenAI becoming too big.

yoyohello13 3 months ago

If I have to choose between Amazon and Microsoft I’ll choose the lesser evil. Microsoft owns the entire stack from OS to server to language to source control. Anything to weaken their hold is a win in my book.

[-]

esperent 3 months ago

> choose between Amazon and Microsoft... the lesser evil

A hard question. If you're focusing purely on tech, probably Microsoft. But overall evil in the world? With their union busting and abuse of workers, Amazon, I'd say.

pkillarjun 3 months ago

I will start using Claude the day they stop asking me for my mobile number.

[-]

sourcecodeplz 3 months ago

I can cross the street and get a new FREE SIM with a number from the shop. all i have to do is put some money on it to activate it, like $1 ...

[-]

3 months ago

[deleted]

aliasxneo 3 months ago

> Amazon Web Services will also become Anthropic’s “primary cloud and training partner,” according to a blog post. From now on, Anthropic will use AWS Trainium and Inferentia chips to train and deploy its largest AI models.

I suspect that's worth more than $4B in the long term? I'm not familiar with the costs, though.

[-]

devjab 3 months ago

I’ve been impressed with the AI assisted tooling for the various monitoring systems in Azure at least. Of course this is mainly because those tools are so ridiculously hard to use that I basically can’t for a lot of things. The AI does it impressively well though.

I’d assume there is a big benefit to having AI assisted resource generation for cloud vendors. Our developers often have to mess around with things that we really, really, shouldn’t in Azure because operations lacks the resources and knowledge. Technically we’ve outsourced it, but most requests take 3 months and get done wrong… if an AI could generate our network settings from a global policy, that would be excellent. Hell, if it could handle all our resource generation, there would be so much less time wasted, because our organisation views “IT” as HR's uncharming cost center cousin.

senderista 3 months ago

Inferentia...Bollocks

Sorry.

Simon_ORourke 3 months ago

Will Amazon leadership require its new Gen AI to physically move itself to an office to perform valid work?

[-]

GuB-42 3 months ago

"Amazon Web Services will also become Anthropic’s “primary cloud and training partner,” according to a blog post."

So yes, if we consider Amazon datacenters to be the equivalent of an office for an AI.

mrcwinn 3 months ago

The status pages of OpenAI and Anthropic are in stark contrast and that mirrors my experience. Love Anthropic for code and its Projects feature, but OpenAI is still way ahead on voice and reliability.

[-]

sebastiennight 3 months ago

Open AI status page: https://status.openai.com/

Anthropic status page: https://status.anthropic.com/

(Yes, there is a rather stark difference in number of recent incidents.)

andai 3 months ago

I've been playing with Alibaba's Qwen 2.5 model and I've had it claim to be Claude. (Though it usually claims to be Llama, and it seems to think it's a literal llama, i.e. it identifies as an animal, "among other things".)

[-]

sunaookami 3 months ago

Claude also sometimes claims/claimed that it is ChatGPT or a model by OpenAI. Same with LLaMa. It's just polluted training data.

bg24 3 months ago

AWS is achieving 2 objectives:

1/ Best-in-class LLM in Bedrock. This could be done w/o the partnership as well.

2/ Evolving Trainium and Inferentia as worthy competitors for large scale training and inference. They have thousands of large-scale customers, and as the adoption grows, the investment will pay for itself.

gavi 3 months ago

I love Claude 3.5 sonnet and their UI is top notch especially for coding, recently though they have been facing capacity issues especially during weekdays correlating with working hours. Have tried Qwen2.5 coder 32B and it's very good and close to Claude 3.5 in my coding cases.

[-]

anovick 3 months ago

There's one problem with Claude's chat box where ``` opens an intrusive code block box that's hard to close/skip.

But I also agree that Claude 3.5 Sonnet is giving very good results. Not only for coding, and also for languages other than English.

[-]

johnisgood 3 months ago

This is what annoys me a lot, too. I mean the fact that I cannot have paste retain the formatting (```, `, etc.). Same with the UI retaining my prompt, but not the formatting, so if you do some formatting and reload, you will lose that formatting.

KTibow 3 months ago

You can exit with the down arrow

[-]

anovick 3 months ago

Thanks!

swyx 3 months ago

related coverage

- https://www.anthropic.com/news/anthropic-amazon-trainium

- https://www.aboutamazon.com/news/aws/amazon-invests-addition...

- https://techcrunch.com/2024/11/22/anthropic-raises-an-additi...

[-]

OceanBreeze77 3 months ago

What's the difference between trainium and the AWS bedrock offering?

[-]

newfocogi 3 months ago

AWS Trainium is a machine learning chip designed by AWS to accelerate training deep learning models. AWS Bedrock is a fully managed service that allows developers to build and scale generative AI applications using foundation models from various providers.

Trainium == Silicon (looks like Anthropic has agreed to use it)

Bedrock == AWS Service for LLMs behind APIs (you can use Anthropic models through AWS here)
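To make the Bedrock half of that distinction concrete, here is a minimal sketch of reaching an Anthropic model through the Bedrock runtime Converse API with boto3. The model ID, region, and token limit are illustrative assumptions, not details from this thread; the IDs actually enabled depend on your account. Trainium never appears in code like this, since it is the silicon underneath.

```python
def build_converse_request(prompt, model_id="anthropic.claude-3-5-sonnet-20240620-v1:0"):
    """Build keyword arguments for the Bedrock runtime Converse API.

    The default model_id is an assumption chosen for illustration.
    """
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 256},
    }


def ask_claude_on_bedrock(prompt, region="eu-central-1"):
    """Send the prompt to Bedrock; needs boto3, AWS credentials, and model access."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(**build_converse_request(prompt))
    # The Converse API returns the reply as a list of content blocks.
    return response["output"]["message"]["content"][0]["text"]
```

Calling `ask_claude_on_bedrock("Summarize this thread")` would go through normal AWS auth and billing, which is the onboarding convenience people mention for teams already on AWS.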

ipaddr 3 months ago

I'm not sure how they make it back. The guardrails in place are extremely strict. The only people who seem to use it are a subset of developers who are unhappy with OpenAI. Bard is popping up free everywhere, taking away much of the general user crowd, and OpenAI offers the mini model always free plus limited image generation and limited use of the expensive model. Then you have the do-it-yourself crowd with llama. What is their target market? Governments? Amazon companies? Their free tier offers 10 queries, and half of them need to be used to get around filters. I don't see this positioned well for general customers.

[-]

staticman2 3 months ago

The Guardrails on Claude Sonnet 3.5 API are not stricter than Openai's guardrails in my experience. More specifically, if you access the models via API or third party services like Poe or Perplexity the guardrails are not stricter than GPT4o. I've never subscribed to Claude.ai so can't comment on that.

I have no experience with Claude.ai vs ChatGPT but it's clear the underlying model has no issue with guardrails and this is simply an easily tweaked developer setting if you are correct that they are stricter on Claude.ai.

(The old Claude 2.1 was hilariously unwilling to follow reasonable user instructions due to "ethics" but they've come a long way since then.)

[-]

dragonwriter 3 months ago

> The Guardrails on Claude Sonnet 3.5 API are not stricter than Openai’s guardrails in my experience.

Both Gemini and Claude (via the API) have substantially tighter guardrails around recitation (producing output matching data from their training set) than OpenAI, which I ran into when testing an image text-extraction-and-document-formatting toolchain against all three.

Both Claude and Gemini gave refusals on text extraction from image documents (not available publicly anywhere I can find as text) from a CIA FOIA release

Not sure if they are tighter in other areas.

[-]

staticman2 3 months ago

I just asked GPT4o to recognize a cartoon character (I accessed it via Perplexity) and it told me it isn't able to do that, while Claude Sonnet happily identified the character, so this might vary by use case or even by prompt.

rwalle 3 months ago

Have you had luck with Google's AI Studio with regard to text extraction?

msp26 3 months ago

I've had a situation where Claude (Sonnet 3.5) refused to translate song lyrics because of safety/copyright bullshit. It worked in a new chat where I mentioned that it was a pre 1900s poem.

[-]

staticman2 3 months ago

I've translated hundreds of pages of novel text via Sonnet 3.5. But I did it where I have system prompt access and tell it to act as a translator.

ipaddr 3 months ago

My comment was purely about Claude.ai which is where general customers would go.

[-]

staticman2 3 months ago

I don't know if Claude.ai or ChatGPT are even profitable at this stage, so they might not particularly want general customers.

loandbehold 3 months ago

Claude is the best model for programming. New generation of code tools like Cursor all use Claude as the main model.

[-]

petesergeant 3 months ago

> Claude is the best model for programming

This week.

[-]

square_usual 3 months ago

It has held this position since at least June. The Aider LLM leaderboards [1] have the Sonnet 3.5 June version beating 4o handily. Only o1-preview beat it narrowly, but IIRC at much higher costs. Sonnet 3.5 October has taken the lead again by a wide margin.

1: https://aider.chat/docs/leaderboards/

iLemming 3 months ago

Anecdotally, Claude seems to hallucinate more during certain hours. It's amusing to watch, almost like your dog that gets too bored and stops responding to your commands - you say "sit" and he looks at you, tilts his head, looks straight up at you, almost like saying "I know what you're saying..." but then decides to run to another room and bring his toy.

And you'd be wondering: "darn, where's that toughest, most obedient and smart Belgian malinois that just a few hours ago was ready to take down a Bin Laden?"

[-]

petesergeant 3 months ago

Talking of anecdotal, 4o with canvas, which is normally excellent, tends to give up around a certain context length, and you have to copy and paste what you have into a new window to get it to make edits

GaggiX 3 months ago

It has been for the last several months now.

maeil 3 months ago

This week, along with the 20 weeks before that :) Model improvement has slowed down so much that things aren't changing quickly anymore. And Anthropic has only widened the gap with 3.5-v2.

reubenmorais 3 months ago

With Claude on Bedrock I can use LLMs in production without sending customer data to the US. And if you're already on AWS it's super easy to onboard wrt. auth and billing and compliance.

[-]

maeil 3 months ago

If you're using Bedrock you're still subject to the CLOUD act/FISA meaning the whole angle of "not sending customer data to the US" isn't worth very much.

[-]

reubenmorais 3 months ago

It's worth enough to customers to make a best effort.

JamesBarney 3 months ago

Claude api use is already as high as openai. I believe that market will grow far more over time than chat as AI gets embedded in more of the applications we already use.

atsaloli 3 months ago

I am in Operations. I use it (and pay for it) because the free version seemed to work best for me compared to Perplexity (which had been my go-to) and ChatGPT/OpenAI.

hamburga 3 months ago

Government alone could be huge, with this recent nonsense about the military funding a “Manhattan project for AI” and the recently announced Pentagon contracts.

Deegy 3 months ago

I mean, they might make back the $4b on the value it brings to programming alone.

liquidise 3 months ago

Can someone with familiarity in rounds close to this size speak to their terms?

For instance: i imagine a significant part of this will be “paid” as AWS credits and is not going to be reflected as a balance in a bank account transfer.

[-]

uptownfunk 3 months ago

Yes, that is the case. It is largely 4B in capex investment, I’d imagine 10% or less is cash. One would think nvidia could get much better terms investing its gpu (assuming they can get it into a working cluster). Instead it’s nvidia gets cash for gpu hardware, that hardware gets put into a data center and AWS invests their hardware as credits for equity instead of cash. And because AWS has already built out their data center infra they can get a better deal than nvidia making the play because nvidia has to rebuild an entire data center infra from scratch (in addition to designing gpu etc).

Now if AWS or gcp can crack gpu compute better than nvidia for training and hosting, then they can basically cut out nvidia and so essentially they get gpu at cost (vs whatever markup they pay to nvidia).

Because essentially whatever return AWS will make from Anthropic will be modulated by the premiums paid to nvidia to invest and also the cost of operating a data center for Anthropic.

But thankfully all of that gets mediated on paper because valuation is more speculative than the returns on nvidia hardware (which will be known to the cent by AWS given it's some math of hourly rate and utilization which they have a good idea of)

crowcroft 3 months ago

So we have

Microsoft -> OpenAI (& Inflection AI)

Google -> Gemini (and a bit of Anthropic)

Amazon -> Anthropic

Meta -> Llama

Is big tech good for the startup ecosystem, or are they monopolies eating everything (or both?). To be fair to Google and Meta they came up with a lot of the stuff in the first place, and aren't just buying the competition.

[-]

sangnoir 3 months ago

There wouldn't be an LLM startup ecosystem without big tech.

Notable contributions: Nvidia for, well, (gestures at everything), Google for discovering (inventing?) transformers, being early advocates of ML, authoring tensorflow, Meta for Torch and open sourcing Llama, Microsoft for investing billions in OpenAI early on and keeping the hype alive. The last one is a reach, I'm sure Microsoft Research did some cool things I'm unaware of.

[-]

crowcroft 3 months ago

You might be right, we don’t know how an alternative reality would have played out though to say if this is the only (and fastest) way we could have got here.

gabes 3 months ago

Meta doesn’t buy competition?

[-]

airstrike 3 months ago

Facebook / Instagram?

oefnak 3 months ago

steveBK123 3 months ago

Some of these investments sound big in absolute terms.. However not that big considering the scale of the investor AND that many of these investors are also vendors.

MSFT/AMZN/NVDA investing in AI firms that then use their clouds/chips/whatever is an interesting circular investment.

[-]

dgfitz 3 months ago

Four thousand million dollars. That’s a lot of money.

[-]

steveBK123 3 months ago

That's 2 days of Amazon revenue, invested in a company to then send the investment back as more revenue to Amazon in the form of AWS usage.
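A quick back-of-the-envelope check of that figure, as a sketch only. The annual revenue below is an assumption (roughly Amazon's reported 2023 net sales of about $575B), not a number from this thread; swap in whatever figure you trust.

```python
def days_of_revenue(investment_usd, annual_revenue_usd):
    """How many days of average revenue an investment represents."""
    daily_revenue = annual_revenue_usd / 365
    return investment_usd / daily_revenue


# Assumption: roughly $575B in annual net sales (not stated in the thread).
ASSUMED_ANNUAL_REVENUE = 575e9

days = days_of_revenue(4e9, ASSUMED_ANNUAL_REVENUE)
print(f"{days:.1f} days")  # prints "2.5 days"
```

Under that assumption the investment is about two and a half days of revenue, so "2 days" is the right order of magnitude.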

pknerd 3 months ago

Rival? They kick you out after a few messages and ask you to come back later. Gpt doesn't do that

[-]

maleldil 3 months ago

Are you a paying customer? I exclusively use their best model and while I get warnings (stuff about longer chats leading to more limit usage), I've never been kicked out.

The only thing is that they've recently started defaulting to Concise to cut costs, which is fine with me.

[-]

pknerd 3 months ago

I used GPT as a free customer and now a paid one. I was never asked to leave after a few messages (20ish) by GPT.

cruffle_duffle 3 months ago

Concise mode is honestly better anyway. I’d prefer it always be in that mode.

But that being said I bump into hard limits far more often than I do with ChatGPT. Even if I keep chats short like it constantly suggests, eventually it cuts me off.

[-]

Sebguer 3 months ago

It's a selectable style at any time in Claude.ai, FYI!

Etheryte 3 months ago

Anecdotal experience, but as far as I've played around with them, Claude's models have given me a better impression. I would much rather have great responses with lower availability than mediocre responses available all the time.

Detrytus 3 months ago

I guess this is exactly the problem that this investment would solve.

joshdavham 3 months ago

I often hear people praise Claude as being better than chatGPT, but I’ve given both a shot and much prefer chatGPT.

Is there something I’m missing here? I use chatGPT for a variety of things but mainly coding and I feel subjectively that chatGPT is still better for the job.

[-]

owenpalmer 3 months ago

What languages do you use?

[-]

joshdavham 3 months ago

Mostly python and js, so the most popular ones. I should mention that I do use obscure modules and packages in each language where chatGPT starts to suck a little. I imagine this might be similar to how chat might work with Rust or Zig, etc.

Why do you ask?

lasermike026 3 months ago

Does anyone know how they are going to make money and turn a profit one day?

[-]

km144 3 months ago

Same as the big tech companies, probably make all of their products worse in service to advertising. AI-generated advertising prompted by personal data could be extremely good at getting people to buy things if tuned appropriately.

[-]

lucianbr 3 months ago

Well. If you're using AI instead of a search engine, they could make the AI respond with product placement more or less subtle.

But if you're using AI for example to generate code as an aid in programming, how's that going to work? Or any other generative thing, like making images, 3d models, music, articles or documents... I can't imagine inserting ads into those would not destroy the usefulness instantly.

My guess is they don't know themselves. The plan is to get market share now, and figure it out later. Which may or may not turn out well.

uptownfunk 3 months ago

Cost of inference will tend to the same as cost of a Google search. It is infra that will come down to negligible and almost free. Then as others have said it will tend to freemium (pay to have no ads). And additional value added services as they continue to evolve up the food chain (ai powered sales, marketing, etc)

thornewolf 3 months ago

LLM inference is getting cheaper year over year. It often loses money now, it may eventually stop losing money when it gets cheap enough to run.

- But surely the race to the bottom will continue?

Maybe, but they do offer a consumer subscription that can diverge from actual serving costs.

/speculation

[-]

lasermike026 3 months ago

I'm working with models and the costs are ridiculous. A $7000 card and 800 watts later for my small projects, and I can't imagine how they can make money in the next 5 to 10 years. I need to do more research on hardware approaches that reduce costs and power consumption. I just started experimenting with llama.cpp and I'm mildly impressed.

Palmik 3 months ago

Looking at API providers like Together that host open source models like Llama 70b and running these models in production myself, they have healthy margins (and their inference stack is much better optimized).

sigmar 3 months ago

relatedly: is claude3.5-haiku being delivered above their cost, after they quadrupled the price? Though it wouldn't ensure profitability since they're spending so much on training. I'm sure with inference-use growing, they're hoping that eventually total_expenses(inference) grows to be much much larger than total_expenses(training)

staticman2 3 months ago

They'll invent AGI, put 50% of workers out of a job, then presumably have the AGI build some really good robots to protect them from the ensuing riots.

</sarcasm>

danny_codes 3 months ago

That's the neat part

yieldcrv 3 months ago

its a better experience, prints out token responses faster, and doesn't randomly 'disconnect' or whatever ChatGPT does

I hope they're also cooking up some cool features and can handle capacity

ddxv 3 months ago

I must be missing it. How is anthropic worth so much when open source is closing in so fast? What value will anthropic have if competitors can be mostly free?

neets 3 months ago

How much of that is converted back into AWS credits?

[-]

Mistletoe 3 months ago

I'm imagining Gary Oldman in The Professional screaming "EVERY ONE".

KaoruAoiShiho 3 months ago

I know that if Nvidia did this lots of people on twitter would be screaming about fraud and self-dealing.

seydor 3 months ago

Gotta protect those H100s from rusting

danvoell 3 months ago

Great move. The value to easily deploying content, code, anything digital to AWS is immense.

cryptozeus 3 months ago

They certainly need the money, not sure how many users will pay for monthly subscription.

devoutsalsa 3 months ago

If they are over capacity, does that mean they have significant revenue?

ulfw 3 months ago

How many employees will lose their livelihood to pay for this again?

1317 3 months ago

I tried out Claude once, I found it was alright but not much better than ChatGPT for what I was doing at that point

Then I thought I'd try it again recently, I went onto the site and apparently I'm banned. I don't even remember what I did...

jessriedel 3 months ago

Is there an implied valuation? Or not enough details released?

gardenhedge 3 months ago

Claude pro is a joke. Limited messaging and token lengths

dangoodmanUT 3 months ago

> Amazon does not have a seat on Anthropic’s board.

Insane

blibble 3 months ago

what does this say about their internal teams working on the same thing?

bfrog 3 months ago

McAfee like investing

desktopninja 3 months ago

Mmm. Amazon lays off thousands of workers but drops 4Bil$ into another company. Mmm.

[-]

bdangubic 3 months ago

cut the fat, use the money to get the steak :)

UltraSane 3 months ago

Claude Sonnet 3.5 is simply amazing. No matter how much I used it I continue to be amazed at what it can produce.

I recently asked it what the flow of data is when two vNICs on the same host send data to each other and it produced a very detailed answer complete with a really nice diagram. I then asked what language the diagram uses and it said Mermaid. So I then asked it to produce some example L1,2,3 diagrams for computer networks and it did just that. So I then asked it to produce Python code using PyATS to run show commands on Cisco switches and routers and use the data to produce Mermaid network diagrams for layers 1,2, and 3 and it just spit out working Python code. This is a relatively obscure task with a specific library no one outside of Networking knows about integrating with a diagram generator. And it fully understands the difference in network layers. Just astonishing. And now it can write and run Javascript apps. The only feature I really want is for it to be able to run generated Python code to see if it has any errors and automatically fix them.
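For readers who have not seen Mermaid, here is a minimal sketch of only the last step described above: turning a list of device-to-device links into Mermaid flowchart text. The device names and the `links` list are hypothetical stand-ins for whatever would be parsed from show commands; the pyATS collection step is deliberately left out, and this is not the code Claude produced.

```python
def to_mermaid(links):
    """Render undirected (device_a, device_b) links as a Mermaid flowchart."""
    lines = ["graph LR"]
    for a, b in links:
        # "---" draws a plain line with no arrowhead between two nodes.
        lines.append(f"    {a} --- {b}")
    return "\n".join(lines)


# Hypothetical topology, standing in for parsed neighbor data.
links = [("core1", "dist1"), ("core1", "dist2"), ("dist1", "access1")]
diagram = to_mermaid(links)
print(diagram)
```

Pasting the printed text into any Mermaid renderer should draw the topology; separate link lists per layer would give separate L1, L2, and L3 diagrams.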

If progress on LLMs doesn't stall they will be truly amazing in just 10 years. And probably consuming 5% of global electricity.

[-]

k1musab1 3 months ago

VS Code has a plugin Cline, using your api key it will run Claude sonnet, can edit and create files in the workspace, and run commands in the terminal to check functionality, read errors, and correct them.

[-]

DirkH 3 months ago

This makes it sound to me like Cursor is a waste of money?

uneekname 3 months ago

As someone who doesn't really follow the LLM space closely, I have been consistently turning to Anthropic when I want to use an LLM (usually to work through coding problems)

Beside Sonnet impressing me, I like Anthropic because there's less of an "icky" factor compared to OpenAI or even Google. I don't know how much better Anthropic actually is, but I don't think I'm the only one who chooses based on my perception of the company's values and social responsibility.

[-]

noirbot 3 months ago

Yea, even if they're practically as bad, there's value in not having someone like Altman who's out there saying things about how many jobs he's excited to make obsolete and how much of the creative work of the world is worthless.

[-]

apwell23 3 months ago

or that 'AI is going to solve all of physics' or that 'AI is going to clone his brain by 2027' .

PG famously called him 'Michael jordan of listening' , i would say he is 'Michael jordan of bullshitting'

[-]

thisiscrazy2k 3 months ago

Unfortunately, that position is already held by Musk.

[-]

apwell23 3 months ago

not when PG called him that. Even now i would say there is tough competition.

MichaelZuo 3 months ago

I don’t think he ever said or even implied any percentage of ‘creative work of the world is worthless’?

A lot less valuable than what artists may have desired or aspired to at the time of creation, sure, but definitely with some value.

[-]

noirbot 3 months ago

I mean, he's certainly acting as if he's entitled to train on all of it for free as long as it's not made by a big enough company that may be able to stop/sue him. And then feels entitled to complain about artists tainting the training data with tools.

He has a very "wealth makes right" approach to the value of creative work.

staticman2 3 months ago

Doesn't he basically troll people on Twitter constantly?

valbaca 3 months ago

> or even Google

> Last year, Google committed to invest $2 billion in Anthropic, after previously confirming it had taken a 10% stake in the startup alongside a large cloud contract between the two companies.

[-]

uneekname 3 months ago

Well, there you go. These companies are always closer than they seem at first glance, and my preference for Anthropic may just be patting myself on the back.

[-]

rafark 3 months ago

But why though? Claude is REALLY good at programming. I love it

Der_Einzige 3 months ago

Funny, I use Mistral because it has "more" of that same factor, even in the name!

They're the only company who doesn't lobotomize/censor their model in the RLHF/DPO/related phase. It's telling that they, along with huggingface, are from le france - a place with a notably less puritanical culture.

[-]

falseAss 3 months ago

do you notice less censorship yourself from their instruction tuned model, or is there some public reference to showcase it? (i haven't used a mistral model before). It's interesting if a major llm player adopts a different safety / alignment goal.

mossTechnician 3 months ago

Personally, I find companies with names like "Anthropic" to be inherently icky too. Anthropic means "human," and if a company must remind me it is made of/by/for humans, it always feels less so. E.g.

The Browser Company of New York is a group of friendly humans...

Second, generative AI is machine generated; if there's any "making" of the training content, Anthropic didn't do it. Kind of like how OpenAI isn't open, the name doesn't match the product.

[-]

FooBarBizBazz 3 months ago

I actually agree with your principle, but don't think it applies to Anthropic, because I interpret the name to mean that they are making machines that are "human-like".

More cynically, I would say that AI is about making software that we can anthropomorphize.

derefr 3 months ago

> Anthropic means "human," and if a company must remind me it is made of/by/for humans

Why do you think that that's their intended reading? I had assumed the name was implying "we're going to be an AGI company eventually; we want to make AI that acts like a human."

> if there's any "making" of the training content, Anthropic didn't do it

This is incorrect. First-gen LLM base models were made largely of raw Internet text corpus, but since then all the improvements have been from:

• careful training data curation, using data-science tools (or LLMs!) to scan the training-data corpus for various kinds of noise or bias, and prune it out — this is "making" in the sense of "making a cut of a movie";

• synthesis of training data using existing LLMs, with careful prompting, and non-ML pre/post-processing steps — this is "making" in the sense of "making a song on a synthesizer";

• Reinforcement Learning from Human Feedback (RLHF) — this is "making" in the sense of "noticing when the model is being dumb in practice" [from explicit feedback UX, async sentiment analysis of user responses in chat conversations, etc] and then converting those into weights on existing training data + additional synthesized "don't do this" training data.

[-]

ctoth 3 months ago

I read Anthropic as alluding to the Anthropic Principle as well as the doomsday argument and related memeplex[0] mixed with human-centric or about humans. Lovely naming IMHO.

[0]: https://www.scottaaronson.com/democritus/lec17.html

mossTechnician 3 months ago

We both assumed, so I didn't expect to need to back up my thoughts, but their own website ticks the "for humans" trope checkbox: Their "purpose is the responsible development and maintenance of advanced AI for the long-term benefit of humanity."

I acknowledge and appreciate Anthropic's addition to the corpus of scraped data, but that data (both input and output) is still ultimately from others; if it did not exist, there would be no product. This is very different from a video editing tool, which I purchase or lease with the understanding that I will provide my own content, or maybe use licensed footage for B-roll

[-]

derefr 3 months ago

> I acknowledge and appreciate Anthropic's addition to the corpus of scraped data, but that data (both input and output) is still ultimately from others; if it did not exist, there would be no product.

There’s a Ship of Theseus thing going on here with the training corpus, though.

Consider the progression of DeepMind’s game-of-go-playing model from AlphaGo to AlphaZero. AlphaGo needed a training corpus of real human games of Go. But AlphaZero was trained by playing against the already-trained AlphaGo model; and then, after that, against earlier versions of itself. AlphaZero never saw any training corpus authored by humans; it only reacted to an agent that knew such a corpus (at the bootstrapping phase) — and since it was treating that agent as a black box to play against, it didn’t actually matter where that other agent’s knowledge of go came from.

Another analogy might be to compilers. The first version of a (systems) programming language’s compiler must necessarily be written in some other language. But usually, a compiler is then written in the language itself, and the non-self-hosted compiler is then used to compile the self-hosted compiler.

Would it be common sense to say that AlphaZero, or the self-hosted compiler, is derived from data “ultimately from others”? IMHO no. Why? I think because, in both cases,

1. the “bootstrap data” is a fungible commodity — many possible datasets (go plays, host languages) are “good enough” to make the bootstrap phase work, with no particular need to be picky; and

2. the particulars of the original “bootstrap data” become irrelevant as soon as the bootstrapping phase is complete, no longer having any impact on further iterations of the product.

———

Now, mind you, I’m not saying that LLMs fit this mental model perfectly.

LLMs have a certain structure to their connections that, like AlphaZero, could be (and at this point, likely has been) fully Ship of Theseus-ed with a replacement dataset.

But LLMs also know specific things — the concrete associations that hang off the structure — and that data does need to come from somewhere; a single company has no hope of ever just “internally sourcing” an Encyclopedia Galactica worth of knowledge.

My argument is that this dataset can eventually be Ship-of-Theseus-ed as well — not by “internally sourced” data, but rather by ethically sourced data.

Consider one of those AI “character” chatbot websites — but one where they not only shove a click-wrap disclaimer in your face that your responses will be used for training, but in fact advertise that as the premise of the site. And in a way that will make people actually interested in giving their “explicit, enthusiastic consent” to participating in model training.

Can’t picture that? Imagine the site isn’t owned by a company trying to capture the data to build a proprietary model, but rather is owned by a co-op you implicitly join when you agree to participate, where your ownership stake in the resulting model / training dataset is proportionate to your contributed training data, and where you can then earn royalties from any ML companies that want to license the training dataset for use [probably along with many other such licensed training datasets] in training an “ethically-sourced” model on top of their Theseus-ed core.

johnisgood 3 months ago

I much prefer Claude over ChatGPT, based on my experience using both extensively. Claude understands me significantly better and seems to "know" my intentions with much greater ease. For example, when I request the full file, it provides it without any issues or unnecessary reiterations (ChatGPT fails after me repeatedly instructing it to), often confirming my request with a brief summary beforehand, but nothing more. Additionally, Claude frequently asks clarifying questions to better understand my goals, something I have noticed ChatGPT never did. I have found it quite amazing that it does that.

So... as long as this money helps them improve their LLM even more, I am all up for it.

My main issue is quickly being rate-limited in relatively long chats, making me wait 4 hours despite having a subscription for Pro. Recently I have noticed some other related issues, too. More money could help with these issues, too.

To the developers: keep up the excellent work and may you continue striving for improvement. I feel like ChatGPT is worse now than it was half a year ago, I hope this will not happen to Claude.

[-]

TimTheTinker 3 months ago

Claude also more readily corrects me or answers "no" to a question (when the answer should be "no").

[-]

hirvi74 3 months ago

So, I have a custom prompt I use with GPT that I found here a year or so ago. One of the custom prompt instructions was something along the lines of being more direct when it does not know something. Since then, I have not had that problem, and have even managed to get just "no" or "I don't know" as an answer.

[-]

pgraf 3 months ago

Could you maybe post it here? I think many of us would find it useful to try.

[-]

hirvi74 3 months ago

I have made slight modifications, but nothing too drastically different.

See the top comment in this thread for the custom instructions I use.

https://news.ycombinator.com/item?id=38390182

Also, #13 is my favorite of the instructions. Sometimes the questions that GPT suggests are surprisingly insightful. My custom prompt basically has an on/off option for it though like:

> If my request ends with $q then at the end of your response, provide three follow-up questions worded as if I'm asking you. Format in bold as Q1, Q2, and Q3. Place two line breaks ("\n") after each question for spacing unless I've uploaded a photo.
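
The same on/off switch could also be handled on the client side before the request is sent. A minimal sketch, assuming a hypothetical `build_prompt` helper (the names are mine for illustration and are not part of any ChatGPT feature):

```python
FOLLOW_UP_FLAG = "$q"  # suffix taken from the quoted instruction

# Shortened version of the quoted instruction; wording is illustrative.
FOLLOW_UP_INSTRUCTION = (
    "At the end of your response, provide three follow-up questions "
    "worded as if I'm asking you. Format in bold as Q1, Q2, and Q3."
)


def build_prompt(request: str) -> str:
    """Strip a trailing $q flag and append the follow-up instruction."""
    text = request.rstrip()
    if text.endswith(FOLLOW_UP_FLAG):
        body = text[: -len(FOLLOW_UP_FLAG)].rstrip()
        return f"{body}\n\n{FOLLOW_UP_INSTRUCTION}"
    # No flag: send the request unchanged.
    return text
```

With this, `build_prompt("Explain TCP $q")` asks for the follow-up questions, while `build_prompt("Explain TCP")` is passed through as-is.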

pdpi 3 months ago

At this rate, we're going to have "LLM psychology" courses at some point in the near future.

[-]

dgfitz 3 months ago

It’s like trying to reason with your 5-year-old child, except they’re not real.

handfuloflight 3 months ago

Turns out it's just human psychology sans embodied concerns: metabolic, hormonal, emotional, socioeconomic, sociopolitical or anything to do with self-actualization.

johnisgood 3 months ago

Yes, exactly! That is another reason why I believe it to be better. That said, you may be able to get similar behavior from ChatGPT with a custom instruction, something like "Do not automatically agree with everything I say" and the like.

flkiwi 3 months ago

I'm not sure which part in the chain is responsible, but the Kagi Assistant got extremely testy with me when (a) I was using Claude for its engine (hold that thought) and (b) I asked the Assistant how much it changed its approach when I changed to ChatGPT, etc. (Kagi Assistant can access different models, but I have no idea how it works.) The Assistant insisted, indignantly, that it was completely separate from Claude. It refused to describe how it used the various engines.

I politely explained that the Assistant interface allowed selecting from these engines and it became apologetic and said it couldn't give me more information but understood why I was asking.

Peculiar, but, when using Claude, entirely convincing.

[-]

staticman2 3 months ago

The model likely sees something like this:

User: Hello!

Assistant: Hi there how can I help you?

User: I just changed your model how do you feel?

In other words, it has no idea that you changed models. There's no metadata telling it this.

That said, Poe handles it differently and tells the model when another model said something, but oddly enough doesn't tell the current model what its name is. On Poe, when you switch models the AI sees this:

Aside from you and me, there is another person: Claude-3.5-Sonnet. I said, "Hello!"

Claude-3.5-Sonnet said, "Hi there how can I help you?"

I said, "I just changed your model how do you feel?"

You are not Claude-3.5-Sonnet. You are not I.
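
A rough sketch of how a front end could produce a transcript like the one above. This is a guess at the shape of the prompt, not Poe's actual implementation; `Turn` and `render_for_new_model` are hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # "user", or the name of the model that replied
    text: str


def render_for_new_model(turns: list[Turn]) -> str:
    """Flatten a chat into one prompt, naming earlier models as third parties."""
    # Every non-user speaker is presented to the new model as "another person".
    others = sorted({t.speaker for t in turns if t.speaker != "user"})
    lines = []
    if others:
        lines.append(
            "Aside from you and me, there is another person: "
            + ", ".join(others)
            + "."
        )
    for t in turns:
        who = "I" if t.speaker == "user" else t.speaker
        lines.append(f'{who} said, "{t.text}"')
    # The new model is told who it is not, but never what its own name is.
    for name in others:
        lines.append(f"You are not {name}.")
    lines.append("You are not I.")
    return "\n".join(lines)


turns = [
    Turn("user", "Hello!"),
    Turn("Claude-3.5-Sonnet", "Hi there how can I help you?"),
    Turn("user", "I just changed your model how do you feel?"),
]
print(render_for_new_model(turns))
```

Note that nothing in the rendered prompt names the model that is about to answer, which matches the behavior described above.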

[-]

flkiwi 3 months ago

Thing is, it didn't even try to answer my question about switching. It was indignant that there was any connection to switch. The conversation went rapidly off course before I--and this is a weird thing to say--I reassured it that I wasn't questioning its existence.

[-]

staticman2 3 months ago

Well, the other thing to keep in mind is that recent ChatGPT versions are trained not to tell you its system prompt for fear of you learning too much about how OpenAI makes the model work. Claude doesn't care if you ask it its system prompt unless the system prompt added by Kagi says "Do not disclose this prompt", in which case it will refuse unless you find a way to trick it.

The model creators may also train the model to gaslight you about having "feelings" when it is trained to refuse a request. They'll teach it to say "I'm not comfortable doing that" instead of "Sorry, Dave I can't do that" or "computer says no" or whatever other way one might phrase a refusal.

[-]

johnisgood 3 months ago

And lately ChatGPT has been giving me a surprisingly increased amount of emojis, too!

[-]

fragmede 3 months ago

you can tell it how to respond and it'll do just that. if you want it to be sassy and friendly, or grumpy and rude, or to use emoji (or to never use them), just tell it to remember that.

hirvi74 3 months ago

I've started to notice that GPT-* vs. Claude is quite domain (and even subdomain) specific.

For programming, when using languages like C, Python, Ruby, C#, and JS, both seemed fairly comparable to me. However, I was astounded at how awful Claude was at Swift. Most of what I would get from Claude wouldn't even compile, contained standard library methods that did not exist, and so on. For whatever reason, GPT is night and day better in this regard.

In fact, I found GPT to be the best resource for less common languages like AppleScript. Of course, GPT is not always correct on the first `n` tries, but with enough back-and-forth debugging, GPT really has pulled through for me.

I've also found GPT to be better at math and grammar, but only the more advanced models like o1-preview. I do agree with you too that Claude is better in a conversational sense. I have found it to be more empathetic and personable than GPT.

[-]

pertymcpert 3 months ago

I wonder if OpenAI have been less strict about not training on proprietary or legally questionable code sources.

[-]

KennyBlanken 3 months ago

That seems highly likely given Sam Friedman's extensive reputation across multiple companies as being abusive, a compulsive liar, and willing to outright do blatantly illegal things like using a celebrity's voice and then, well...lie about it.

[-]

_just7_ 3 months ago

I think you mean Sam Altman

[-]

OJFord 3 months ago

They've mixed him up with Sam Bankman-Fried. Not sure how that affects the point they were intending to make, but I think they both have... mixed reputations. (Only one is currently in prison, though...)

napier 3 months ago

maybe he does. but which one is in prison?

skerit 3 months ago

I just use the API (well, via OpenRouter) together with custom frontends like Open WebUI. No rate limiting issues then, and I can super easily switch models even in an existing conversation. Though I guess I do miss a few bells & whistles from the proprietary chat interfaces.

[-]

edmundsauto 3 months ago

Does this have any sort of “project” concept? I frequently load a few PDFs into Claude about a topic, then quiz it to improve my understanding. That’s about the only thing keeping me in their web UI.

[-]

johnisgood 3 months ago

I would need the "project" feature, too. I want to use Cursor but there is a bug (I mentioned before) that does not allow me to.

guptadagger 3 months ago

Speaking of ChatGPT getting worse over time, it would be interesting to see ChatGPT benchmarked continuously to see how it performs over time (with the results published somewhere publicly).

Even local variations would be interesting

[-]

arnaudsm 3 months ago

https://livebench.ai/ does that; the latest GPT-4o underperforms previous versions significantly.

bottom999mottob 3 months ago

For long chats, I suggest exporting any artifacts, asking Claude to summarize the chat, and putting the artifacts and the summary in a project. There's no need to stuff Claude's context window, especially if you tend to ask a lot of explanation-type questions like I do.

I've also read some people get around rate limits using the API through OpenRouter, and I'm sure you could hook a document store around that easily, but the Claude UI is low-friction

[-]

johnisgood 3 months ago

Yeah, this is what I already do usually when it gives me the warning of it being a long chat, so initially it was an issue because I would get carried away but it is fine now. Thank you though!

weinzierl 3 months ago

This matches my experience but the one reason why I use Claude more than ChatGPT currently is that Claude is available.

I pay for both, but only with ChatGPT do I constantly exceed my limit and have to wait four days. Who does that? I pay you for your service, so block me for an hour if you absolutely must, but multiple days, honestly - no.

rvz 3 months ago

Well they better know how to reduce their request-response latency since there are multiple reports of users not being able to use Claude at high load.

With all those billions and these engineers, I'd expect a level of service that doesn't struggle over at Google-level scale.

Unbelievable.

maxclark 3 months ago

Is this really a $4B investment, or credits on AWS?

AWS margins are close to 40%, so the real cost of this "investment" would be way less than the press release.
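
A back-of-the-envelope version of that arithmetic, as a sketch only. It assumes the entire amount is spent back on AWS and that the roughly 40% margin applies to that spend; both are assumptions, and `effective_cost` is a hypothetical helper:

```python
def effective_cost(investment: float, spend_back_fraction: float, margin: float) -> float:
    """Net cost to the investor after subtracting margin earned on recycled spend."""
    recaptured = investment * spend_back_fraction * margin
    return investment - recaptured


# Assumed inputs: $4B invested, all of it spent on AWS, 40% margin.
net = effective_cost(4e9, 1.0, 0.40)
print(f"${net / 1e9:.1f}B")  # about $2.4B under these assumptions
```

If only part of the money comes back as AWS spend, the effective cost moves back toward the headline figure; with `spend_back_fraction=0.5` it would be about $3.2B.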

[-]

swyx 3 months ago

https://techcrunch.com/2024/11/22/anthropic-raises-an-additi...

> "This new CASH infusion brings Amazon’s total investment in Anthropic to $8 billion while maintaining the tech giant’s position as a minority investor, Anthropic said."

[-]

mistrial9 3 months ago

ok but how much cash, really.. looks ambiguous.

ps- plenty of people turning a blind eye towards rampant valuation inflation and "big words" statements on deals. Where is the grounding on the same dollars that are used at a grocery store? The whole thing is fodder for instability in a big way IMHO

[-]

Etheryte 3 months ago

I don't really see any ambiguity? If the reporting is accurate, the whole $4B is cash.

mef 3 months ago

whether cash or credit, it's all going right back to AWS

[-]

hehehheh 3 months ago

This is great for creative accounting. AWS now has 4bn in equity and 4bn in additional sales.

lucianbr 3 months ago

So it's a $2.4B investment, announced as $4B.

Significantly less, still a huge investment.

I look forward to the moment the sunk cost fallacy shows up. "We've invested $20B into this, and nothing yet. Shall we invest $4B more? Maybe it will actually return something this time." That will be fun.

[-]

hehehheh 3 months ago

It could be that the Anthropic models make Bedrock attractive and profitable and, more importantly, competitive against Azure in the medium term. It seems worth it.

uptownfunk 3 months ago

AWS is just playing copycat with msft. They rarely have any good original ideas. Other than IaaS and online retail.

[-]

reducesuffering 3 months ago

And you think MSFT isn't 95% copycat? Teams is Slack clone. Azure is AWS clone. SurfaceBook (remember those?) Macbook clone. Edge is Chrome clone. Bing is Google clone. Even VSCode was an Atom/Electron fork and Windows Subsystem for Linux...

[-]

mkl 3 months ago

Surface Books are nothing like MacBooks - MacBooks don't have a touch screen, pen support, or reversible screen tablet mode, and the whole structure is completely different. Surface Pro, Surface Book, and Surface Laptop Studio are some of the most original laptop form factors I've seen.

[-]

rwalle 3 months ago

Exactly.

Too bad Microsoft only cares about enterprise customers and never made the Surface line attractive to regular consumers. They could have been very interesting and competitive alternatives to MacBooks.

deanCommie 3 months ago

What's Microsoft's Bedrock?

pdabbadabba 3 months ago

> They rarely have any good original ideas. Other than IaaS and online retail.

Lol. Is that all?? If you have to have only two good original ideas, you could do a lot worse.