Ask HN: Anyone else disillusioned with "AI experts" in their team?

33 points | by randomgermanguy 4 hours ago

47 comments

Hey I resemble that remark!

There's definitely a rush of people trying to upskill/reskill into this technology space despite having no formal training or background beyond basic dev skills. There's other people (such as myself) that came from the big data/NLP space (ads & search) that are trying to add AI to our extensive skillsets but aren't necessarily deep-math experts.

Unfortunately there's not a lot of room at the top and the vast majority of AI implementations at smaller companies are just OpenAI API wrappers. Essentially there's very little lived experience since it's expensive to experiment at home and smaller companies just aren't going to invest in self-hosted models that are expensive to run and quickly fall behind state of the art.

nixpulvis an hour ago

> Also, no one could tell me where exactly our "self-hosted" models even ran (turns out 50% of the time its just OpenAI/Anthropic)

This part is honestly the most worrying to me, as compliance with customers and legal would really need you to not lie about this.

rglover an hour ago

This is the nature of tech now (perhaps the whole time due to being a relatively "new" field). Most people don't have the slightest clue what they're doing beyond their ability to parrot buzzwords.

Mean? Sure. Reality? You betcha. It's incredibly rare these days to encounter truly competent professionals. Most are just hoping the guy below them doesn't know enough to spot their shortfalls and speak up.

This aligns shockingly well with Uncle Bob's rough stat: “The number of programmers doubles every five years or so. This means that half the programmers in the world have less than five years of experience.”

zurtri an hour ago

This median of 5yrs experience is also backed up by Stack Overflow surveys (and Python surveys).

So where do they all go (I doubt the number of grads is doubling)?

I think a lot realise that programming is not their bag and move into account management, IT support, business support, or even other careers entirely.

b-karl an hour ago

The field of AI has become very big in recent years, and people are becoming more and more specialized, just like in the rest of software or R&D. There’s all sorts of model building and development, integration into other traditional software, infrastructure, deployment and devops, and now also all the governance and compliance for a lot of fields. It’s good for people working in those stacks to understand the full chain at some level, but pretty fast you will have experts in particular parts of the chain and no one will understand it all.

And then there’s of course career climbers playing politics and people getting into the field because of interest or resume building.

mathattack 2 hours ago

How large is the company?

Many times plum resume building assignments at big companies go to the best politicians rather than the biggest experts.

randomgermanguy 2 hours ago

Fewer than 100 people all in all.

mathattack 2 hours ago

Shouldn’t be a political play then.

In a good market I’d say you should look around. In this market, keep your head down and get some experience.

alexpotato an hour ago

In my experience, politics can happen at any sized company.

e.g. I've worked at firms with 300,000 people and 150 people. The smaller company had MORE politics in some ways.

Or as a manager of mine in Hong Kong once said:

"In Chinese there is a saying: if there are 3 people then there will be politics"

thefourthchime an hour ago

I've been at companies as small as 10 and as large as 30,000, and there is no lack of politics in smaller companies from what I've seen.

bogomipblips 2 hours ago

You are obviously someone who would appreciate pedantic-style educators. The average customer does not appreciate not being met where they are, which tends to lead an expert to use "AI" as shorthand for OpenAI, etc., with the customer unless they hedge carefully.

Similarly, if they participated in all the early arguments about where your models would be located, then they may have no idea now, given that they are fed up with the endless thread of subtle change requests.

AJRF 2 hours ago

Do you think sampling is deterministic?

randomgermanguy 2 hours ago

Top-k sampling with temp = 0 should be pretty much deterministic (ignoring floating-point errors).
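For illustration, here is a minimal sketch of top-k sampling with a temperature knob. The function name `sample_top_k`, the logit values, and the convention that temperature 0 means plain argmax are assumptions for this example, not any particular library's API.

```python
import math
import random

def sample_top_k(logits, k, temperature, rng):
    """Pick a token index from the k highest logits (hypothetical helper)."""
    # Keep the indices of the k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    if temperature == 0:
        # Assumed convention: temperature 0 means greedy decoding (argmax).
        return top[0]
    # Softmax over the surviving logits, scaled by temperature.
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [2.0, 1.5, 0.3, -1.0]  # made-up values
rng = random.Random(0)

# Temperature 0: the same token every time.
greedy = {sample_top_k(logits, k=3, temperature=0, rng=rng) for _ in range(1000)}
# Temperature 1: several different tokens show up across draws.
sampled = {sample_top_k(logits, k=3, temperature=1.0, rng=rng) for _ in range(1000)}
print(greedy, sampled)
```

With exact arithmetic the temperature-0 branch is fully deterministic; any remaining variation has to come from how the logits themselves were computed.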

AJRF 2 hours ago

> Ignoring floating point errors.

I think you mean non-associativity.

And you can’t ignore that.

sunrunner 2 hours ago

Ignoring floating point errors, assuming a perfectly spherical cow, and taking air resistance as zero.

AJRF an hour ago

Imagine you are predicting the next token and two tokens are very close in probability. Kernel execution is not deterministic because of floating-point non-associativity, and the token that gets picked affects every later token in the stream, so it's very consequential which one gets picked.

This isn't some hypothetical. It happens all the time with LLMs; it isn't some improbable freak accident.
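A toy sketch of that near-tie scenario, with made-up numbers: one logit is accumulated from three contributions in two different orders, and the competing logit sits right next to it. The `argmax` helper and its lowest-index tie-break are assumptions for this example.

```python
def argmax(values):
    """Index of the largest value; the lowest index wins ties."""
    return max(range(len(values)), key=lambda i: values[i])

contributions = [0.1, 0.2, 0.3]  # made-up partial sums feeding token 1's logit
logit_token_0 = 0.6              # made-up logit of the competing token

# Same three numbers, two groupings of the additions.
grouped_left = (contributions[0] + contributions[1]) + contributions[2]
grouped_right = contributions[0] + (contributions[1] + contributions[2])

pick_left = argmax([logit_token_0, grouped_left])
pick_right = argmax([logit_token_0, grouped_right])

print(grouped_left == grouped_right)  # False: the sums differ in the last bit
print(pick_left, pick_right)          # a different token wins in each case
```

The difference between the two sums is one unit in the last place, yet it is enough to change which token is selected, and every later token is then conditioned on a different prefix.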

randomgermanguy an hour ago

Okay yes, but would you really say that the main part of non-determinism in LLM usage stems from this? No, it's obviously the top-k sampling.

I don't think my tech-lead was trying to suggest the floating-point error/non-associativity was the real source.

AJRF an hour ago

> Would you really say that the main part of non-determinism in LLM-usage stems from this

Yes I would because it causes exponential divergence (P(correct) = (1-e)^n) and doesn't have a widely adopted solution. The major labs have very expensive researchers focused on this specific problem.
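To put rough numbers on the (1-e)^n formula, here is a minimal sketch. It reads e as a per-token chance that numeric noise flips the selected token and assumes tokens are independent; both are assumptions for illustration, and the value 0.001 is made up.

```python
def prob_no_divergence(eps, n):
    """Chance that none of n tokens is flipped, given per-token flip rate eps."""
    return (1.0 - eps) ** n

eps = 0.001  # made-up per-token flip probability
short_reply = prob_no_divergence(eps, 100)
long_reply = prob_no_divergence(eps, 1000)

print(round(short_reply, 2))  # about 0.90
print(round(long_reply, 2))   # about 0.37
```

Under these assumptions, even a one-in-a-thousand flip rate leaves only about a one-in-three chance that two 1000-token generations match exactly.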

There is a paper from Thinking Machines from September on batch-invariant kernels that you should read. It's a good primer on this issue of non-determinism in LLMs; you might learn something from it!

Unfortunately the method has quite a lot of overhead, but promising research all the same.

randomgermanguy 42 minutes ago

Alright fair enough.

I don't think this is relevant to the main point, but it's definitely something I wasn't aware of. I would've thought it might have an impact on like the O(100)th token in some negligible way, but glad to learn.

illwrks 4 hours ago

I work in house and had a similar AI agency day over the last few months.

I came to the same observation: lots of experts, not much expertise.

I think my wider team are on par with their ability and understanding so we now can sift through the BS a bit easier.

Nod, smile, accept that no one has a clear understanding.

illwrks 4 hours ago

The real talent is building this stuff, everyone else is just part of the marketing effort.

assemblyman 2 hours ago

I find this obsession with building strange. It's a very SaaS Silicon Valley mindset. There are whole swathes of very talented engineers who spend most of their time debugging, characterizing systems, doing performance analysis and resolving bottlenecks. Some of it might require writing significant code but mostly it's writing small test cases. The key skill is to treat a computing system as the object of study and to be a good empirical scientist (which requires understanding theory pretty well). These are people with deep expertise in networking, GPUs, CPUs, memory etc. One only has to look at national labs that do large-scale HPC (high-performance computing) to see examples.

One can argue that a lot of "building with AI" is commoditized by fine-tuning and RAG libraries or even reduced to prompt engineering. A lot of it is also tricks that might work on one dataset but not others. Putting together libraries fueled by pizza and coke gives an illusion of skill and speed.

Are there grifters who are jumping onto the AI bandwagon? Of course! In spades. Are there also engineers who want to build up their skills and are failing to do so or in the process of doing so? Of course, this happens too! But there are also people who are trying to understand, debug and improve models who are not necessarily "building". After all, the scaling laws paper (the original one) was a result of pure analysis of empirical data.

randomgermanguy 4 hours ago

What do you mean by "building this stuff"? As in building LLMs, or building applications on top of them?

illwrks 3 hours ago

Building LLMs. In my mind those engineers are the ones that have more intimate knowledge of the data and input, and can create the LLMs for their specific tasks. Everyone else is a customer to them.

I can tell you how a house is built; that doesn’t make me a builder, it makes me informed and opinionated. I can decorate my house however I like, but I’m not a painter/decorator or a tradesman. I can assemble some IKEA furniture, but I’m not a carpenter. I’m a consumer, and I can tweak something to my liking, but I can’t do anything significant.

PaulHoule 4 hours ago

No, it is being able to evaluate models. 5 builders without eval produce zero value. 1 eval person can pick and choose the best model out of a bunch that are open source and commercial and maybe one of them is good enough. Put an eval person together with N builders and you have a chance of making a good enough model.
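A minimal sketch of what "eval" can mean in practice: score every candidate on the same labeled examples and keep the best. The toy "models" are plain functions and the four examples are made up; a real setup would call actual models and use far more data.

```python
def accuracy(model, examples):
    """Fraction of (text, label) pairs the model labels correctly."""
    correct = sum(1 for text, label in examples if model(text) == label)
    return correct / len(examples)

def pick_best(models, examples):
    """Score every candidate on the same examples and return the winner."""
    scores = {name: accuracy(fn, examples) for name, fn in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Made-up evaluation set.
examples = [
    ("great product", "pos"),
    ("terrible support", "neg"),
    ("great value", "pos"),
    ("not great", "neg"),
]

# Hypothetical stand-ins for open source or commercial models.
models = {
    "always_pos": lambda text: "pos",
    "keyword": lambda text: "pos" if "great" in text else "neg",
    "keyword_negation": lambda text: "neg" if "not" in text or "terrible" in text else "pos",
}

best, scores = pick_best(models, examples)
print(best, scores)
```

Without a fixed evaluation set like this, there is no way to tell which of the candidates is actually good enough.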

If you want to know why Hacker News is full of people disappointed or skeptical with AI, ask yourself why they put 99.9% of their effort into “zero-shot” when it is clear as day that if you get a few thousand examples and train on them, you wipe the floor with “zero-shot”.

randomgermanguy 4 hours ago

But why do you think this is? Like, is it just the money/status that comes with calling yourself an "AI" expert?

illwrks 3 hours ago

I try and frame things from an agency perspective.

Agencies are like a production line, they need raw materials coming in; clients with cash, armed with opportunities, scraps of ideas or formed briefs to be worked on. They need this business so they can generate the output and keep the lights on.

AI is everywhere and everything for a lot of people now. You can be sure that execs are asking their teams how they are using AI, how it is helping the business grow, etc. However, there’s so much AI news, and it’s moving so quickly and seeping into everything, that it’s difficult (from a naïve client point of view) to know what’s fantasy and what’s reality.

So my perception is… agencies do the sifting and maintain visibility of what is real or not because they have to start drumming up future sales and business, and AI is hot right now.

Perhaps they have some training in Copilot etc., or some experience of creating a model, or maybe they have integrated something small with something big. It may even be that, being an agency, they have a more open way of working than a corporate does, and that’s the sell.

Anyway, the sales teams will proclaim themselves experts because they have to sell.

nis0s 3 hours ago

Maybe this is a joke, there’s a lot of talent out there. But if you’re not kidding, start looking for a job somewhere else on the dl, this ship isn’t fit to steer.

randomgermanguy 3 hours ago

Not a joke. I'm still a student, like half a year away from finishing my master's, so switching jobs at this point feels a bit risky/early? Maybe I'm wrong though.

ethmarks 2 hours ago

I don't think they meant "quit your job", just "be on the lookout for another".

If your current job is unstable because nobody there knows what they're doing, it's good to have a fallback.

echelon_musk 2 hours ago

Halve is a verb, half is the noun you were looking for.

brailsafe 2 hours ago

Eh, probably overthinking it, let it matter if it matters, if it's your liability, but otherwise take the opportunity for what it is. Focus on what you're there to do. The purpose of any of it is to bring money in, and if that happens.

Early on in my career I was hyper-fixated on building features correctly at this particular company, according to what I thought was a proper way to build websites. I was probably right, but my job wasn't to be right, my job was to get things done in a certain period of time according to whatever people who controlled the money at the company thought was important, not what a nerd would necessarily care about.

When you're in school or just graduated, you're basically qualified to start learning (outside academia) and it's important to pay attention to what other people value, then do your best within that until you have the power to determine what's worth valuing.

Insanity 2 hours ago

I mean.. to play devil’s advocate.. they don’t need to understand how LLMs fundamentally work any more than most programmers need to understand assembly, if all they do is build agents and do prompt engineering lol.

dboreham 44 minutes ago

LLMs (and ML before that) have attracted a class of hand-waving bullshitter. Hardly surprising --- anyone who knows what they're doing in whatever field is going to be busy doing their thing. Meanwhile some new hot tech comes along, who has the time to poke into it? Mr Useless who never had anything to do. Meanwhile we're digging into the math of transformers and finding it fascinating while they're goofing around with "prompt engineering".

zippyman55 3 hours ago

I learned a huge amount from my team members, and they were usually smarter than I was. So, sure, I occasionally corrected them and impressed them, but all the bread crumbs they threw out, I gobbled up and learned a ton. But, they never said stupid stuff! (Ok, maybe about social events, but they were geeks). When team members spout off erroneous stuff and it is not "occasionally" you have to question what you are going to be learning from your co-workers. So, your whole team may be inferior and contribute little to your learning and growth. This extends to management and the decisions they make regarding AI.

bawolff 2 hours ago

The entire AI ecosystem is a giant hype bubble. I don't really think it matters much if your team understands AI; the bubble is going to pop either way.

randomgermanguy 34 minutes ago

I guess the question is whether it's like the crypto bubble, where there's no real value left in the end (I haven't heard of a good use for those ASICs), or more like the dot-com bubble, where the installed fiber cable is still valuable without pets.com around.

But since I wasn't really around for either of those ... ¯\_(ツ)_/¯

incomingpain 3 hours ago

>We had an internal-workshop led by our internal AI-team (mostly just LLMs), and had the horrible realisation that no one in that team actually knows what the term "AI" even means, or how a language model works.

I'm the AI expert for my org. Everyone else is more or less opposed to AI.

>One senior-dev (team-lead also) tried to explain to me that AI is a subfield of machine-learning, and always stochastic in nature (since ChatGPT responds differently to the same prompt).

machine learning is the sub field of AI.

Not really stochastic as far as I know. The whole random seed and temperature thing is a bit of a grey area for my full understanding. Let alone the topk, top p, etc. I often just accept what's recommended from the model folks.

>We/they are selling tailor-made "AI-products" to other businesses, but apparently we don't know how sampling works...?

Salespeople don't tend to know jack. That doesn't mean they don't have an introvert in the back who does know what's going on.

>Am I just too junior/naive to get this or am I cooked?

AI for the most part has been out a couple years. With rapid improvement and changes that make 2023 knowledge obsolete. 100% of us are juniors in AI.

You're disillusioned because the "AI experts" basically don't exist.

randomgermanguy 3 hours ago

> machine learning is the sub field of AI.

That's what I tried to explain then as well, and I brought up stuff like path-finding algorithms for route-finding (A*/heuristic search) as a more old-school part of AI, which didn't really land, I think.
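For anyone who hasn't seen it, here is a minimal sketch of A* on a small grid, as an example of AI with no learning and no randomness. The grid, the function name `a_star`, and the Manhattan-distance heuristic are choices made for this illustration.

```python
import heapq

def a_star(grid, start, goal):
    """Length of the shortest 4-connected path on a 0 (free) / 1 (wall) grid, or None."""
    def h(cell):
        # Manhattan distance: never overestimates for unit-cost grid moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_cost = {start: 0}
    frontier = [(h(start), 0, start)]  # entries are (g + h, g, cell)
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_cost.get(cell, float("inf")):
            continue  # stale entry: a cheaper route to this cell was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_g
                    heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
steps = a_star(grid, (0, 0), (2, 0))
blocked = a_star([[0, 1], [1, 0]], (0, 0), (1, 1))
print(steps, blocked)  # 8 None
```

The same inputs always produce the same path length, which is the point: heuristic search is classic AI, and it is neither machine learning nor stochastic.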

> Not really stochastic as far as I know. The whole random seed and temperature thing is a bit of a grey area for my full understanding. Let alone the topk, top p, etc. I often just accept what's recommended from the model folks.

I mean, LLMs are often treated as stochastic in nature, but ML models usually aren't? Like, maybe you have some dropout, but that's usually left out during inference AFAIK. I don't think a ResNet or YOLO is very stochastic, but maybe someone can correct me.
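A small sketch of the dropout point, using plain Python lists instead of a real framework. The `dropout` function here is a hypothetical stand-in that follows the common "inverted dropout" scheme: random masking during training, identity at inference.

```python
import random

def dropout(values, p, training, rng):
    """Inverted dropout: random during training, identity at inference."""
    if not training or p == 0:
        return list(values)
    keep = 1.0 - p
    # Zero each value with probability p and rescale the survivors by 1/keep.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

activations = [1.0, 2.0, 3.0, 4.0]  # made-up activations
rng = random.Random(0)

eval_a = dropout(activations, 0.5, training=False, rng=rng)
eval_b = dropout(activations, 0.5, training=False, rng=rng)
train_out = dropout(activations, 0.5, training=True, rng=rng)

print(eval_a == eval_b)  # True: inference is a plain deterministic pass-through
print(train_out)         # each entry is either 0.0 or twice the input
```

So a network that uses dropout is only random while training; at inference the layer does nothing, and the model is a fixed function of its input.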

> AI for the most part has been out a couple years.

With this you just mean LLMs, right? Because I understand AI to be way more than just LLMs & ML.

sohojoe 2 hours ago

yeah, stochastic is there because we give up control of order of operations for speed

so the order in which floating-point additions happen is not fixed, because of how threads are scheduled and how reductions are structured (tree reduction vs warp shuffle vs block reduction)

Floating-point addition is not associative (because of rounding), so:

- (a + b) + c can differ slightly from a + (b + c).
- Different execution orders → slightly different results → tiny changes in logits → occasionally different argmax token.
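A sketch of how the shape of a reduction changes the result, comparing a left-to-right sum with a pairwise (tree) sum over the same values. The magnitudes are deliberately exaggerated so the effect is obvious; on real logits the differences are usually in the last few bits. This is plain Python standing in for what GPU kernels do, not actual kernel code.

```python
def sequential_sum(values):
    """Add left to right, the way a single thread would."""
    total = 0.0
    for v in values:
        total += v
    return total

def tree_sum(values):
    """Add pairwise, the way a parallel tree reduction would."""
    if not values:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])

values = [1e16, 1.0, -1e16, 1.0]  # made-up values; the exact sum is 2.0

seq = sequential_sum(values)
tree = tree_sum(values)
print(seq, tree)  # 1.0 0.0
```

Neither order recovers the exact answer here, and they disagree with each other, purely because the additions were grouped differently.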

rolisz 2 hours ago

Actually, that's a misconception. It's because of varying batch sizes that requests get scheduled on: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

randomgermanguy 2 hours ago

Oh actually yeah that's true. You have correctly out-nitpicked my nitpick lol.

But at that point I feel like we are getting close to "everything that isn't a perfect Turing-machine is somewhat-stochastic" ;)

Edit: someone corrected me above, it does seem to matter more than I thought

rvz 3 hours ago

> One senior-dev (team-lead also) tried to explain to me that AI is a subfield of machine-learning, and always stochastic in nature (since ChatGPT responds differently to the same prompt).

This "senior dev" has it all mixed up and is incorrect.

"AI" is an all-encompassing umbrella term that includes other fields such as the very old GOFAI (good old-fashioned AI), which is rule-based; machine learning (statistical, Bayesian) methods; and neural networks, which underlie deep learning and, more recently, generative AI (which ChatGPT uses).

More accurately, it is neural networks which are more "stochastic" with their predictions and decisions, not just transformer models which ChatGPT is based on.

> Am I just too junior/naive to get this or am I cooked?

Quite frankly, the entire team (except you) is cooked, as you have realized what you don't know.

thefourthchime an hour ago

I wonder if the senior dev actually said LLM, or at least meant LLM. If he said that, most of this checks out. The only thing is that they don't have to be stochastic, but in practice they almost always are.

randomgermanguy 3 hours ago

Okay thanks for saving my sanity somewhat.

And also just to nitpick/joke:

> More accurately, it is neural networks which are more "stochastic" with their predictions and decisions <...>

I would defend NNs as not even necessarily stochastic. I had to handwrite weights for NNs in at least two exams, to fit XOR for example ;)
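For reference, a sketch of the kind of hand-weighted XOR network meant here: two hidden units (OR and AND) with a step activation. These particular weights and thresholds are one possible choice for illustration, not necessarily what the exam expected.

```python
def step(x):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if x > 0 else 0

def xor_net(a, b):
    """Two-layer network with hand-picked weights that computes XOR."""
    h_or = step(a + b - 0.5)    # fires if at least one input is 1
    h_and = step(a + b - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # OR but not AND

table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
print(table)  # {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

No training, no sampling, no random seed: the network is just a fixed function of its inputs.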

etrautmann an hour ago

That may be the exception that proves the rule here, though. Outside of the tiniest toy examples, is this ever true?

nonameiguess 2 hours ago

I did my undergrad in applied math and MS in machine learning, worked writing automated trading algorithms for a few years before drifting into infra layer, and there is no universe in which I'd consider myself anywhere remotely close to an expert in AI and I'm really not sure such people exist outside of the senior leadership at major labs, i.e. the LeCun/Hinton types.

But I know enough to know neither AI nor machine learning are subfields of the other. AI just developed out of the very earliest days of electronic computing as an expression of the desire to get intelligent behavior out of computers by any means possible. Machine learning arose from the desire to express functions in which we know the inputs and outputs but not the form of the function itself, so we use various estimation methods that can be learned from the data itself. A whole lot of overlap and parallel efforts simultaneously developed the same or similar techniques between computer scientists and software engineers on the one side and statisticians and applied mathematicians on the other side. It seemed to have turned out that statistical methods generally seem to provide the best algorithms for machine learning, and machine learning has seemed to provide the best algorithms to get intelligent behavior out of computers.

So they've kind of grown together, stats, automated learning, and AI, but they're still distinct things that developed independently of one another and still exist independently of one another.

This is putting aside all the various "big data" technologies and efforts that grew out of the 2007 or so era of collecting enormous amounts of user or machine-generated data that required new tech to store, query, and new ways to perform parallel batch processing often married to the storage and query tech, all of which was necessary for and enabled statistical machine learning to become as successful as it has become, but is completely separate from the mathematical and algorithmic discipline itself.

Even the guys I named above are probably not really experts in all of these things separately. As with anything, it takes a village.