If I'm shopping for an accountant I will present them with one or two cases and see how they would reason about them. It's not as easy to do with a physician.
The main difference between those professions and people who build software for a living is that they have organisations that predate modernity that keep tabs on their members and kick them out if they misbehave.
We should absolutely build such organisations, but there will be intense confrontations with industry and academia when we try, because capitalists hate when other people unionise and academics think they're the best at evaluating peers.
It's fine that your personal projects aren't polished products. They show your interests and some of how you adapt to and solve problems. It's something you've chosen to do because you wanted to, and not something you did because you were pressured by profit. The everyday grind at work wouldn't show what you'd do under more exceptional circumstances, which is where your personal character and abilities might actually matter, but what you do for fun or personal development likely does.
> If you hire an accountant, do you expect to see the books of their other clients? When you choose a doctor, do you expect to see the charts of their prior patients?
I'm genuinely sorry for you if you think a programmer is analogous to an accountant or doctor. Presumably you are one of those people who think that a Software Engineer is as much an engineer as a Civil Engineer?
Anyway, let me not speculate as to your state of mind. You wrote a lot of words. What exactly are you even trying to defend or attack or justify? Is the number of your words commensurate to the informational content of what you think you're explaining? Can you explain your point in 1 sentence? Or perhaps I should just ask an LLM to summarize it. smh.
Even if an ML/AI/software engineer has a public GH with projects on it, there's no strong reason to expect it will be a useful signal about their expertise.
> Even if an ML/AI/software engineer has a public GH with projects on it, there's no strong reason to expect it will be a useful signal about their expertise.
That's only true if you don't know how to read code. I simply read their code and based on my level of expertise, I can determine if someone is at least at my level or if they are incompetent.
>Saying "GitHub" is just a way of saying: "Show me what you've accomplished."
Do you actually think all development happens in public GitHub repos? Do you even think a majority does? Even a strong minority?
Across a number of enormous, well-known projects I've worked on, covering many thousands of contributors, including several very well known names, 0% of it exists in public Github repos. The overwhelming bulk of development is happening in the shadows.
If your "field" is "open source software", then sure. But if you're confused into thinking Github -- at least the tiny fraction that you can see -- is "the field" of software development or even just generally providing solutions, I can understand your weird belief about this.
> Do you actually think all development happens in public GitHub repos? Do you even think a majority does? Even a strong minority?
Are you actually able to read what I wrote? I said "GitHub" is a euphemism. I'm happy to read your papers on arxiv or elsewhere, for instance.
Try not to be too horny to reply in opposition and instead just read what another human is saying first. Otherwise your replies in high dudgeon are a waste of my time and don't teach me anything new.
Agree that the use of "AI engineer" is confusing. I think this blog should use the term "engineering software with AI integration", which is different from "AI engineering" (creating/designing AI models) and different from "engineering with AI" (using AI to assist in engineering).
The term AI engineer is now pretty well recognised in the field (https://www.latent.space/p/ai-engineer), and is very much not the same as an AI researcher (who would be involved in training and building new models). I'd expect an AI engineer to be primarily a software developer, but with an excellent understanding of how to implement, use and evaluate LLMs in a production environment, including skills like evaluation and fine-tuning. This is not a skill set you can just bundle into "software developer".
You find issues when they surface during your actual use case (and by "smoke testing" around your real-world use case). You can often "fix" issues in the base model with additional training (supervised fine-tuning, preference tuning with DPO, etc.).
There's a lot of tooling out there making this accessible to someone with a solid full-stack engineering background.
Training an LLM from scratch is a different beast, but that knowledge honestly isn't too practical for everyday engineers: even if you had it, you wouldn't necessarily have the resources to train a competitive model. Of course, you could command a high salary working for the orgs that do have those resources! One caveat: there are orgs doing serious post-training, even with unsupervised techniques, to take a base model and reeaaaaaally bake in domain-specific knowledge/context. Honestly, I wonder whether even that is out of reach. You get a lot of wiggle room and margin for error when post-training a well-built base model because of transfer learning.
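To make "accessible to someone with a solid full-stack engineering background" concrete, here is a minimal supervised fine-tuning sketch using HuggingFace's TRL library. The model and dataset names are placeholders, and TRL's API shifts between versions, so treat the exact argument names as assumptions rather than a definitive recipe.

```python
# Minimal supervised fine-tuning (SFT) sketch using HuggingFace TRL.
# Model and dataset names are placeholders; TRL's API changes between
# versions, so treat argument names as approximate.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Expects a JSONL file with a "text" (or "messages") column.
dataset = load_dataset("json", data_files="domain_examples.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",      # any small causal LM checkpoint
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="./sft-out",
        max_steps=500,
        per_device_train_batch_size=2,
        learning_rate=2e-5,
    ),
)
trainer.train()
trainer.save_model("./sft-out/final")
```

In practice you would usually pair this with a LoRA/PEFT config so it fits on a single consumer GPU.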
I feel like I see this comment fairly often these days, but nonetheless, perhaps we need to keep making it - the AI generated image there is so poor, and so off-putting. Does anyone like them? I am turned off whenever I see someone has used one on a post, with very few exceptions.
Is it just me? Why are people using them? I feel like objectively they look like fake garbage, but obviously that must be my subjective biases, because people keep using them.
Some people have no taste, and lack the mental tools to recognize the flaws and shortcomings of generative-model output. People who enthuse about the astoundingly enthralling literary skills of LLMs tend to be the kind of person who hasn't read many books. These are sad cases: an undeveloped palate confusing green food coloring and xylitol for a bite of an apple.
Some people can recognize these shortcomings and simply don't care. They are fundamentally nihilists for whom quantity itself is the only important quality.
Either way, these hero images are a convenient cue to stop reading: nothing of value will be found below.
In a world where anyone can ask an LLM to gish-gallop a plausible facsimile of whatever argument they want in seconds, it is simply untenable to give any piece of writing you stumble upon online the benefit of the doubt; you will drown in counterfeit prose.
The faintest hint that the author of a piece is a "GenAI" enthusiast (which in this case is already clear from the title) is immediate grounds for dismissing it; "the cover" clearly communicates the quality one might expect in the book. Using slop for hero images tells me that the author doesn't respect my time.
I don't find the image poor, but somehow I see immediately that it is generated, because of the style. And that simply triggers the 'fake' flag in the back of my head, which has this bad subjective connotation. But objectively I believe it is a very nice picture.
I just don't understand how he didn't take 10 seconds to review the image before attaching it. If the image is emblematic of the power of AI, I wouldn't have a lot of faith in the aforementioned company.
If you're going to use GenAI (stable diffusion, flux) to generate an image, at least take the time to learn some basic photobashing skills, inpainting, etc.
You aren't exaggerating! There are some creepy arms in that image, along with the other weirdness. I'm surprised Karpathy of all people used such a poor quality image for such a post.
I think AI images can be very nice; I like to use them myself. I don't use images I don't personally like very much. So if you don't like them, it is not because of AI; it is because your taste and my taste don't match. Or maybe you would like them if you didn't have a bias against AI. What I love about AI images is that you can often generate very much the thing you want. The only better alternative would be to hire an actual human to do that work, and the difference in price here is huge, of course.
It is like standing in front of a Zara, and wondering why people are in that shop, and not in the Versace shop across town. Surely, if you cannot afford Versace, you rather walk naked?
Normally, I'd just dismiss this as ChatGPT being trained not to disagree, but as a non-native speaker this has me doubting myself: Is prompt really pronounced rompt?
It feels like it can't possibly be true, but on the other hand, I'm probably due to have my confidence in my English completely shattered again by learning another weird word's real pronunciation, so maybe this is it.
I'm a native English speaker, and prompt is pronounced "promt" (/prɒmt/ in my roughly General American accent). I.e., there is a silent "p", but it's the second one, not the first.
I am not a native English speaker either, but I am fairly certain it's pronounced "promt", with the second "p", the one between the "m" and the "t", merging into those sounds to the point of being inaudible itself.
Also, I too asked ChatGPT, and it told me that in the word "prompt" there are no silent P's: all the letters in "prompt" are pronounced, including the P.
You might say this is about Helix being small and trying to break into a crowded market, but OpenAI and Google offered similar contests / offers that asked users to submit ideas for LLM applications. Considering how many LLM sample apps are either totally useless ("Walter the Bavarian, a chatbot who gives trivia about Oktoberfest!") or could be better solved by classical programming ("a GPT that automatically converts currencies to USD!"), it seems AI developers have struggled to find a single marketable use case of LLMs outside of codegen.
At a very minimum I'd say they'll have a way to "chat" with the apps to ask questions / do stuff, either via APIs that the OS calls or integrated in the app via whatever frameworks rise up to handle this. In 5 to 10 years we'll probably see this: at minimum, searching docs and guiding users through the functionality, or doing it straight up when asked.
Basically what chatgpt did for chatbots, but at app level. There are lots of apps that take a long time to master. But the average joe doesn't need to master them. If I want to lightly edit some photos, I know photoshop can do it, but I have no clue where that specific thing is in the menus, because I haven't used it in 10 years. But it would be cool to type in a chat box "take all the pictures from my sd card, adjust the colors, straighten the ones that need it, and put them in my Pictures folder under "trip to the sea". And then I can go do something else for the 30-60 minutes it would have taken me to google how to do all of that, or script something, etc.
The idea of an assistant that can work like that isn't far-fetched today, IMO. The apps need to expose some APIs, and the "OS" needs a language-to-action model capable enough to handle basic stuff for average joes. I'd bet good money Sonnet 3.5 + proper APIs + a bit of fine-tuning could do it today for 50%+ of average user cases.
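A rough sketch of the shape this could take: the app exposes actions as tools, and an OS-level assistant maps the user's request onto calls. The schema below follows the OpenAI-style function-calling convention; the photo-app functions are hypothetical stubs, not any real app's API.

```python
# Sketch: an app exposes actions as "tools"; an LLM maps a natural-language
# request onto calls. The photo-app functions are hypothetical stubs.
import json

def import_photos(source: str, album: str) -> str:
    return f"imported photos from {source} into album '{album}'"  # stub

def auto_adjust(album: str, straighten: bool = True) -> str:
    return f"auto-adjusted colors in '{album}' (straighten={straighten})"  # stub

# OpenAI-style function-calling schema the assistant would be given.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "import_photos",
        "description": "Import photos from a device or folder into an album.",
        "parameters": {
            "type": "object",
            "properties": {
                "source": {"type": "string"},
                "album": {"type": "string"},
            },
            "required": ["source", "album"],
        },
    },
}]  # plus a similar entry for auto_adjust

# The assistant sends the user's request and TOOLS to a model, then
# dispatches whatever tool calls come back:
def dispatch(tool_call: dict) -> str:
    fn = {"import_photos": import_photos, "auto_adjust": auto_adjust}[tool_call["name"]]
    return fn(**json.loads(tool_call["arguments"]))

print(dispatch({"name": "import_photos",
                "arguments": json.dumps({"source": "sd card",
                                         "album": "trip to the sea"})}))
```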
An AI engineer with some experience today can easily pull down 700K-1M TC a year at a bigtech. They must be unaware that the "barriers are coming down fast". In reality it's a full time job to just _keep up with research_. And another full time job to try and do something meaningful with it. So yeah, you can all be AI engineers, but don't expect an easy ride.
I run an ML team in fintech, and am currently hiring. If a résumé came across my desk with this "skill set" I'd laugh my ass off. My job and my team's jobs are extremely stressful because we ship models that impact people's finances. If we mess up, our customers lose their goddamn minds.
Most of the ML candidates I see now are all "working with LLMs". Most of the ML engineers I know in the industry who are actually shipping valuable models, are not.
Cool, you made a chatbot that annoys your users.
Let me know when you've shipped a fraud model that requires four 9's, 100ms latency, with 50,000 calls an hour, 80% recall and 50% precision.
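For anyone wondering what "80% recall and 50% precision" means operationally: you pick a decision threshold on the model's scores that hits the recall target, then read off the precision you actually get there. A toy sketch with scikit-learn, on synthetic data, purely illustrative:

```python
# Choosing a decision threshold that hits a recall target, then reading off
# the precision you actually get there. Synthetic data, purely illustrative.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)                # fraud labels
y_score = y_true * rng.normal(0.7, 0.3, 10_000) + \
          (1 - y_true) * rng.normal(0.3, 0.3, 10_000)   # model scores

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# precision/recall have len(thresholds)+1 entries; align them on thresholds.
meets_recall = recall[:-1] >= 0.80                      # 80% recall target
best = np.argmax(precision[:-1] * meets_recall)         # best precision among those
print(f"threshold={thresholds[best]:.3f} "
      f"recall={recall[best]:.2f} precision={precision[best]:.2f}")
```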
I mean, sure, anyone can cobble together Ollama and a wrapper API and an adjusted system prompt, or go serious with Bumblebee on the BEAM.
But that's akin to web devs of old who stitched up some cruft in Perl or PHP and got their databases wiped by someone entering SQL in a username field. Yes, it kind of works under ideal conditions, but can you fix it when it breaks? Can you hedge against all or most relevant risks?
Probably not. Don't put your toys into production, and don't tell other people you're a professional at it until you know how to fix and hedge, and can be transparent about it with the people giving you money.
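For anyone who hasn't seen the classic failure mode the analogy refers to, a minimal illustration and its fix; prompt injection in LLM wrappers has the same basic shape:

```python
# The classic failure alluded to above: string-built SQL "works under ideal
# conditions" but lets a crafted username dump or wipe the table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

username = "' OR '1'='1"  # attacker-controlled input

# Broken: interpolating input straight into the query.
rows = conn.execute(
    f"SELECT secret FROM users WHERE name = '{username}'"
).fetchall()
print(rows)  # leaks every row

# Fixed: parameterized query; the driver handles escaping.
rows = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (username,)
).fetchall()
print(rows)  # []
```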
After just spending 15 minutes trying to get something useful accomplished, anything useful at all, with the latest Apple Intelligence beta on an M1 iPad Pro (16 GB RAM), this article appealed to me!
I have been running the 32B-parameter qwen2.5-coder model on my 32 GB M2 Mac and it is a huge help with coding.
The llama3.2-vision model does a great job processing screenshots. Small models like smollm2:latest can process a lot of text locally, very fast.
Open source front ends like Open WebUI are improving rapidly.
All the tools are lining up for do-it-yourself local AI.
The only commercial vendor right now that I think is doing a fairly good job at an integrated AI workflow is Google. Last month I had all my email directed to my Gmail account, and the Gemini Advanced web app did a really good job integrating email, calendar, and Google Docs. Job well done. That said, I am back to using ProtonMail and trying to build local AIs for my workflows.
I am writing a book on the topic of local, personal, and private AIs.
I wrote a script to queue and manage running llama vision on all my images, writing the results to a SQLite DB used by my Media Viewer, and now I can do text or vector search on it. It's cool to not have to rely on Apple or Google to index my images and obfuscate how they're doing it from me. Next I'm going to work on a pipeline for more complex things: multiple frames in a video, and multiple passes with llama vision or other models to separate out the OCR, description, and object/people recognition. Eventually I want to feed all of this into https://lowkeyviewer.com/ and have the ability to manually curate the automated classifications and text.
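A stripped-down sketch of that kind of pipeline, using the ollama Python client and SQLite. The model name, prompt, and schema here are illustrative guesses, not the actual script:

```python
# Stripped-down version of the pipeline described above: run a local vision
# model over images and store the descriptions in SQLite for text search.
# Model name, prompt, and schema are illustrative.
import sqlite3
from pathlib import Path

import ollama  # assumes an Ollama server running locally

db = sqlite3.connect("media.db")
db.execute("CREATE TABLE IF NOT EXISTS captions (path TEXT PRIMARY KEY, text TEXT)")

for img in Path("~/Pictures").expanduser().rglob("*.jpg"):
    resp = ollama.chat(
        model="llama3.2-vision",
        messages=[{
            "role": "user",
            "content": "Describe this image literally and objectively.",
            "images": [str(img)],
        }],
    )
    db.execute(
        "INSERT OR REPLACE INTO captions VALUES (?, ?)",
        (str(img), resp["message"]["content"]),
    )
    db.commit()
```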
I'm curious why you find descriptions of images useful for searching. I developed a similar flow and ended up embedding keywords into the image metadata instead. It makes them easily searchable and not tied to any databases, and it is faster (dealing with tens of thousands of images personally).
* https://github.com/jabberjabberjabber/LLavaImageTagger
It's not as good as tags, but it does pretty OK for now, especially since searching for specific text in an image is something I want to do a lot. I'm trying to get llama to output according to a user-defined tagging vocabulary/taxonomy and ideally learn from manual classifications. Kind of a work in progress there.
This is the prompt I've been using.
"Create a structured list of all of the people and things in the image and their main properties. Include a section transcribing any text. Include a section describing if the image is a photo, comic, art, or screenshot. Do not try to interpret, infer, or give subjective opinions. Only give direct, literal, objective descriptions of what you see."
> I'm trying to get llama to output according to a user-defined tagging vocabulary/taxonomy and ideally learn from manual classifications. Kind of a work in progress there.
Good luck with that. The only thing that I found that works is using GBNF to force it, which slows inference down considerably.
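For reference, this is the kind of GBNF forcing meant here, shown with llama-cpp-python. The grammar and tag vocabulary are made up, and the model path is a placeholder; constrained sampling adds per-token overhead, which is where the slowdown comes from:

```python
# GBNF-constrained tagging via llama-cpp-python. The tag vocabulary is
# made up; the model path is a placeholder.
from llama_cpp import Llama, LlamaGrammar

GRAMMAR = r'''
root ::= tag ("," tag)*
tag  ::= "photo" | "screenshot" | "comic" | "art" | "document"
'''

llm = Llama(model_path="model.gguf")        # any local GGUF model
grammar = LlamaGrammar.from_string(GRAMMAR)

out = llm(
    "Classify this caption with tags: 'a scanned receipt from a store'\nTags:",
    grammar=grammar,                         # output must match the grammar
    max_tokens=16,
)
print(out["choices"][0]["text"])             # e.g. "document"
```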
I can't speak to the OP's decision, but I also have a similar script set up that runs a combination of YOLO, BakLLaVA, Tesseract, etc. and puts the output, along with a URI reference to the image file, into a database.
I actually store the data in the EXIF as well, but the nice thing about having a database is that it's significantly faster than attempting to search hundreds of thousands of images across a nested file structure, particularly since I store a great deal of media on a NAS.
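A sketch of both halves: keywords embedded in the file's EXIF (portable, survives moves) plus a SQLite FTS5 index (fast to query, no directory walking). Paths, tags, and schema are illustrative:

```python
# Keywords in EXIF (portable) plus a SQLite FTS5 index (fast to query).
# Paths, tags, and schema are illustrative.
import sqlite3

import piexif  # pip install piexif

def tag_image(path: str, keywords: list[str]) -> None:
    exif = piexif.load(path)
    exif["0th"][piexif.ImageIFD.ImageDescription] = ", ".join(keywords).encode()
    piexif.insert(piexif.dump(exif), path)

db = sqlite3.connect("index.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS imgs USING fts5(path, keywords)")

tag_image("photo.jpg", ["beach", "sunset", "dog"])
db.execute("INSERT INTO imgs VALUES (?, ?)", ("photo.jpg", "beach sunset dog"))
db.commit()

# Full-text query against the index instead of walking the NAS:
for (path,) in db.execute("SELECT path FROM imgs WHERE imgs MATCH 'dog'"):
    print(path)
```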
Another thought: OpenAI has done a good enough job productizing ChatGPT with advanced voice mode and now also integrated web search. I don’t know if I would trust OpenAI with access to my Apple iCloud data, Google data, my private GitHub repositories, etc., but given their history of effective productization, they could be a multi-OS/platform contender.
Still, I would really prefer everything running under my own control.
Who can trust a company whose name contradicts its practice?
Not me. I learned that lesson after I tried to take a bite out of my Apple Macintosh.
I don’t disagree with you!
Can llama 3.2 vision do things like "there's a textbox/form field at location (1000, 800) with label 'address'"?
I did a quick and dirty prototype of this with Claude, but it returned everything with an offset and/or wrongly scaled.
Would be a killer app to be able to auto-fill any form using OCR.
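One workaround for the offset/scaling problem is to ask the model for coordinates normalized to the 0-1 range and rescale them yourself against the real image dimensions. A sketch; the JSON shape is my assumption, not documented model behavior:

```python
# Workaround sketch for the offset/scale problem: request coordinates
# normalized to 0..1 and rescale against the real image size.
# The JSON shape is an assumption, not documented model behavior.
import json

from PIL import Image

PROMPT = (
    "List every form field in this screenshot as JSON: "
    '[{"label": ..., "x": ..., "y": ...}] '
    "with x and y as fractions of image width and height (0 to 1)."
)

def rescale(model_json: str, image_path: str) -> list[dict]:
    width, height = Image.open(image_path).size
    return [
        {"label": f["label"], "x": round(f["x"] * width), "y": round(f["y"] * height)}
        for f in json.loads(model_json)
    ]

# e.g. rescale('[{"label": "address", "x": 0.52, "y": 0.41}]', "form.png")
# -> [{'label': 'address', 'x': 998, 'y': 787}] for a 1920x1920 screenshot
```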
Have you tried RAG on Open WebUI? How does it do at answering questions from source docs?
Not yet. It has ‘Knowledge sources’ that you can set up, and I think that supplies data for built-in RAG, but I am not sure until I try it.
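Under the hood, that kind of knowledge-source RAG reduces to: embed the docs, retrieve the ones closest to the question, and stuff them into the prompt. A bare-bones sketch with the ollama client; model names are examples:

```python
# Bare-bones version of what a RAG "knowledge source" does under the hood:
# embed the docs, retrieve the closest one, stuff it into the prompt.
# Model names are examples; any local embedding + chat model pair works.
import numpy as np

import ollama

docs = ["The warranty covers two years.", "Returns need a receipt."]
doc_vecs = [ollama.embeddings(model="nomic-embed-text", prompt=d)["embedding"]
            for d in docs]

def ask(question: str) -> str:
    q = ollama.embeddings(model="nomic-embed-text", prompt=question)["embedding"]
    sims = [np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))
            for v in doc_vecs]
    context = docs[int(np.argmax(sims))]        # top-1 retrieval
    resp = ollama.chat(model="llama3.2", messages=[{
        "role": "user",
        "content": f"Answer using this context:\n{context}\n\nQuestion: {question}",
    }])
    return resp["message"]["content"]

print(ask("How long is the warranty?"))
```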
Is anyone instantly suspicious when they introduce themselves these days as an "AI Developer"?
I would interpret that as somebody working on ML algorithms and architectures, not somebody developing a product that uses some form of AI at runtime...
I work on such a team, and we don't refer to ourselves as "AI engineers". We're devs working on developing deep learning systems (but not genAI).
correct - but "developing a product that uses some form of AI at runtime" is a job that people do and the community has consolidated around "AI engineer" as shorthand https://www.latent.space/p/ai-engineer
I argue that it's useful to have these shorthands to quickly understand what people do. It's not so nice to be suspicious by default of a perhaps lower-technical-difficulty job that a lot of companies nevertheless want and a lot of people do.
Difficulty ≠ Value
But we think it is, due to our learned industriousness, which is the opposite of learned helplessness. We associate difficulty with reward. And we feel discomfort when really valuable things become very easy. Strange times.
> devs working on developing deep learning systems (but not genAI)
Catchy job title.
I work on such a team too. I don't care what you call me - pay me
I haven't seen this sort of work called AI Developer yet, but I may have missed the trend shift. Isn't the convention for this still to use the title of Machine Learning Engineer (MLE)?
There is another thread here that seems to confirm it's me who has missed a trend shift. I didn't know this term had a special meaning. I just followed the thought that an x-developer is somebody developing x.
Yes, just like "Data Engineer" for knowing how to use Tableau or doing OLAP queries.
Are you saying that true Data Engineers typically do more than just use Tableau or run OLAP queries, or do you see the title 'Data Engineer' itself as a bit of a red flag these days? I’m pretty early in my career and was leaning toward Data Engineering, but hearing stuff like this makes me wonder if going for SWE might be smarter.
The better-paying role is Backend SWE (usually). A really good data engineer is just a backend SWE with specific experience.
A bad one is a SQL analyst who's been promoted too much.
As a data engineer/data architect, agreed. There are a lot of "tools" experts who are data engineers but can't code a lick.
Yep, that's where I am at. The number of times people have told me about the latest podcast where they heard how a new tool will solve all their problems, when really it's some samey mix of CDC and bad Python+SQL, is... a lot.
I think there's not a ton of political palatability in realizing most of their projects are like one API and some sane use of SQL away.
For starters, Engineer only makes sense if the person actually holds an engineering degree, taken at an institution validated by the Engineering Order.
That's a legalism that isn't universal. Personally, I think that anyone who engages in engineering is logically an engineer. Maybe not a certified engineer, but an engineer nonetheless.
"The creative application of scientific principles to design or develop structures, machines, apparatus, or manufacturing processes, or works utilizing them singly or in combination; or to construct or operate the same with full cognizance of their design; or to forecast their behavior under specific operating conditions; all as respects an intended function, economics of operation and safety to life and property"[1]
[1] https://en.wikipedia.org/wiki/Engineering
LOL! A butcher cuts meat and so does a surgeon. Who would you prefer to operate on you?
So is anyone that cooks a Chef.
"Chef" is a specific job title. Anyone who has that job, regardless of qualifications, is a "chef", yes.
Yeah, but can they deliver and run a kitchen like someone who actually has a diploma in culinary arts?
My Kiss The Chef apron says yes.
Having "data engineer" in my title has led to recruiters calling about DBA-type roles. Ugh.
I feel the same way about “UX Designer/Engineer”. Seems to mean someone who can put a wireframe together but has no design chops. Any good designer that has experience crafting user interfaces should be skilled at thinking through in-depth about what the user is doing when using the product and how to successfully guide them through that process.
Eh. Titles don't really mean anything. Do we want to start the age-old software engineer vs. software developer title debate again?
Let someone call themselves whatever they want. If they can do the job they were hired for then... who cares?
Yes, we do; a bootcamp isn't a three-to-five-year degree with a possible professional exam.
To be fair, the market dictates the job titles.
Instant red flag. Like "Scrum master" used to be back in the day.
> back in the day
Well, most are called "project manager" now. But it would still be a giant red flag, just like the project-manager job title or, even worse, "PM", so you don't know exactly what it means.
"AI engineer" is the new "Web developer".
I prefer the term "Scrumlord" /s
There are some funny comments that use that term. This one’s my favorite. https://news.ycombinator.com/item?id=25721837
I’m instantly suspicious when anyone uses the term “AI”.
You would be surprised how many people have no idea what "machine learning" means (not even the technical definition, just the field). I'm working on a PhD in an adjacent field and basically have to tell people I work in AI for them to have any semblance of what I do.
In 2024 that's as crazy as being suspicious of anyone who uses the term “internet”.
I guess your perception depends on how many tech hype cycles you’ve lived through. I’ve lived through three for “AI” alone.
Who cares what it is called? I care about the capabilities.
In my experience, when people say things like this they're just projecting insecurity.
No, it's not crazy at all. Back in the early noughts everyone was associating themselves with the Internet and the WorldWideWeb (yes, they used to spell that out). The same thing is happening today with AI. It is irritating ...
I say “I work in machine learning, what kids these days call AI” ;).
> Is anyone instantly suspicious when they introduce themselves these days as an "AI Developer"?
I'm only suspicious if they don't simultaneously and eagerly show me their Github so that I can see what they've accomplished.
Of the great developers I have worked with in real life, across a large number of projects and workplaces, very few have any Github presence. Most don't even have LinkedIn. They usually don't have any online presence at all: No blog with regular updates. No Twitter presence full of hot takes.
Sometimes this industry is a lot like the "finance" industry: People struggling for credibility talk about it constantly, everywhere. They flex and bloviate and look for surrogates for accomplishments wherever they can be found. Peacocking on github, writing yet another tutorial on what tokens are and how embeddings work, etc.
That obviously doesn't mean in all cases, and there are loads of stellar talents that have a strong online presence. But by itself it is close to meaningless, and my experience is that it is usually a negative indicator.
I think the actual truth is somewhere halfway. Hard to not have any GitHub presence if you use enough open source projects, since you can't even report bugs to many of them without an account.
But if you mostly mean in the sense that they don't have a fancy GitHub profile with a ton of followers... I agree, that does seem to be the case.
LinkedIn on the other hand... I sincerely can't imagine networking on LinkedIn being a fraction as useful as networking... Well, at work. For anyone that has a decent enough resume LinkedIn is basically only useful if you want a gutter trash news feed and to be spammed by dubious job offers 24/7. Could be wrong, but I'm starting to think you should really only stoop to LinkedIn in moments of desperation...
Coding since the '90s: I do not have a GitHub account at all that I ever used for anything. Of the top, say, 10 people I have ever worked with, not a single one has a GitHub account.
There is an entire industry of devs who work on meaningful projects, independently or for an employer, who solve amazing problems and do amazing sh*t, none of which is public or will ever be public.
I have signed more NDAs in my career than I have commits in public repos :)
Dark matter developers
https://www.hanselman.com/blog/dark-matter-developers-the-un...
> I have signed more NDAs in my career than I have commits in public repos :)
Someone like you is extremely experienced and skilled, and has a reputation in your industry. You started working before it was normal and trivial to build stuff in public. Such activities were even frowned upon if I recall correctly (a friend got fired merely for emailing a dll to a friend to debug a crash; another was reprimanded for working on his own projects after hours at the office, even though he never intended to ever sell them).
That you have a reputation means posting work publicly would be of little to no value to your reputation. All the people who need to know you already do or know someone reputable who does.
As a hiring manager, I find LinkedIn profiles a huge window into someone's professional side, just as GitHub may be a huge window into their technical side.
As a candidate, I think LinkedIn is absolute trash, and toxic with all the ghost jobs and dark patterns. It feels like everyone else has the same opinion, or at the very least it is a common one. But when I have 50 candidates to review for 5 open positions, LinkedIn gives me great insight that I cannot really get anywhere else. I keep my profile current; I'm not doing it for LinkedIn, but for the person researching me to see if I am a valid candidate for their role.
LinkedIn remains a reasonable substitute for emailing around PDFs of resumes/CVs.
Any usage beyond that, for "networking" or "social sharing" is terrible.
Eh, I've found LinkedIn moderately useful as a way to apply for jobs and share my CV. A few application forms have a field for it too.
Still have no idea how anyone can even hope to use the news feed there though. It's just a seemingly random torrent of garbage 24/7, with the odds of you getting any traction being virtually non existent.
<< I sincerely can't imagine networking on LinkedIn being a fraction as useful as networking.
I removed my profile some time ago. While I agree with you in general, I have to say that initially LinkedIn was actually not bad and I did get some value out of it. It is a little harder now not to have it, because interviewers in initial screens scoff at its absence (it apparently sends a negative signal to HR people), but an established network of colleagues can overcome that.
I guess it is a little easier for people already established in their career. I still am kinda debating some sort of replacement for it, but I am honestly starting to wonder if github is just enough to stop HR from complaining.
If someone has to tell you, either themselves or by proxy, that they are influential in the industry ... they are not.
Well said! Have you seen the titles and summary of those LinkedIn profiles? Not an ounce of humility. I'm afraid it's only going to get worse with "AI".
I’ve found this to be especially true of the great business minded devs that I’ve come across. They’re not giving you crap for free. Pardon my French.
It's spelled crêpe
> Of the great developers I have worked with in real life, across a large number of projects and workplaces
Dear God, let's not pretend to be obtuse or facile. Saying "GitHub" is just a way of saying: "Show me what you've accomplished."
If someone has no accomplishments anywhere to show then someone else might find them credible but I don't, as a heuristic.
It's really not that difficult to show what you've accomplished if you claim to be in a field.
> It's really not that difficult to show what you've accomplished if you claim to be in a field.
Actually it is incredibly difficult, because you no longer have access to your previous employers' code bases and even if you do, it is illegal for you to show it to anyone.
> Actually it is incredibly difficult, because you no longer have access to your previous employers' code bases.
So the person never does anything outside of his employer's IP? That's unfortunate, but as a heuristic, I'd like to see stuff that the person has done if they claim to be in a field.
Perhaps other people don't care, and will be convinced by expertise without evidence, but I'm busy, and my life is hard enough already: show me your code or gtfo. :-)
It takes a special someone to work 40-50 hrs per week, writing hard creative software, then go home and write hard creative software in a different domain, while also balancing family/life.
Also, unless you are in CA, many companies have extensive IP assignment clauses, which makes moonlighting on other projects potentially questionable (especially if they are assholes).
My previous job made it hard to even submit bugs/fixes to open source projects we used internally. Often we just forked b/c bureaucracy (there's a reason it was my previous job)
Not saying you're wrong; seeing someone's code is nice. As long as you are aware that you are leaving a lot on the table by excluding those who do not have a presence. (Particularly older devs with kids.)
Newsflash: the majority of working, paid developers do not do any programming outside of their employer's IP.
Someone who worked on successful projects that shipped and are still out there, they can point you to that. You can buy the app, or device with their embedded code, or use the website or whatever. Not always an option for everyone, or not all the time.
That's one reason why there are skill tests in interviews. And why people ask for, and contact, references.
Public code can't be trusted. If you make that the yardstick for hiring people, then everyone and his dog will spin up some phony github repo with stuff that can't be confirmed to have been written by them. Goodhart's Law and all that.
You have no idea how much help someone had with the code in their github repo, or to what extent it is cribbed from somewhere else. Enter AI into the picture now, too.
When assessing a candidate that didn't come with a reliable recommendation or similar short circuiting I spend a short time chatting to learn a little about their personality, then I ask for code, suggesting they show a public repo where they keep some of their personal stuff.
If they can't I give the option to write some code that they like and they think shows off what they can do, usually suggesting to spend half an hour or a couple of hours on it.
To me it's an obvious red flag if there is nothing. It's as if talking to a graphics designer or photographer and they're like "no, sorry, I can't show a portfolio, I've only done secretive proprietary work and have never done anything for fun or practice".
Those that show me something get to talk about it with me. A casual chat about style, function, possible improvements and so on. Usually when they have nothing to show they also don't put in that half an hour to make something up, or get back with excuses about lack of inspiration or whatever, and that's where I say "thanks, but no thanks, good luck".
If you can't easily init a git repo, whip something up, and send it to me in half an hour, you won't be a good fit. It means you aren't fluent and lack experience. I might consider an internship, or a similar position where you mainly do something else; perhaps you're good at managing Linux servers or something but want to learn how to design programs and develop software as well.
> Newsflash
Thanks for the newsflash! It's always great to hear that snarky boomer anachronism. Do you remember how you'd be watching cartoons then suddenly a dramatic interruption, a banner flying across the screen, clarions sounding, and bam! Newsflash! Upon which your parents tell you to hush because "this is important!".
> the majority of working, paid developers do not do any programming outside of their employer's IP.
Dude. Explain that to your recruiter. I'm just a random and we're hanging out talking about transformers then a guy walks into the bar and proclaims, "I am an AI engineer!" Just show me some code already or what are we even talking about when you say you're an AI engineer?
This isn't a moral issue. I'm glad that you (and allegedly majority of developers) are employed and never do anything outside of work. But for the purposes of a conversation, what are you talking about in a field if you have nothing at all to show in public or to your peer group?
> Public code can't be trusted. […] You have no idea how much help someone had with the code in their github repo, or to what extent it is cribbed from somewhere else. Enter AI into the picture now, too.
Code either compiles and runs or it doesn't. The author can either explain it or doesn't understand it.
That's why we would have a conversation: For instance, if I'm hiring (or if I'm being hired), or we're just having a semi-serious conversation about technical stuff, we typically walk each other through something difficult that we solved. We discuss the code we've written and show it. We talk about stuff and that's how we figure out how much we both know about the topic.
Why is this even remotely controversial? ;-) Do you guys never talk to anyone about what you do or what you know? Do you never help friends or colleagues or even acquaintances discuss technical problems at their companies or startups or with their freelance projects? Do you never just discuss technical stuff after having a meal with friends, when you open a laptop and read code? Do you even never write code with friends for fun or to figure out new stuff about novel problems?
> That's one reason why there are skill tests in interviews. And why people ask for, and contact, references.
Thanks for explaining to me how recruiting and employment works. Thankfully I'm an adult who has existed on the planet for some time. I imagine you assume I've never filed a reference letter? But I've also seen reference letters being gamed just like every other reputation system so ...
Anyway, out of curiosity, if you've literally never done anything other than what your employer gave you, how did you even become a programmer? You left college or high-school and walked straight into a job then learned to code there, or what?
> The author can either explain it or doesn't understand it.
I've never been challenged to explain any of the code my CV points to. I could have made it all up. If they randomly asked about something I have not looked at in a long while, it could actually look like I don't know it! There is so much of it that I would have to study the code as a full-time activity to be able to fluently spout off about any random sample of it.
I think I'm going to rearrange my resume to move the free software stuff farther down and maybe shorten it. It could come across as a negative.
Some hiring people genuinely don't understand why someone would put in a 40-hour week and then do some more on evenings and weekends. Well, I don't know how many; in my career I've heard something along those lines from maybe two. Not everyone will tell you.
> You left college or high-school and walked straight into a job then learned to code there, or what?
It doesn't describe me, but it does describe almost everyone I've ever known in the field (other than distant strangers met through online free-software-related channels, who are an obviously biased sample).
> If they randomly asked about something I have not looked at in a long a while, it could actually look like I don't know it!
Typically the interviewer asks: "Tell me about something you worked on in this list of stuff you provided."
An interview isn't designed to trick you into failing random questions. It's to find out what you care about. You choose what to talk about. :-)
At least, that's how I engage in conversations. I want you to decide what you want to talk about so that I can get to know you.
If you hire an accountant, do you expect to see the books of their other clients? When you choose a doctor, do you expect to see the charts of their prior patients?
And frankly, when you hire a manager or executive, there's not generally a single artifact that you could use to examine their value-add. You can see perhaps the trajectory of a company or a division or the size of their team over time, but you can't see the pile of ongoing decisions and conversations that produce actual value.
I think the flip side regarding code is that the stuff I do for fun outside of my employer's IP is not at all representative of what I do at work, or how. I pick topics that interest me, work in languages my company doesn't use, etc., and because my purpose is learning and exploration, often it doesn't end up as a finished, working, valuable piece of tech. I deliberately don't do anything too close to my actual work, both b/c that just feels like working longer and b/c I'm concerned it would make ownership of the code a bit fuzzy, and perhaps make it inappropriate to consider open sourcing.

Because my side projects are eclectic and didactic, I rarely put them in a public repo -- but they have served their purpose of teaching me something. If I shared all of my code side projects, they would show an unfocused person dabbling in a range of areas and not shipping anything useful, because that's what's fun ... whereas at work I am focused on a few quite narrow areas and on customer-facing features, because the point is to build what the company needs rather than what I enjoy building.
If I'm shopping for an accountant I will present them with one or two cases and see how they would reason about them. It's not as easy to do with a physician.
The main difference between those professions and people who build software for a living is that they have organisations that predate modernity that keep tabs on their members and kick them out if they misbehave.
We should absolutely build such organisations, but there will be intense confrontations with industry and academia when we try, because capitalists hate when other people unionise and academics think they're the best at evaluating peers.
It's fine that your personal projects aren't polished products. They show your interests and some of how you adapt to and solve problems. It's something you've chosen to do because you wanted to, and not something you did because you were pressured by profit. The everyday grind at work wouldn't show what you'd do under more exceptional circumstances, which is where your personal character and abilities might actually matter, but what you do for fun or personal development likely does.
> If you hire an accountant, do you expect to see the books of their other clients? When you choose a doctor, do you expect to see the charts of their prior patients?
I'm genuinely sorry for you if you think a programmer is analogous to an accountant or doctor. Presumably you are one of those people who think that a Software Engineer is as much an engineer as a Civil Engineer?
Anyway, let me not speculate as to your state of mind. You wrote a lot of words. What exactly are you even trying to defend or attack or justify? Is the number of your words commensurate to the informational content of what you think you're explaining? Can you explain your point in 1 sentence? Or perhaps I should just ask an LLM to summarize it. smh.
Even if an ML/AI/software engineer has a public GH with projects on it, there's no strong reason to expect it will be a useful signal about their expertise.
> Even if an ML/AI/software engineer has a public GH with projects on it, there's no strong reason to expect it will be a useful signal about their expertise.
That's only true if you don't know how to read code. I simply read their code and, based on my level of expertise, I can determine if someone is at least at my level or if they are incompetent.
>Saying "GitHub" is just a way of saying: "Show me what you've accomplished."
Do you actually think all development happens in public GitHub repos? Do you even think a majority does? Even a strong minority?
Across a number of enormous, well-known projects I've worked on, covering many thousands of contributors, including several very well known names, 0% of it exists in public Github repos. The overwhelming bulk of development is happening in the shadows.
If your "field" is "open source software", then sure. But if you're confused into thinking Github -- at least the tiny fraction that you can see -- is "the field" of software development or even just generally providing solutions, I can understand your weird belief about this.
> Do you actually think all development happens in public GitHub repos? Do you even think a majority does? Even a strong minority?
Are you actually able to read what I wrote? I said "GitHub" is shorthand. I'm happy to read your papers on arxiv or elsewhere, for instance.
Try not to be too horny to reply in opposition and instead just read what another human is saying first. Otherwise your replies in high dudgeon are a waste of my time and don't teach me anything new.
This is exactly what we're talking about here and you are proving our point.
"after years of working in DevOps, MLOps, and now GenAI"
You truly know how to align yourself with hype cycles?
They missed out on drones and blockchain.
... and Cloud! Don't forget The Klaaooud ...
Well, resume driven development does work, it seems.
I don't want to be an "AI engineer" in the way the article means. There's nothing about that sort of job that I find interesting or exciting.
I hope there will still be room for devs in the future.
Is that really AI engineering or Software engineering with AI?
If a model goes sideways, how do you fix that? Could you find and fix flaws in the base model?
Agree that the use of "AI engineers" is confusing. I think this blog should use the term "engineering software with AI integration", which is different from "AI engineering" (creating/designing AI models) and different from "engineering with AI" (using AI to assist in engineering).
The term AI engineer is now pretty well recognised in the field (https://www.latent.space/p/ai-engineer), and is very much not the same as an AI researcher (who would be involved in training and building new models). I'd expect an AI engineer to be primarily a software developer, but with an excellent understanding of how to implement, use, and evaluate LLMs in a production environment, including skills like evaluation and fine-tuning. This is not some skillset you can just bundle into "software developer".
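To make the evaluation part concrete, here's a toy sketch of the simplest possible eval harness. Everything in it (the cases, the ask() stub) is hypothetical; real evals use held-out sets, graded rubrics, or LLM-as-judge setups, but the shape is the same:

    # Toy eval harness: score model answers by exact-substring match.
    # The cases and the ask() stub are stand-ins, not a real model call.
    CASES = [
        ("What port does HTTPS use by default?", "443"),
        ("What does DNS stand for?", "Domain Name System"),
    ]

    def ask(question: str) -> str:
        # stand-in for a call to the model under test
        return "Port 443." if "HTTPS" in question else "Domain Name System."

    passed = sum(expected.lower() in ask(q).lower() for q, expected in CASES)
    print(f"{passed}/{len(CASES)} eval cases passed")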
You find issues when they surface during your actual use case (and by "smoke testing" around your real-world use case). You can often "fix" issues in the base model with additional training (supervised fine-tuning, reinforcement learning w/ DPO, etc).
There's a lot of tooling out there making this accessible to someone with a solid full-stack engineering background.
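For example, a minimal supervised fine-tuning run with Hugging Face TRL looks roughly like the sketch below. The model name and the corrections.jsonl file are stand-ins; the point is just how little scaffolding the tooling now requires:

    # Minimal sketch of "fixing" base-model behaviour with supervised
    # fine-tuning via Hugging Face TRL. corrections.jsonl is assumed to hold
    # one {"text": "..."} example per line showing the desired behaviour.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("json", data_files="corrections.jsonl", split="train")

    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-0.5B",  # small base model, chosen for the sketch
        train_dataset=dataset,
        args=SFTConfig(output_dir="sft-out", num_train_epochs=1),
    )
    trainer.train()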
Training an LLM from scratch is a different beast, but that knowledge honestly isn't too practical for everyday engineers: even if you had it, you wouldn't necessarily have the resources to train a competitive model. Of course you could command a high salary working for the orgs who do have those resources! One caveat is that there are orgs doing serious post-training, even with unsupervised techniques, to take a base model and reeaaaaaally bake in domain-specific knowledge/context. Honestly, I wonder if even that is all that inaccessible to pull off. You get a lot of wiggle room and margin for error when post-training a well-built base model because of transfer learning.
I wonder if either could really be called engineering.
I feel like I see this comment fairly often these days, but nonetheless, perhaps we need to keep making it: the AI-generated image there is so poor, and so off-putting. Does anyone like them? I am turned off whenever I see someone has used one on a post, with very few exceptions.
Is it just me? Why are people using them? I feel like objectively they look like fake garbage, but obviously that must be my subjective biases, because people keep using them.
Some people have no taste, and lack the mental tools to recognize the flaws and shortcomings of GAN output. People who enthuse about the astoundingly enthralling literary skills of LLMs tend to be the kind of person who hasn't read many books. These are sad cases: an undeveloped palate confusing green food coloring and xylitol for a bite of an apple.
Some people can recognize these shortcomings and simply don't care. They are fundamentally nihilists for whom quantity itself is the only important quality.
Either way, these hero images are a convenient cue to stop reading: nothing of value will be found below.
> these hero images are a convenient cue to stop reading.
That's fair if you don't like such content, but I would say don't judge a book by its cover.
"GenAI" slop is a time-vampire.
In a world where anyone can ask an LLM to gish-gallop a plausible facsimile of whatever argument they want in seconds, it is simply untenable to give any piece of writing you stumble upon online the benefit of the doubt; you will drown in counterfeit prose.
The faintest hint that the author of a piece is a "GenAI" enthusiast (which in this case is already clear from the title) is immediate grounds for dismissing it; "the cover" clearly communicates the quality one might expect in the book. Using slop for hero images tells me that the author doesn't respect my time.
I don't find the image poor, but somehow I see immediately that it is generated because of the style. And that simply triggers the 'fake' flag in the back of my head, which has this bad subjective connotation. But objectively I believe it is a very nice picture.
Edit: I looked at it more closely, and the way the people are sitting, their direction, is totally unnatural.
That picture doesn't even have anything to do with the contents of the post either.
Reminds me of the image attached to the Twitter post by Karpathy (one of the founding members of OpenAI) on founding an education AI lab:
https://x.com/karpathy/status/1813263734707790301
I just don't understand how he didn't take 10 seconds to review the image before attaching it. If the image is emblematic of the power of AI, I wouldn't have a lot of faith in the aforementioned company.
If you're going to use GenAI (stable diffusion, flux) to generate an image, at least take the time to learn some basic photobashing skills, inpainting, etc.
You aren't exaggerating! There are some creepy arms in that image, along with the other weirdness. I'm surprised Karpathy of all people used such a poor quality image for such a post.
Yeah, wow. And it's a lovely post otherwise, too!
An insidious feature of machine generated content is that it will dominate not because it’s better, but because it’s cheaper.
You mean to tell me this doesn't convincingly look like people working on their laptops on trestle tables in the forest at dusk?
Last time I worked on my laptop on a trestle table in the forest at dusk it looked almost exactly like this.
I think AI images can be very nice, and I like to use them myself. I don't use images I don't personally like very much. So if you don't like them, it is not because of AI; it is because your taste and my taste don't match. Or maybe you would like them if you didn't have a bias against AI. What I love about AI images is that you can often generate very much the thing you want. The only better alternative would be to hire an actual human to do that work, and the difference in price there is huge, of course.
It is like standing in front of a Zara and wondering why people are in that shop, and not in the Versace shop across town. Surely, if you cannot afford Versace, you'd rather walk naked?
Does AI engineer == API Engineer?
The P is silent
If the P is silent, how can you "Prompt"?
Rompt. Means "to break" in French:
https://en.wiktionary.org/wiki/rompt
Remove both the P's and you get rømt, which is Norwegian for sour cream.
Insert your own joke here about taking the Ps, it's bound to be better than what I'll come up with.
How many silent P's are in the word "prompt"? Let me ask ChatGPT...
Just for fun, I did ask ChatGPT:
There’s one silent "p" in the word "prompt"—right at the beginning! The “p” isn’t pronounced, but it sneaks into the word anyway.
Normally, I'd just dismiss this as ChatGPT being trained not to disagree, but as a non-native speaker this has me doubting myself: Is prompt really pronounced rompt?
It feels like it can't possibly be true, but on the other hand, I'm probably due to have my confidence in my English completely shattered again by learning another weird word's real pronunciation, so maybe this is it.
I'm a native English speaker, and prompt is pronounced "promt" (/prɒmt/ in my roughly General American accent). I.e., there is a silent "p", but it's the second one, not the first.
There's no silent p in prompt. https://en.wiktionary.org/wiki/prompt
I am not a native English speaker either, but I am fairly certain it's pronounced "promt", with the second "p", the one between the "m" and the "t", merging into those sounds to the point of being inaudible itself.
Also, I too asked ChatGPT, and it told me that in the word "prompt" there are no silent P's; all the letters in "prompt" are pronounced, including the P.
as it is in "swimming"
The only remaining question being, why would you want to?
Funny enough, Helix doesn't know either! They put together a contest hoping that you'll figure it out: https://blog.helix.ml/p/llm-app-challenge-with-helix-10
You might say this is about Helix being small and trying to break into a crowded market, but OpenAI and Google offered similar contests / offers that asked users to submit ideas for LLM applications. Considering how many LLM sample apps are either totally useless ("Walter the Bavarian, a chatbot who gives trivia about Oktoberfest!") or could be better solved by classical programming ("a GPT that automatically converts currencies to USD!"), it seems AI developers have struggled to find a single marketable use case of LLMs outside of codegen.
Codegen, contentgen, and driving existing products in human language (which you can largely bucket into interactive codegen)
Soon enough we'll have AI that is just integrated into the OS.
So individual apps don't need to do anything to have AI.
What does it mean for an app to "have AI"?
At a very minimum I'd say they'll have a way to "chat" with the apps to ask questions / do stuff, either via APIs that the OS calls or integrated in the app via whatever frameworks arise to handle this. In 5 to 10 years we'll probably see this. At a very minimum: searching docs and "guiding" the users through the functionality, or doing it straight up when asked.
Basically what ChatGPT did for chatbots, but at the app level. There are lots of apps that take a long time to master. But the average joe doesn't need to master them. If I want to lightly edit some photos, I know Photoshop can do it, but I have no clue where that specific thing is in the menus, because I haven't used it in 10 years. But it would be cool to type in a chat box "take all the pictures from my sd card, adjust the colors, straighten the ones that need it, and put them in my Pictures folder under 'trip to the sea'". And then I can go do something else for the 30-60 minutes it would have taken me to google how to do all of that, or script something, etc.
The idea of an assistant that can work like that isn't that far-fetched today, IMO. The apps need to expose some APIs, and the "OS" needs a language -> action model capable enough to handle basic stuff for average joes. I'd bet good money Sonnet 3.5 + proper APIs + a bit of fine-tuning could do it today for 50%+ of average user cases.
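As a sketch of what "expose some APIs" might mean in practice, here's one hypothetical app action written as a JSON-schema tool definition, the format OpenAI-style chat APIs already use for function calling. All the names here are invented for illustration:

    # Hypothetical: a photo app exposing one action to an OS-level assistant
    # as a JSON-schema "tool" the language model can choose to call.
    import json

    import_photos_tool = {
        "type": "function",
        "function": {
            "name": "import_and_adjust_photos",  # invented action name
            "description": "Import photos from a device, auto-adjust color "
                           "and rotation, and save them to a named album.",
            "parameters": {
                "type": "object",
                "properties": {
                    "source": {"type": "string", "description": "e.g. 'sd_card'"},
                    "auto_color": {"type": "boolean"},
                    "auto_straighten": {"type": "boolean"},
                    "album": {"type": "string", "description": "e.g. 'trip to the sea'"},
                },
                "required": ["source", "album"],
            },
        },
    }
    print(json.dumps(import_photos_tool, indent=2))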
If it makes any kind of decision whatsoever (like an "if" statement), slap the word AI on it.
Think of it as another human having access to the keyboard, mouse and the screen buffer.
Assuming an engineering degree to start with.
I still don't see how AI replaces the understanding of what a server is, what DNS is, what HTTP is, or what...
I could go on and on.
Copy-paste is great until you literally don't know where you are copying and pasting.
An AI engineer with some experience today can easily pull down 700K-1M TC a year at a bigtech. They must be unaware that the "barriers are coming down fast". In reality it's a full time job to just _keep up with research_. And another full time job to try and do something meaningful with it. So yeah, you can all be AI engineers, but don't expect an easy ride.
I run an ML team in fintech, and am currently hiring. If a résumé came across my desk with this "skill set" I'd laugh my ass off. My job and my team's jobs are extremely stressful because we ship models that impact people's finances. If we mess up, our customers lose their goddamn minds.
Most of the ML candidates I see now are all "working with LLMs". Most of the ML engineers I know in the industry who are actually shipping valuable models, are not.
Cool, you made a chatbot that annoys your users.
Let me know when you've shipped a fraud model that requires four 9's, 100ms latency, with 50,000 calls an hour, 80% recall and 50% precision.
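To make that bar concrete, here's a toy recall/precision check at a fixed decision threshold. The labels and scores are synthetic stand-ins, not real fraud data:

    # Toy check of a fraud model against the recall/precision bar above,
    # using synthetic labels and scores in place of real data.
    import numpy as np
    from sklearn.metrics import precision_score, recall_score

    rng = np.random.default_rng(0)
    y_true = rng.integers(0, 2, 10_000)                       # stand-in fraud labels
    y_score = np.clip(0.35 * y_true + 0.7 * rng.random(10_000), 0, 1)

    y_pred = (y_score >= 0.5).astype(int)                     # decision threshold
    print(f"recall:    {recall_score(y_true, y_pred):.2f}")   # bar above: >= 0.80
    print(f"precision: {precision_score(y_true, y_pred):.2f}")  # bar above: >= 0.50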
I mean, sure, anyone can cobble together Ollama and a wrapper API and an adjusted system prompt, or go serious with Bumblebee on the BEAM.
But that's akin to the web devs of old who stitched up some cruft in Perl or PHP and got their databases wiped by someone entering SQL in a username field. Yes, it kind of works under ideal conditions, but can you fix it when it breaks? Can you hedge against all or most relevant risks?
Probably not. Don't put your toys into production, and don't tell other people you're a professional at it until you know how to fix and hedge and can be transparent about it with the people giving you money.
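For calibration, the "cobble together Ollama and a wrapper" part really is this small. A minimal sketch, assuming a local Ollama server with an already-pulled model (the model name and prompts are placeholders):

    # Minimal "Ollama + adjusted system prompt" wrapper, using only the
    # standard library and Ollama's local /api/chat endpoint.
    import json
    import urllib.request

    payload = {
        "model": "qwen2.5-coder",  # placeholder: any locally pulled model
        "stream": False,
        "messages": [
            {"role": "system", "content": "You are a terse coding assistant."},
            {"role": "user", "content": "What is a DNS A record?"},
        ],
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["message"]["content"])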
We can all be janitors too, so what?
Just...boring.
Looks interesting, I'll have to check it out.
Thank you for letting us know. I was wondering if you found it interesting and would check it out.