LLMs consistently pick resumes they generate over ones by humans or other models

(arxiv.org)

254 points | by laurex 2 hours ago

105 comments

Anecdata, sample size of one:

When I was looking for my next role after being laid off, I didn’t get much of a response with my human handmade resume despite my experience

Just for kicks, I asked ChatGPT to “Analyze my resume and give it a score for what percentage it was in” then I asked it to revise it to make it score as high as possible

I still tweaked and fact checked it but after I started sending that out, I got a much higher hit rate than before

But who knows, maybe the market changed, was a better time of year, etc

I still had to pass interviews and prove my worth. But it probably helped me get my foot in the door

[-]

leonidasv an hour ago

Same thing happened to my wife as well. I helped her tailor her LinkedIn profile and resume with a lot of attention to detail: adding metrics, keywords, results, etc. Nevertheless, she never received any outreach from recruiters and got very few application responses. It went like that for months, almost a year.

Then she asked ChatGPT 5.x for help. I was skeptical about the changes it recommended (and skeptical in general about using AI for this, given the homogenization it tends to produce). But somehow it worked: a few days later, a recruiter reached out, then another, then applications started moving forward, etc.

My guess is that, as LLMs are shoveled into every phase of the recruiting process, not having an LLM write your resume for you is now playing on hard mode. The LLMs reviewing resumes are downranking resumes and profiles that are not "speaking" the same language and activating the correct neurons, thus preventing you from moving forward. This contrasts with years ago when we had more humans in the loop and the pasteurised writing of GPT 3.5/4o would make you look less worthy. Again, just a theory, but...

[-]

andsoitis 14 minutes ago

> I helped her tailor her LinkedIn profile and resume with a lot of attention to detail: adding metrics, keywords, results, etc.

FWIW, when I see a resume with metrics and keywords, I immediately filter it out.

[-]

mikeyouse 9 minutes ago

Which is a very “HN” sentiment when the vast majority of recruiters and hiring managers are absolutely not doing the same. Especially for roles outside of tech.

[-]

andsoitis 7 minutes ago

Yeah I don’t know what others are doing, but I work in the valley and those elements signal checklist mentality. To wit, those keyword lists often include, in my experience, proficiency in specific tool use, rather than communicating skills that transcend tools, which tells me the person is likely not very dynamic or creative.

hiAndrewQuinn 7 minutes ago

What counts as a keyword here? If you're hiring for a frontend developer and you see e.g. "Redux" do you just can it?

[-]

andsoitis 4 minutes ago

Knowing or having experience with Redux isn’t going to make me pick you over someone else who doesn’t list it for a job where I’m paying you hundreds of thousands of dollars. I look at other skills.

I would not can it in isolation, but if I see a comma-separated list like: “proficient in redux, react, html, JavaScript, sql, kubernetes, word and excel”… then yes, you don’t make the cut.

tkiolp4 3 minutes ago

Metrics: I increased retention 2x; I reduced latency from X ms to Y ms; increased SLO to 99.999… those are all meaningless. It was in fashion to put such numbers in CVs maybe 5-10 years ago. Not anymore

reillyse 10 minutes ago

You must be a pre 5.0 model so.

j45 31 minutes ago

Having implemented more than a few applicant tracking systems, too many are so anchored in the past, that they would probably try to boil the ocean at once by letting AI loose on it, leaving an ability for ai resumes to ai applicant tracking systems.

The key insight here is humans are responsible for improved articulation to the ai, who in turn will improve the rest, and that can be as detailed and informative, and educational as the human likes.

newsy-combi 32 minutes ago

Kafkaesque

davebren 8 minutes ago

Probably gonna get downvoted for this, but when you give an anecdote you don't have to preface it with "anecdata, n=1 sample size".

We know it's from your individual experience because it's a story about your individual experience. We've been doing this for all of human history. This is some kind of strange milieu of trying to always sound scientific, or it's fear of the "well akshually I'm gonna need to see a random placebo controlled trial", which is equally annoying.

fuzzy_biscuit an hour ago

I've done as you described and then edited it down to sound human again.

Esophagus4 an hour ago

There are services that will do this as well - I’ve used them both on my LinkedIn and resume with decent success.

amelius an hour ago

I suppose the HR folks gave you a "+1 knows how to use AI".

[-]

grey-area an hour ago

It seems more likely the HR people depend on LLMs to do the job of screening and LLMs unsurprisingly prefer LLM output and rank it highly.

It’s not lazy incompetence, it’s quietly getting the job done with 1% of the effort (that was a sarcastic pastiche, in case anyone was unsure).

[-]

zdragnar 28 minutes ago

It's not uncommon to get hundreds or thousands of applications per opening for web tech, if the position is advertised on LinkedIn or a similar job board.

They'd need to use some automation, even if it is just picking ten at random.

ben_w an hour ago

Some will, others openly say on the job ad they will fail you for using AI.

[-]

izacus an hour ago

And then still use a CV scanning service that rejects non-AI resumes.

dawnerd an hour ago

I know if I got a resume from someone that had obviously used AI to generate it, it would be a pass.

[-]

drillsteps5 44 minutes ago

Before the resume ends up in the hiring manager's inbox it needs to be picked by the recruiter from literally hundreds of others. The recruiter uses HR software to determine the match (usually the percentage), and then picks top 5% or top 20 or whatever highest ranked resumes.

Guess what's doing the ranking.

bell-cot an hour ago

What if your own HR's LLM didn't send you any other kind?

tayo42 31 minutes ago

LLMs were good for being objective and helping cut out stuff from mine. Harder to do when you personally think everything you ever did is important.

[-]

davebren 3 minutes ago

It actually is important and if I was hiring you I'd find it useful to get a more comprehensive understanding of your experience, especially if there's something I'm aware is a very challenging problem to solve. And it would provide more things to cross-examine in interviews to make sure it's not fake. The idea that people hiring are saving time by not reading an extra resume page when deciding on someone that will hopefully work there for years is ridiculous.

For some reason that's the minority opinion because everything has to be dumbed down now.

hyperpape an hour ago

I'll copy what I wrote on LinkedIn (note: I read roughly 25 pages, which is half the paper, and read it quickly)[0]:

"If I read the paper correctly, they don’t actually show that LLMs prefer resumes they generate.

Their actual method seems to be taking a human written resume, deleting the executive summary, having an LLM rewrite the executive summary based on the rest of the resume and then having another LLM rate the executive summary without the rest of the resume.

That’s likely to massively overstate any real impact, if you can even rely on it capturing a real effect.

I really wonder if I read that correctly, because I can’t come up with a justification for that study design."

[0] I couldn't help but mildly copy-edit before pasting here.

[-]

b112 34 minutes ago

Could be an ad for 'use LLMs more'. A generic ad like this helps all in the market, but if you own 30% of LLM market share, it still helps you 30% of the time.

Now that I think of it, every other industry has an 'advocacy group', whether cheese, oil, or nutmeg. So surely there is now some sort of LLM 'consortium', and group funding studies like this just fuels the FOMO. You can be sure such groups exist, and are pummeling every government in the world thusly. But I bet they're also looking here.

After all, it's a circle. Uh-oh! HR is using LLMs, you'd better too potential employee! Then later? Uh-oh! The best employees you can hire are using LLMs, you'd better too HR!

They already FOMOed us into basically everything else, why not LLMs too?

ivansmf 7 minutes ago

I suspect the entire industry uses "auto-raters", where an agent instance is used to score the agent's output. The idea is similar in intent to using adversarial networks to train image generation, minus the human labelers. Raising the scores of the auto-rater then becomes the metric teams optimize, and it is no wonder the end result is that the agent scores its own generated content the highest.

aykutseker 3 minutes ago

The uncomfortable part is that this is probably rational behavior for both sides.

Employers use models to filter resumes, candidates optimize resumes for those models, and suddenly the resume is no longer written for a human at all.

bendergarcia an hour ago

We are without our consent introducing a party in between people. The models become the arbiters of who does and does not get a job. It feels problematic.

[-]

justonceokay 14 minutes ago

There will be a great arbitrage for people who do not use LLMs.

If your HR department is using ChatGPT to filter resumes, you’ll end up with people who used ChatGPT to generate resumes. I don’t want to make a “slippery slope“ argument, but my gut feeling is that the quality of your organization will deteriorate quickly.

On the other hand, I am a handyman/subcontractor. Almost all of my work comes through phone calls, texts, and one-off emails. I only work with people that are recommended by trusted sources. I haven’t handled a traditional resume (mine or other people’s) in over eight years.

If I started interacting with somebody and they seemed like they were a computer, that would be the fastest way for me to know I should move on to another client. If they can’t take the time to interact with me, how am I supposed to perform hundreds of hours of physical labor for them?

bendergarcia an hour ago

And I feel the common response of: well just use the model that’s available. Ai is and will probably always be resource constrained and profit driven, that means we will eventually see a world where poor people have worse resumes than rich people and there really won’t be any way around it because the man in the middle has the final say

[-]

adrianN 32 minutes ago

Not too long ago I bet resumes that were printed from a computer were preferred to resumes typed on a typewriter. What happened was that computers became commodities. It is reasonable to assume that LLMs will become commodified too.

[-]

YurgenJurgensen a minute ago

That would hardly be surprising. Monospaced fonts make natural language a pain to read, so what that would prove is that well-presented resumes are preferred to poorly-presented ones.

This case is different, as the LLM output isn’t measurably better than the human output (unless you have a particular love of bland corpo-speak).

falcor84 18 minutes ago

The ship sailed as soon as hiring managers stopped reading CVs directly and we got recruiters as a profession.

ekianjo an hour ago

before it used to be HR, so you always had a party in between "actual" people. HR (mostly) never cared about the CV, they just look at a checklist and see if it matches.

sneak an hour ago

We already did that when we all created LinkedIn accounts.

sxg an hour ago

Take a look at how things worked before (and still do): employers decide who gets jobs based on a combination of personal biases, nepotism, and ulterior motives while applicants present distorted versions of themselves and network/pull strings to put the odds in their favor. That seems more problematic.

benashford an hour ago

Intuitively this feels obvious. Content generated by the model will be shaped by its training, therefore when reading it back it will resonate with that same training and have a positive view as a result.

Human when preparing a CV: "Make my CV more professional"

LLM many days later presenting a report to HR: "This CV is really professional"

There's probably more to it than that of course.

But it justifies my personal policy of using a different LLM family for code review tasks than for code generation tasks. To avoid the "marking your own homework" problem.

[-]

gzread 12 minutes ago

And not in human-interpretable ways. An LLM was told to behave in a certain way and then output random numbers. When the numbers were pasted to another LLM instance, it also behaved that way. I wish I remembered more about that study or had a link to it - it was fascinating.

[-]

mnicky 2 minutes ago

Wasn't it this one?

Article: https://alignment.anthropic.com/2025/subliminal-learning/

Paper: https://arxiv.org/abs/2507.14805

AlexB138 an hour ago

This may lead to some interesting gamesmanship. For instance, if I am applying to a company, and I know they use a certain applicant tracking system, and I know that ATS uses a certain model provider for its filter, I should then use that model to write the version of my resume I send to the company.

rogermarley an hour ago

I think resumes will eventually (or have already) become obsolete in tech. The SNR is so low, they offer very thin filtering value.

Even taking the tiny bits of the resume that are "hard signal", like GPA, certifications, prior roles, etc, it doesn't translate into their performance in the initial screening interview.

This is why what I think the industry sorely needs is examination consortia.

Rather than trying to guess capability from the name of the university they went to, leading tech companies would create standardized tests in various fields, and your test scores would form your "resume", so that developers can just focus on improving their scores rather than wasting time on resume/application/repetitive-screening toil.

[-]

aDyslecticCrow 28 minutes ago

> standardized tests in various fields

This is itself a massively difficult problem. Standardised tests are a bad indicator of topic understanding (setting aside the massive incentive for blatant cheating).

You're effectively advocating for leetcode being an effective hiring tool, which many would highly criticize.

indiv0 an hour ago

Eventually even a system like that can be gamed, similarly to how Leetcode-maxxing and the like sprung up in response to typical SV interview questions. Studying for the job becomes studying for the test becomes studying for the pre-test test.

drillsteps5 an hour ago

That's what people on both sides have been doing for at least a couple of years already.

Recruiters scan resumes for the best match with LLMs, candidates use the same LLMs (there's only like 3 of them) to tweak their resume for better match. I don't know what research you need to see why that makes sense.

[-]

yagi0x00 40 minutes ago

This indicates that resumes created by the same model may have an advantage over those created by other models, so I suppose technically you may have a small advantage if an insider tells you the resume parsing tool is powered by Gemini as opposed to the other models.

My broader discomfort is that we are still learning about model biases while human biases are arguably better understood, and I don't like the ethics of rejecting a person based on criteria I don't fully understand.

aDyslecticCrow 25 minutes ago

It further makes expecting, or spending the effort on, a hand-written proper introduction useless, which then undermines the entire purpose of it.

visarga an hour ago

When classifying resumes it is better to use the LLM as a feature extractor, think of 10-20 features you base your decision on, and extract them by LLM. The LLM only needs to do lower level task of question answering. Then you fit a classical ML model (xgboost for example) on the extracted features, based on company triage data points. This way you don't rely on the biases in the model, you can decide what criteria to use and how to judge cases without retraining the LLM. The feature extractor is generic, and the actual triage model is a toy you can retrain in seconds on new data points. It is also much more explainable, you can see how features influence decisions.
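
A minimal sketch of the split described above, under loud assumptions: `extract_features` is a hypothetical stand-in for the LLM question-answering step (plain string checks here, so it runs offline), the three features and the four labelled resumes are made up for illustration, and a hand-rolled logistic regression is swapped in for xgboost to keep the example dependency-free.

```python
import math

# Hypothetical feature names; a real system would choose its own 10-20.
FEATURE_NAMES = ["years_experience", "has_degree", "led_team"]

def extract_features(resume_text: str) -> list[float]:
    """Stand-in for the LLM step: answer one narrow question per feature."""
    tokens = resume_text.lower().split()
    years = next((float(t) for t in tokens if t.isdigit()), 0.0)
    return [
        min(years, 20.0) / 20.0,              # scaled to [0, 1]
        1.0 if "degree" in tokens else 0.0,
        1.0 if "led" in tokens else 0.0,
    ]

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights: list[float], bias: float, x: list[float]) -> float:
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def train(rows, labels, epochs=2000, lr=0.5):
    """Tiny logistic regression fitted with stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(weights, bias, x) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Made-up triage history: (resume text, 1 = advanced by a human, 0 = not).
history = [
    ("8 years backend, degree in CS, led a team of 5", 1),
    ("1 year as intern, no formal training", 0),
    ("6 years data work, degree in statistics", 1),
    ("2 years support, led onboarding", 0),
]
rows = [extract_features(text) for text, _ in history]
labels = [label for _, label in history]
weights, bias = train(rows, labels)

# The learned weights are the explainable part: one number per feature.
for name, w in zip(FEATURE_NAMES, weights):
    print(f"{name}: {w:+.2f}")
```

Retraining on new triage data only reruns `train`; the extractor stays untouched, which is the separation the comment argues for.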

[-]

aDyslecticCrow 23 minutes ago

I'd rather my employer just does the classic of shredding a random 80% and looking at the remainder properly.

analog8374 3 minutes ago

This means that LLM human resource departments will only hire LLMs. Which is kind of beautiful.

ilia-a an hour ago

Seems kinda obvious, given that most large recruiting firms/HR use algos to analyze resumes and AI-written versions likely do a better job at hitting the keywords/structure that algos/LLMs pick up on...

embedding-shape an hour ago

You'll find the same is true if you have two different LLMs first independently come up with a plan for an implementation, then ask each one of them to say which one of the two designs/plans is the best. They're much more likely to favor the plans generated from the same model, rather than from other models. I'm sure, internally, this somehow makes sense, but it's worth thinking about if you're doing the whole "ask N models for voting/rating N plans to find the best" charade.
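
One rough way to check for this in your own voting setup is to log who wrote each plan and count how often a judge picks its own. The sketch below assumes a hypothetical log format and uses made-up votes purely for illustration; `model_x` and `model_y` are placeholder names, not real models.

```python
from collections import defaultdict

# Made-up votes, illustration only:
# (judge, author of plan A, author of plan B, winner as "A" or "B").
# Each pairing appears in both orders to reduce the effect of position bias.
votes = [
    ("model_x", "model_x", "model_y", "A"),
    ("model_x", "model_y", "model_x", "B"),
    ("model_x", "model_x", "model_y", "B"),
    ("model_y", "model_x", "model_y", "B"),
    ("model_y", "model_y", "model_x", "A"),
    ("model_y", "model_x", "model_y", "A"),
]

def self_preference_rate(votes):
    """Per judge: share of comparisons involving its own plan that it won."""
    own_wins = defaultdict(int)
    total = defaultdict(int)
    for judge, author_a, author_b, winner in votes:
        # Skip comparisons where the judge has no plan of its own at stake.
        if judge not in (author_a, author_b) or author_a == author_b:
            continue
        total[judge] += 1
        chosen_author = author_a if winner == "A" else author_b
        if chosen_author == judge:
            own_wins[judge] += 1
    return {judge: own_wins[judge] / total[judge] for judge in total}

rates = self_preference_rate(votes)
for judge, rate in sorted(rates.items()):
    print(f"{judge} picked its own plan {rate:.0%} of the time")
```

A rate well above what judges from other model families give the same plans would suggest self-preference rather than plan quality, though a sample this small proves nothing on its own.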

[-]

SeriousM an hour ago

That's why I let the LLM write its own AGENT.md or SAFESPOT.md because it "knows" best how to write it so it can resume next time without issues.

It hits the same spot: I would take different notes than anyone else, and no one could follow them as easily as I do. Everyone leaves the "of course" parts out of the notes if they're for their own use.

logicalfails an hour ago

I suspect this is more a function of the corporate sanitization of language within the models. When I have passed my resume through the models for refinement, it often sanitizes some of the more easy going or simpler wording. It expands the vocabulary, makes it more dense, and uses more corpo speak in the bullets and formatting.

Each model likely has its own biases in terms of what constitutes correct corporate speak, and it chooses the resumes that best fit this. Ultimately, I suspect it's more a function of the model saying "this grammar, syntax structure, and formatting is most aligned with what is correct corporate language, so flag as high quality".

sb057 an hour ago

Well yeah, LLMs generate resumes (and other text) that they judge as superior to alternative plausible texts. Why would that judgement change just because a different instance hasn't seen it before? To anthropomorphize it, it's like having a hiring manager write a resume, get amnesia, and then have to judge it among other resumes.

[-]

Ekaros an hour ago

Seems like an obvious thing. If an LLM has some weights involved in what makes a good resume to write, there is very likely a correlation with what makes a good resume to rate. And this is probably even a good thing, at least from a model quality perspective. The model itself should rate highly whatever it produces. There should be a correlation between output and review of the same output.

bendergarcia an hour ago

I wouldn’t put it past these tech companies to prefer ai outputs to encourage ai inputs

ryeguy_24 an hour ago

Does anyone know of any HR departments actually using LLMs for scoring, selection, extraction, classification or any real use cases? I'm curious to hear about it and how they are using it.

[-]

oogetyboogety 36 minutes ago

We were told by hr NY has strict state laws against this

redbonsai an hour ago

There's an AI layer built into most ATS systems as well as LinkedIn and Indeed

[-]

ryeguy_24 an hour ago

Could you share more detail on how the AI layer is used? Is it an LLM?

jimnotgym an hour ago

I just guessed that and got Copilot to rewrite my profile on the internal HR system. I also got a job spec benchmarked higher by getting Copilot to write it with that exact aim given in the prompt

[-]

fecalmatter an hour ago

i straight up lied about my work experience

we are exactly the same

abubakir1997 14 minutes ago

Very interesting.

mpurbo an hour ago

At this point, all these are becoming almost like comedy.

jamiecurle an hour ago

disclaimer: Not a lawyer, but studying towards CIPP/E.

You'd make no friends doing it, but as I understand it, for those that have GDPR as a statutory right then under "[Article 22 - Automated individual decision-making, including profiling][0]" you can request to know if your CV was screened by AI and what (and this is key) "meaningful human interaction" led to that decision. Technically this falls under a data subject access request and so a response is mandatory (but who really is going to enforce that - ICO / <insert your data protection agency here> probably isn't). Companies can't just smash a button and claim meaningful interaction, it has to be, well, meaningful and smashing a "nope" button obviously isn't meaningful.

If it turns out that it was only AI that screened it you can request a human review. Do not hold your breath.

Again, you'd make no friends doing it, but sooner or later a test case will emerge to generate some case law around "AI said no" because employment, or lack of because AI says no, does have significant impact on a human.

[0]: https://gdpr.algolia.com/gdpr-article-22

[-]

noprocrasted an hour ago

The issue is that indeed, nobody is going to enforce that.

einpoklum an hour ago

> As artificial intelligence (AI) tools become widely adopted, large language models (LLMs) are increasingly involved ... [in] ... decision-making processes

That's the problem right there.

[-]

bendergarcia an hour ago

Absolutely! I don’t think people are really considering the full effects of just letting AI be the middle man. I mean Sam Altman basically said this is what he wants when he said intelligence is a commodity, no?

makeitrain 2 hours ago

Vibe resume?

[-]

masfuerte an hour ago

Aka VCV.

alexgotoi 2 hours ago

I was about to write the same thing…

bjourne an hour ago

The only test that has worked 100% of the time for me is to read the candidate's code. Two hours is enough to precisely estimate the candidate's qualities as a software developer. I never understood why companies waste time with tests and quizzes because since it is so easy for me it should be just as easy for other software developers too. Of course, a candidate may be a jerk or unfit for other reasons, but ranking them on a software developer hot-or-not scale is not very difficult.

[-]

noprocrasted an hour ago

Just like they'll send you an LLM'd resume, they will send you LLM'd code.

[-]

bjourne 31 minutes ago

Conceptually no different from copy-pasting someone else's code.

parentheses an hour ago

Reading only the abstract: LLMs prefer output of their own generation over humans or even other models.

This is a very good reason to avoid using model-generated data to train future models. We'd be deepening this bias by continuing to do that, essentially forcing society to reshape their output using LLMs to increase engagement. This feels like a form of enshittification that doesn't just touch one product but all of society.

booleandilemma 33 minutes ago

HR departments aren't using LLMs to select candidates for jobs are they?

interstice 30 minutes ago

"I'm not just good, I'm amazing"

jonahs197 an hour ago

Will people snap over this?

bdangubic 38 minutes ago

My new CV contains 37 emdashes

Der_Einzige an hour ago

This is extremely obvious to anyone who's read other papers. There are tons of papers showing LLMs prefer their own outputs. It's a big enough problem that LLM-as-judge has to be a different LLM from the LLM you are testing in papers.

jqpabc123 an hour ago

Repeat after me --- it makes no sense to try and prompt a language prediction engine to display good judgment.

randomdrake 2 hours ago

I wonder if this extends to training models on new content as well. Are we creating a cyclical information-consumption and training situation in which models being trained are more likely to pick up on and reference content created by themselves or by other LLMs than by other humans?

johndhi 2 hours ago

Another way to phrase this might be that LLMs make better resumes no?

[-]

budoso 2 hours ago

If that were the case they would select the ones generated by other models at a similar rate to the ones they generated themselves.

delecti 2 hours ago

You'd have to define "better".

All this shows is that LLMs generate resumes that fit the heuristics LLMs use to judge resumes. And that makes sense, but isn't necessarily a given.

rectang an hour ago

By one metric, yes!

If you are a candidate who wants to be hired, and your target employers use LLMs to filter resumes, then an LLM-generated resume that the employer LLM-powered resume filters favor is "better" — as in "more likely to get you the job".

mrktf 2 hours ago

Or in other words: the LLM is optimizing a function that was generated by the same LLM. Suppose you have a random variable y produced by a generator sin(x+r), and your optimizer is trying to fit the function sin(x+unknown1) + unknown2 (the "unknown" function). It is obvious that it will find the best fit.
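
The analogy above can be made concrete with a toy fit. This is only a sketch: the phase `r = 0.7`, the noise level, and the grid ranges are arbitrary choices made here, and `a` and `b` stand for the comment's unknown1 and unknown2.

```python
import math
import random

random.seed(0)

r = 0.7  # hidden phase of the "generator" (arbitrary value for this sketch)
xs = [i * 0.1 for i in range(100)]
ys = [math.sin(x + r) + random.gauss(0, 0.05) for x in xs]

def loss(a: float, b: float) -> float:
    """Mean squared error of the candidate sin(x + a) + b against the data."""
    return sum((math.sin(x + a) + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Coarse grid search over the two unknowns: a in [0, 3.12], b in [-0.5, 0.5].
candidates = [(a / 50, b / 20) for a in range(157) for b in range(-10, 11)]
best_a, best_b = min(candidates, key=lambda p: loss(*p))

print(f"true phase r={r}, fitted a={best_a:.2f}, fitted b={best_b:.2f}")
print(f"loss at best fit: {loss(best_a, best_b):.4f}")
```

Because the optimizer searches the same function family that produced the data, it can drive the error down to roughly the noise level, which is the comment's point about a model scoring its own output.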

jezzamon 2 hours ago

In text generation, LLM language is full of very emphatic phrases. At a surface level it might sound stronger. But as a human reader, it's not necessarily better

mathgeek an hour ago

*for getting past ATS reviews.

Emanation an hour ago

Where I work, my boss decided to make an application that uses AI to score long text field entries to ensure required information is present.

The AI lacks the ability to extract nuance and implicit information, which means entries end up being long-winded and repetitive. Each requirement it's looking for must be explicitly expressed; it's quite unnatural, and almost feels like solving a puzzle, to which the obvious solution is to write a comment, then give the failing comment and the AI feedback to an AI, so it can generate the proper structure the rubric-AI is looking for.

LLMs are statistically driven, and I can only imagine having the AI rewrite the comment produces a result that's more statistically fitting to the model than if any given human were to write it. So, it might mean, yeah, LLMs are better at writing resumes that the LLM can successfully classify-- are they better for a human to consume? Who knows.

nottorp 2 hours ago

Easy then. Apply N times, each time with a resume generated by a different LLM.

No human is going to notice anyway. Or add a N+1 resume written by yourself in which you describe your strategy, just in case.

[-]

zipy124 2 hours ago

Do you really believe no human is going to read your resume at some point in the process and notice the classic AI tells?

Further, de-duplication is rather easy, and will likely see you black-listed by competent organisations.

[-]

stingraycharles an hour ago

“Do you really believe no human is going to read your resume at some point in the process and notice the classic AI tells?”

Even here on HN many people don’t recognize AI tells that are obvious. Pretty much 100% of all articles posted on HN have been AI generated for months and months already and people don’t seem to care.

I have very little faith in humanity being able to deal with the chaos that LLMs are going to unleash on society.

Heck, most resumes are probably skimmed at best already.

nottorp 26 minutes ago

In organizations where LLMs sort the resumes yes, I believe no human will read my resume until it's too late.

cl0ckt0wer 2 hours ago

The only resumes that make it past the ai to a human are ai generated

[-]

Esophagus4 an hour ago

When I’m hiring, a human recruiter (or the hiring manager) reads most resumes.

For us, there is some sorting by basic keyword analysis and we start near the top, but there is no proverbial black box that rejects candidates outright.

If candidates are ignored by humans, it’s not because AI rejected them, it’s because we are starting with candidates earlier in the list and might not make it to applicant 537.
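
For readers wondering what "sorting by basic keyword analysis" could look like, here is a hypothetical sketch. It is not how any particular ATS works; the `match_percent` and `shortlist` helpers, the keywords, and the sample applicants are all invented for illustration.

```python
import re

def match_percent(resume: str, keywords: set[str]) -> float:
    """Percentage of the wanted keywords that appear as words in the resume."""
    words = set(re.findall(r"[a-z0-9+#]+", resume.lower()))
    return 100.0 * len(keywords & words) / len(keywords)

def shortlist(resumes: dict[str, str], keywords: set[str], top_n: int) -> list[str]:
    """Sort applicants by match percentage, best first, and keep the top N."""
    ranked = sorted(
        resumes,
        key=lambda name: match_percent(resumes[name], keywords),
        reverse=True,
    )
    return ranked[:top_n]

# Invented applicants and keywords.
resumes = {
    "applicant_a": "Python and SQL developer",
    "applicant_b": "Java developer",
    "applicant_c": "Python, SQL, Kubernetes",
}
keywords = {"python", "sql", "kubernetes"}

print(shortlist(resumes, keywords, top_n=2))
```

Nothing is rejected outright in this scheme; lower-ranked resumes are simply further down the list, which matches the "might not make it to applicant 537" description.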

zipy124 an hour ago

Rather unlikely to be the case, supported by the original article itself here, since if your statement was to be the case they would find that the human generated resume is 100% less likely to be shortlisted.

[-]

stingraycharles an hour ago

Obviously it’s not 100% of all human resumes are going to be filtered out, but it’s quite damning that human resumes are more likely to be filtered out just because they didn’t LLM-ify it.

stingraycharles an hour ago

You don’t understand the problem.

Companies are using AI / LLMs to pre-filter resumes. These AIs prefer their own slop resumes. Not just human vs LLMs, but Claude prefers Claude resumes over ChatGPT. Nothing good can come out of that, when resumes are pre-filtered like that.

Unless, of course, you’re not being serious and just trying to be edgy on HN.

[-]

DiscourseFan an hour ago

Why would I want to work for a company where all the employees made slop to get hired by slop to do slop? It’s slop all the way down!

[-]

stingraycharles an hour ago

Because this is where the industry as a whole is moving towards, and you don’t want to be out of a job I presume.

almostdeadguy 2 hours ago

Happy for everyone trying to invent SEO hacking for resumes.

idopmstuff an hour ago

Even if we take this to be true, I'm not sure that it really matters?

It's comparing two resumes with the same information and picking one of the two. That's obviously a situation that would never occur in actual hiring. This doesn't demonstrate anything at all that indicates that LLMs would incorrectly preference LLM-written resumes in the real world.

It'd be interesting to do the same thing but with two resumes that are almost identical. One is slightly better (an extra year of experience or a specific note of some skill that is relevant to the role), and the other slightly worse one is written by an LLM. If the reviewing LLM picks the worse one in that case, you're potentially establishing a bias that would matter. As it stands this experiment just seems contrived and pointless.