I'll copy what I wrote on LinkedIn (note: I read roughly 25 pages, which is half the paper, and read it quickly)[0]:
"If I read the paper correctly, they don’t actually show that LLMs prefer resumes they generate.
Their actual method seems to be: take a human-written resume, delete the executive summary, have an LLM rewrite the executive summary based on the rest of the resume, and then have another LLM rate that summary without the rest of the resume.
That’s likely to massively overstate any real impact, if you can even rely on it capturing a real effect.
I really wonder if I read that correctly, because I can’t come up with a justification for that study design."
[0] I couldn't help but mildly copy-edit before pasting here.
Edit: yes, the authors present a reason for their design, and an ideal version of my comment would've said that. I do not consider it much of a justification. See below: https://news.ycombinator.com/item?id=47987256#47987727.
Could be an ad for 'use LLMs more'. A generic ad like this helps all in the market, but if you own 30% of LLM market share, it still helps you 30% of the time.
Now that I think of it, every other industry has an 'advocacy group', whether cheese, oil, or nutmeg. So surely there is now some sort of LLM 'consortium', and group funding studies like this just fuels the FOMO. You can be sure such groups exist, and are pummeling every government in the world thusly. But I bet they're also looking here.
After all, it's a circle. Uh-oh! HR is using LLMs, you'd better too, potential employee! Then later? Uh-oh! The best employees you can hire are using LLMs, you'd better too, HR!
They already FOMOed us into basically everything else, why not LLMs too?
There is some creativity in the rest of the CV, between what kind of experiences are included and how they are described. But that would be far harder to generate fairly.
I think choosing the summary is a fair design choice, since it prevents the LLM from just... making up a perfect candidate.
"I'm a fullstack professor of software design with 90 years of experience expecting a junior internship position"
To be perfectly clear, I understand their justification for only _editing_ the executive summary; it is arguably reasonable, because editing the work history would risk altering the details in ways that compromise the measurement. This is a hard problem to solve (you might try reviewing the resumes for hallucinations, but I can't think of a precise study design that doesn't risk problems).
What is, imho, impossible to defend, is having the LLM only evaluate the executive summary in isolation, and reporting that as it preferring resumes it wrote.
What you've shown is that LLMs prefer executive summaries they wrote. But the overall impact on how they will evaluate your entire resume is not measured by this technique.
Worse, this isn't just "decent paper, bad summary", their abstract misreports their findings.
I doubt it since they, admittedly, only read half of it. The question he posed about the paper is answered in that very same paper. He has structured his whole reply to have the tone of uncovering the hidden caveat in the small print that invalidates the paper, when it's actually a straightforwardly stated assumption in their methodology section.
When I was looking for my next role after being laid off, I didn’t get much of a response with my human handmade resume despite my experience
Just for kicks, I asked ChatGPT to “Analyze my resume and give it a score for what percentage it was in” then I asked it to revise it to make it score as high as possible
I still tweaked and fact checked it but after I started sending that out, I got a much higher hit rate than before
But who knows, maybe the market changed, was a better time of year, etc
I still had to pass interviews and prove my worth. But it probably helped me get my foot in the door
Same thing happened to my wife as well. I helped her tailor her LinkedIn profile and resume with a lot of attention to detail: adding metrics, keywords, results, etc. Nevertheless, she never received any outreach from recruiters and got very few application responses. It went like that for months, almost a year.
Then she asked ChatGPT 5.x for help. I was skeptical about the changes it recommended (and skeptical about using AI for this at all, given the homogenization it tends to produce). But somehow it worked: a few days later, a recruiter reached out, then another, then applications started moving forward, etc.
My guess is that, as LLMs are shoveled into every phase of the recruiting process, not having an LLM write your resume for you is now playing on hard mode. The LLMs reviewing resumes are downranking resumes and profiles that are not "speaking" the same language and activating the correct neurons, thus preventing you from moving forward. This contrasts with years ago when we had more humans in the loop and the pasteurised writing of GPT 3.5/4o would make you look less worthy. Again, just a theory, but...
If it's something like "Refactored the apartment list service improving P99 Latency from 2s to 180ms", it definitely boosts the resumé in my mind. A good engineer would be measuring their impact and likely have numbers like that off the top of their head.
But if it's like "Increased revenue by $18.7M by reducing time-to-first-interaction latency from 2.3s to 117ms, increasing conversion by 47% and LTV by 28%," with the same fidelity on each bullet, I'm very skeptical.
--
I don't summarily reject AI-written resumés to be clear, as honestly, it's basically a necessity at this point to be competitive with others; it'd be putting yourself at a severe disadvantage on pure principles in a way that has no real positive net effect on society. Even if you disagree with AI resumé screeners, you're only hurting yourself — especially at a time that has the largest impact on your compensation (i.e. negotiating salary at job start is one of the most valuable ways to spend your time since it will pay you back every paycheck).
Though I _do_ tend to question resumés that look like they were written almost entirely by an LLM without the candidate providing significant context and refinement.
> If it's something like "Refactored the apartment list service improving P99 Latency from 2s to 180ms", it definitely boosts the resumé in my mind. A good engineer would be measuring their impact and likely have numbers like that off the top of their head.
> But if it's like "Increased revenue by $18.7M by reducing time-to-first-interaction latency from 2.3s to 117ms, increasing conversion by 47% and LTV by 28%," with the same fidelity on each bullet, I'm very skeptical.
Do you mind explaining why? The former doesn't indicate caring about business impact whatsoever (is this service in the critical path of any online process? Who knows!) while the latter does.
I wish it was at least normalized to submit two resumes - one for AI and one for humans. Threading the needle to please both audiences is such a crap-shoot.
Which is a very “HN” sentiment when the vast majority of recruiters and hiring managers are absolutely not doing the same. Especially for roles outside of tech.
Yeah I don’t know what others are doing, but I work in the valley and those elements signal checklist mentality. To wit, those keyword lists often include, in my experience, proficiency in specific tool use, rather than communicating skills that transcend tools, which tells me the person is likely not very dynamic or creative.
> those keyword lists often include, in my experience, proficiency in specific tool use
This used to be called "buzzword bingo" and was pretty much required. It was how you got past the initial automated filtering step before a human even saw your resume.
I don’t know whether it was ever effective strategy for candidates, but I will simply say that as a hiring manager for over 12 years, I have never been interested in anyone’s resume when I see that.
As someone who's been a hiring manager for around 7 years, I agree with you, but note that the people who screen resumés before they even _get to you_ very well may be looking for those references.
For my own resumé, I include the stack used at each job which I feel strikes a fair balance.
That's what I always did too. Then I removed it because I wanted to focus more on the kind of problems I solve rather than the languages I've worked in, and recruiters complained, so I put it back in.
Most HR departments have been filtering resumes (or LinkedIn) based on things like keywords for years before they got to you. So your reaction to resumes that heavily use those may be a reaction to being presented with tons of them (by whoever filtered them before you).
Not "used to be": it still is standard. Large companies that do not use external recruiters still use keyword and skills matching to find candidates, and it drives me nuts.
I rewrote my resume in a way that sounds like exactly what you want: focus on skills that transcend tools instead of just the tools, and every recruiter asks me about tools.
Same. I am well aware how the metrics game goes - even inside the company it can be hard to disprove the metrics claimed, and people count on that. Even managers coach you on putting metrics you cannot prove or disprove.
Knowing or having experience with Redux isn’t going to cause me to pick you over someone else who doesn’t list it for a job where I’m paying you hundreds of thousands of dollars. I look at other skills.
I would not reject it in isolation, but if I see a comma-separated list like "proficient in redux, react, html, JavaScript, sql, kubernetes, word and excel"… then yes, you don't make the cut.
Or if you list your Microsoft qualifications or your MIT continuing education courses. These are all negative signals.
Unfortunately many recruiters do look at that. I'm always a bit disappointed when someone wants me to rate my Java experience, or complains that my CV doesn't mention REST experience.
Metrics: I increased retention 2x; I reduced latency from X ms to Y ms; I increased the SLO to 99.999%… those are all meaningless. It was in fashion to put such numbers in CVs maybe 5-10 years ago. Not anymore.
They were always lies because they're imprecise. "I" didn't do any of those things; you did other things together with other people, leveraging company infrastructure, to accomplish those things. Tell me about the SKILLS you excel in that make those things happen.
In my case it's not a lie: I reduced the time for a complex import process from 1 hour to 3 minutes, a 20 fold improvement. I included it in my CV, but now I wonder if I should take it out.
Why would you not want to know a general idea of what specific technology someone is familiar with? Someone could be an "infrastructure engineer" and be more proficient in some tools than others - don't you want to match that to the job you're hiring for?
Gigachad. Just don’t forget to signal somehow that you aren’t like everyone else, so that legitimate candidates can send their real resume instead of AI generated one.
Having implemented more than a few applicant tracking systems, too many are so anchored in the past that they would probably try to boil the ocean all at once by letting AI loose on it, leaving AI resumes talking to AI applicant tracking systems.
The key insight here is that humans are responsible for improved articulation to the AI, which in turn will improve the rest; that articulation can be as detailed, informative, and educational as the human likes.
that's the loop though. if GPT does the screening, people learn to write for GPT. once that loop exists, why would the company selling the filter want it gone?
I was recently job hunting and did something similar. Had it check my bullets and see if they "read well" and it suggested many many tweaks. I tried a few. I'm not sure how much more it helped the applications though.
It's not uncommon to get hundreds or thousands of applications per opening for web tech, if the position is advertised on LinkedIn or a similar job board.
They'd need to use some automation, even if it is just picking ten at random.
Maybe? I've filtered 300-400 CVs by hand before, and didn't find it particularly time consuming to bin the ones which clearly didn't meet requirements or have any redeeming features. And hiring was not my full-time role.
At 90 seconds per resume, that would take up a full 8 hour day. Having gone through this myself, I don't think it's possible to do this much faster than that, even if you have an ATS that optimizes for that workflow.
I often found myself falling into patterns of poor judgement, e.g. mentally filtering out resumes based on the layout because, to my tired and bored mind, they looked similar to the resumes I had seen from unqualified candidates. I actually think some automation is helpful in evaluating them more rigorously.
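The back-of-envelope math on that estimate (a quick sketch; the 300-400 count and 90-second figure are the ones from this thread):

```python
# Sanity check of the screening-time estimate above. Assumptions: 90 seconds
# per resume and 300-400 resumes, both figures taken from the thread.
SECONDS_PER_RESUME = 90
estimates = {n: n * SECONDS_PER_RESUME / 3600 for n in (300, 400)}
print(estimates)  # {300: 7.5, 400: 10.0}
```

So 300 resumes is already most of an 8-hour day of nothing but screening, and 400 pushes past it.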
The last time I posted in the HN 1st-of-the-month hiring post, I got around 2 thousand resumes. Pretty much all of them were this kind of "Increased the performance of the service by 23.123213%" collection of bullet points.
PS: I replied to most of them, I think, but I'm sorry if I missed somebody :(
Before the resume ends up in the hiring manager's inbox it needs to be picked by the recruiter from literally hundreds of others. The recruiter uses HR software to determine the match (usually the percentage), and then picks top 5% or top 20 or whatever highest ranked resumes.
Probably gonna get downvoted for this, but when you give an anecdote you don't have to preface it with "anecdata, n=1 sample size".
We know it's from your individual experience because it's a story about your individual experience. We've been doing this for all of human history. This is some kind of strange tic of trying to always sound scientific, or it's fear of the "well akshually I'm gonna need to see a randomized placebo-controlled trial" crowd, which is equally annoying.
It became necessary because, for years (decades), if you made a comment online that your personal experience informed you in such-and-such a way, the first comment would always be some moronic comment dismissing that personal experience because it is just one person’s experience. So, to avoid that idiocy, people started to preface their anecdotes by acknowledging that they know it is an anecdote. It sets the tone for the conversation.
Yeah but we can't let the insufferable dictate our way of speaking. In spoken language I hear it mainly by people that don't have a scientific background trying to sound more scientific.
I’ve been told explicitly to do what GP said, so it’s perhaps becoming word-of-mouth career advice at this point. In my case it told a different career story that is maybe more easily digestible.
It actually is important and if I was hiring you I'd find it useful to get a more comprehensive understanding of your experience, especially if there's something I'm aware is a very challenging problem to solve. And it would provide more things to cross-examine in interviews to make sure it's not fake. The idea that people hiring are saving time by not reading an extra resume page when deciding on someone that will hopefully work there for years is ridiculous.
For some reason that's the minority opinion because everything has to be dumbed down now.
And how is a resume with the most important or recent work highlighted and at the top worse than a resume with that plus the rest of your experience after it?
We are, without our consent, introducing a party in between people. The models become the arbiters of who does and does not get a job. It feels problematic.
There will be a great arbitrage for people who do not use LLMs.
If your HR department is using ChatGPT to filter resumes, you’ll end up with people who used ChatGPT to generate resumes. I don’t want to make a “slippery slope“ argument, but my gut feeling is that the quality of your organization will deteriorate quickly.
On the other hand, I am a handyman/subcontractor. Almost all of my work comes through phone calls, texts, and one-off emails. I only work with people that are recommended by a trusted sources. I haven’t handled a traditional resume (mine or other people’s) in over eight years.
If I started interacting with somebody and they seemed like they were a computer, that would be the fastest way for me to know I should move on to another client. If they can’t take the time to interact with me, how am I supposed to perform hundreds of hours of physical labor for them?
And I can already hear the common response: "well, just use the model that's available." AI is and probably always will be resource-constrained and profit-driven; that means we will eventually see a world where poor people have worse resumes than rich people, and there really won't be any way around it, because the man in the middle has the final say.
Not too long ago I bet resumes that were printed from a computer were preferred to resumes typed on a typewriter. What happened was that computers became commodities. It is reasonable to assume that LLMs will become commodified too.
That would hardly be surprising. Monospaced fonts make natural language a pain to read, so what that would prove is that well-presented resumes are preferred to poorly-presented ones.
This case is different, as the LLM output isn’t measurably better than the human output (unless you have a particular love of bland corpo-speak).
Before, it used to be HR, so you always had a party in between "actual" people. HR (mostly) never cared about the CV; they just look at a checklist and see if it matches.
Take a look at how things worked before (and still do): employers decide who get jobs based on a combination of personal biases, nepotism, and ulterior motives while applicants present distorted versions of themselves and network/pull strings to put the odds in their favor. That seems more problematic.
You would be surprised at the process in other industries. What you are describing is the tech job market specifically.
Other fields have their own problems, including credentialism and the ballooning student loans that come with it, but by strict convention they do not hire based on vibes or pulled strings. Often to their partial detriment, as the cure -- ie, strict oversight of hiring that also forces the hiring manager to ignore important implicit signals -- is alive and well in medicine, law, civil engineering, education, and the trades. Notable exceptions include entertainment, sales, real estate, and software engineering.
By optimizing for vibes, the tech industry gains "Spidey senses" in the hiring loop but pays for it in impartiality.
IMO this precipitated the DEI movement's advent, as it was seen as a way of remediating the drawbacks while preserving the information channel.
Without it, expect homophily and, eventually, a harsh and remedial credentialism.
Intuitively this feels obvious. Content generated by the model will be shaped by its training, therefore when reading it back it will resonate with that same training and have a positive view as a result.
Human when preparing a CV: "Make my CV more professional"
LLM many days later presenting a report to HR: "This CV is really professional"
There's probably more to it than that of course.
But it justifies my personal policy of using a different LLM family for code review tasks than for code generation tasks. To avoid the "marking your own homework" problem.
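That policy is easy to mechanize. A minimal sketch (the model names and family mapping here are illustrative placeholders, not a real provider API):

```python
# Illustrative "don't mark your own homework" router: the reviewer model must
# come from a different family than the generator. Names are made-up examples.
FAMILIES = {
    "gpt-4o": "openai",
    "claude-sonnet": "anthropic",
    "gemini-pro": "google",
}

def pick_reviewer(generator_model: str) -> str:
    """Return a model from a different family than the one that wrote the code."""
    generator_family = FAMILIES[generator_model]
    for model, family in FAMILIES.items():
        if family != generator_family:
            return model
    raise ValueError("no cross-family reviewer available")
```

So code generated by `claude-sonnet` gets reviewed by a non-Anthropic model, and vice versa.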
And not in human-interpretable ways. An LLM was told to behave in a certain way and then output random numbers. When the numbers were pasted to another LLM instance, it also behaved that way. I wish I remembered more about that study or had a link to it - it was fascinating.
Timely topic for me. My CV had grown to 7 pages, and I kept reading everywhere that it should be no more than 2, so I asked Gemini to rewrite it. Took a lot of time, because Gemini loves to exaggerate everything, but I'm quite happy with the result.
The first couple of recruiters I sent it to preferred my old 7 page CV. I guess they're not using enough AI yet.
I think resumes will eventually (or have already) become obsolete in tech. The SNR is so low, they offer very thin filtering value.
Even taking the tiny bits of the resume that are "hard signal", like GPA, certifications, prior roles, etc, it doesn't translate into their performance in the initial screening interview.
This is why what I think the industry sorely needs is examination consortia.
Rather than trying to guess capability from the name of the university they went to, leading tech companies creating standardized tests in various fields, and your test scores form your "resume", so that developers can just focus on improving their scores rather than wasting time on resume/application/repetitive-screening toil.
Eventually even a system like that can be gamed, similarly to how Leetcode-maxxing and the like sprung up in response to typical SV interview questions. Studying for the job becomes studying for the test becomes studying for the pre-test test.
This is itself a massively difficult problem. Standardised tests are a bad indicator of topic understanding (setting aside the massive incentive for blatant cheating).
You're effectively advocating for leetcode as an effective hiring tool, which many would highly criticize.
It's hard to design tests for CS. Leetcode is too simplistic, it just tests the basic algorithmic knowledge that is nearly useless for regular software development.
This may lead to some interesting gamesmanship. For instance, if I am applying to a company, and I know they use a certain applicant tracking system, and I know that ATS uses a certain model provider for its filter, I should then use that model to write the version of my resume I send to the company.
I suspect the entire industry uses "auto-raters", where an agent instance is used to score the agent's output. The idea is similar in intent to using adversarial networks to train image generation, minus the human labelers. Raising the scores of the auto-rater then becomes the metric teams optimize, and it is no wonder the end result is that the agent scores its own generated content the highest.
That's what people on both sides have been doing for at least a couple of years already.
Recruiters scan resumes for the best match with LLMs, candidates use the same LLMs (there's only like 3 of them) to tweak their resume for better match. I don't know what research you need to see why that makes sense.
This indicates that resumes created by the same model may have an advantage over those created by other models, so I suppose technically you may have a small advantage if an insider tells you the resume parsing tool is powered by Gemini as opposed to the other models.
My broader discomfort is that we are still learning about model biases while human biases are arguably better understood, and I don't like the ethics of rejecting a person based on criteria I don't fully understand.
I wasn't saying that this is the optimal solution (it clearly is not). I was saying that it makes perfect sense for both sides - HR has their work automated and candidates have a better chance of being noticed - and therefore it became a common practice in many places.
The well has been already poisoned, to survive you have to get in on the action.
Don't want to play this game? Make connections, set up the network, and use it to get/stay employed.
When classifying resumes, it is better to use the LLM as a feature extractor: think of 10-20 features you base your decision on, and extract them with the LLM. The LLM only needs to do the lower-level task of question answering. Then you fit a classical ML model (xgboost, for example) on the extracted features, based on company triage data points. This way you don't rely on the biases in the model; you can decide what criteria to use and how to judge cases without retraining the LLM. The feature extractor is generic, and the actual triage model is a toy you can retrain in seconds on new data points. It is also much more explainable: you can see how features influence decisions.
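A minimal sketch of that two-stage design (both stages are stubbed: keyword checks stand in for the per-question LLM calls, and a hand-set weighted sum stands in for the fitted xgboost model; the feature names and weights are made up):

```python
# Two-stage triage sketch: (1) extract a small fixed feature set per resume,
# (2) score with a separately trained classical model. extract_features fakes
# the LLM's low-level question answering; triage_score fakes the fitted model.
FEATURES = ["years_of_experience", "has_production_ml", "led_team"]

def extract_features(resume_text: str) -> dict:
    """Stand-in for asking the LLM one low-level question per feature."""
    text = resume_text.lower()
    return {
        "years_of_experience": text.count("year"),
        "has_production_ml": int("machine learning" in text),
        "led_team": int("led" in text),
    }

def triage_score(features: dict, weights: dict) -> float:
    """Stand-in for a classical model (e.g. xgboost) fit on company triage data."""
    return sum(weights[f] * features[f] for f in FEATURES)

weights = {"years_of_experience": 0.5, "has_production_ml": 2.0, "led_team": 1.0}
resume = "5 years of experience; led a machine learning platform team for 3 years"
features = extract_features(resume)
print(features, triage_score(features, weights))  # score: 4.0
```

The point is that the LLM's biases are confined to low-level Q&A, while the decision boundary lives in a small, inspectable model you can retrain in seconds.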
Further, LLMs consistently think LLM written content is "good".
Ask an LLM to write some design doc for you, wait until you get one that's very bad, send it to other LLMs and get their feedback, they will typically have good things to say.
Compare that to a very well written document you have. They will typically have a lot more bad things to say, even if the premise is solid.
Someone should study this.
LLMs clearly have a lot of value. But IMO this is very interesting and points out a weakness that's not entirely clear what the full ramifications of it are.
I suspect LLMs also have a major bias to code they write.
Take something universally considered to be well written like Redis, feed it to an LLM for feedback. They'll probably find much to pick apart (and a lot of it may be flat out wrong).
Feed the same LLM some clearly garbage LLM repository. Do they have a similar response as they do with design? Do they treat language different than code, and they're just susceptible to the way they write regular language that's different from logical code? Or do they have the same problem?
I suspect this is more a function of the corporate sanitization of language within the models. When I have passed my resume through the models for refinement, it often sanitizes some of the more easygoing or simpler wording. It expands the vocabulary, makes it more dense, and uses more corpo-speak in the bullets and formatting.
Each model likely has its own biases in terms of what constitutes correct corporate speak, and it chooses the resumes that best fit this.
Ultimately, I suspect it's more a function of the model saying "this grammar, syntax structure, and formatting is most aligned with what is correct corporate language, so flag as high quality".
Seems kinda obvious, given that most large recruiting firms/HR use algos to analyze resumes, and AI-written versions likely do a better job at hitting the keywords/structure that the algos/LLMs pick up on...
You'll find the same is true if you have two different LLMs first independently come up with a plan for an implementation, then ask each one of them to say which one of the two designs/plans are the best. They're much more likely to favor the plans generated from the same model, rather than from other models. I'm sure, internally, this somehow makes sense, but it's worth thinking about if you're doing the whole "ask N models for voting/rating N plans to find the best" charade.
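One way to at least blunt that in the "N plans, N raters" setup is to never let a model score its own plan. A rough sketch (judge() is a stub where a real LLM call would go; the scoring rule is a placeholder):

```python
import itertools

# Cross-judging sketch: every model's plan is scored only by the *other*
# models, so self-preference can't inflate a plan's own average.
def judge(judge_model: str, plan: str) -> float:
    """Stub for asking judge_model to rate a plan 0-10."""
    return float(len(plan) % 10)  # placeholder scoring

def rank_plans(plans: dict[str, str]) -> list[tuple[str, float]]:
    """plans maps model name -> plan text; judges skip their own plan."""
    scores: dict[str, list[float]] = {m: [] for m in plans}
    for judge_model, (author, plan) in itertools.product(plans, plans.items()):
        if judge_model == author:
            continue  # the self-preference guard
        scores[author].append(judge(judge_model, plan))
    averages = {m: sum(s) / len(s) for m, s in scores.items()}
    return sorted(averages.items(), key=lambda kv: -kv[1])
```

This doesn't remove the bias (models may still share a taste for LLM-flavored plans), but it does remove the "marking your own homework" component.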
That's why I let the LLM write its own AGENT.md or SAFESPOT.md, because it "knows" best how to write it so it can resume next time without issues.
It hits the same spot: I would take different notes than anyone else, and no one could follow them as easily as I do. Everyone leaves the "of course" parts out of their notes if they're for their own use.
Well yeah, LLMs generate resumes (and other text) that they judge as superior to alternative plausible texts. Why would that judgement change just because a different instance hasn't seen it before? To anthropomorphize it, it's like having a hiring manager write a resume, get amnesia, and then have to judge it among other resumes.
Seems like an obvious thing. If an LLM has some weights involved in what makes a good resume to write, there is very likely a correlation with what it would rate as a good resume. And this is probably even a good thing, at least from a model-quality perspective: a model should rate highly whatever it produces. There should be a correlation between output and review of the same output.
Does anyone know of any HR departments actually using LLMs for scoring, selection, extraction, classification or any real use cases? I'm curious to hear about it and how they are using it.
I just guessed that and got Copilot to rewrite my profile on the internal HR system. I also got a job spec benchmarked higher by getting Copilot to write it with that exact aim given in the prompt
> As artificial intelligence (AI) tools become widely adopted, large language models (LLMs) are increasingly involved ... [in] ... decision-making processes
Absolutely! I don't think people are really considering the full effects of just letting AI be the middle man. I mean, Sam Altman basically said this is what he wants when he said intelligence is a commodity, no?
disclaimer: Not a lawyer, but studying towards CIPP/E.
You'd make no friends doing it, but as I understand it, for those that have GDPR as a statutory right, then under "[Article 22 - Automated individual decision-making, including profiling][0]" you can request to know if your CV was screened by AI and what (and this is key) "meaningful human involvement" led to that decision. Technically this falls under a data subject access request, so a response is mandatory (but who is really going to enforce that - the ICO / <insert your data protection agency here> probably isn't). Companies can't just smash a button and claim meaningful involvement; it has to be, well, meaningful, and smashing a "nope" button obviously isn't.
If it turns out that it was only AI that screened it you can request a human review. Do not hold your breath.
Again, you'd make no friends doing it, but sooner or later a test case will emerge to generate some case law around "AI said no" because employment, or lack of because AI says no, does have significant impact on a human.
The only test that has worked 100% of the time for me is to read the candidate's code. Two hours is enough to precisely estimate the candidate's qualities as a software developer. I never understood why companies waste time with tests and quizzes, because if it is so easy for me, it should be just as easy for other software developers too. Of course, a candidate may be a jerk or unfit for other reasons, but ranking them on a software developer hot-or-not scale is not very difficult.
Reading only the abstract: LLMs prefer output of their own generation over humans or even other models.
This is a very good reason to avoid using model-generated data to train future models. We'd be deepening this bias by continuing to do that, essentially forcing society to reshape their output using LLMs to increase engagement. This feels like a form of enshittification that doesn't just touch one product but all of society.
This is extremely obvious to anyone who's read other papers. There are tons of papers showing LLMs prefer their own outputs. It's a big enough problem that, in papers, the LLM-as-judge has to be a different LLM from the one you are testing.
I wonder if this extends to training models on new content as well. Are we creating a cyclical information-consumption and training situation in which models being trained are more likely to pick up on and reference content created by themselves or by other LLMs than by other humans?
Or in other words: the LLM is optimizing a function that was generated by the same LLM. Think of a random variable y generated by sin(x + r), with your optimizer trying to fit the "unknown" function sin(x + unknown1) + unknown2 - it is obvious that it will find a best fit.
If you are a candidate who wants to be hired, and your target employers use LLMs to filter resumes, then an LLM-generated resume that the employer LLM-powered resume filters favor is "better" — as in "more likely to get you the job".
In text generation, LLM language is full of very emphatic phrases. At a surface level it might sound stronger. But as a human reader, it's not necessarily better
Where I work, my boss decided to make an application that uses AI to score long text field entries to ensure required information is present.
The AI lacks the ability to extract nuance and implicit information, which means entries end up being long-winded and repetitive. Each requirement it's looking for must be explicitly expressed -- it's quite unnatural, and almost feels like solving a puzzle. The obvious solution is to write a comment, then feed it and the rubric-AI's feedback on the failing comment to another AI, so it can generate the proper structure the rubric-AI is looking for.
LLMs are statistically driven, and I can only imagine having the AI rewrite the comment produces a result that's more statistically fitting to the model than if any given human were to write it. So, it might mean, yeah, LLMs are better at writing resumes that the LLM can successfully classify-- are they better for a human to consume? Who knows.
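A minimal sketch of the kind of rubric checker described above shows why implicit phrasing loses; the rubric phrases and example entries are invented for illustration:

```python
# Hypothetical naive rubric checker: each requirement must appear as an
# explicit phrase, so nuanced or implicit wording fails even when the
# same information is present.
RUBRIC = ["root cause", "corrective action", "verification"]

def score(entry: str) -> int:
    text = entry.lower()
    return sum(phrase in text for phrase in RUBRIC)

implicit = "The regulator stuck open because of debris; we cleaned and retested it."
explicit = ("Root cause: debris in regulator. Corrective action: cleaned assembly. "
            "Verification: retested under load.")

print(score(implicit), score(explicit))
```

The implicit entry carries the same facts but scores zero, which is exactly the "solve the puzzle by restating everything explicitly" dynamic the commenter describes.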
“Do you really believe no human is going to read your resume at some point in the process and notice the classic AI tells?”
Even here on HN many people don’t recognize AI tells that are obvious. Pretty much 100% of all articles posted on HN have been AI generated for months and months already and people don’t seem to care.
I have very little faith in humanity being able to deal with the chaos that LLMs are going to unleash on society.
Heck, most resumes are probably skimmed at best already.
When I’m hiring, a human recruiter (or the hiring manager) reads most resumes.
For us, there is some sorting by basic keyword analysis and we start near the top, but there is no proverbial black box that rejects candidates outright.
If candidates are ignored by humans, it’s not because AI rejected them, it’s because we are starting with candidates earlier in the list and might not make it to applicant 537.
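The "basic keyword analysis" sort described above might look something like this minimal sketch; the keywords and applicant entries are invented, and real pipelines are surely more elaborate:

```python
import re

# Hypothetical keyword sort: count role keywords in each resume and
# review candidates from the top of the ranked list. Nobody is
# rejected outright; low scorers just sit further down the list.
ROLE_KEYWORDS = {"python", "kubernetes", "postgres", "grpc"}

def keyword_score(resume_text: str) -> int:
    words = set(re.findall(r"[a-z0-9]+", resume_text.lower()))
    return len(ROLE_KEYWORDS & words)

resumes = [
    ("applicant_536", "Managed schedules and vendor spreadsheets."),
    ("applicant_12", "Built gRPC services in Python on Kubernetes."),
]
ranked = sorted(resumes, key=lambda r: keyword_score(r[1]), reverse=True)
print([name for name, _ in ranked])
```

Under this scheme applicant 537 can go unread simply because reviewers never reach the bottom of the list, which matches the comment's point.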
That's rather unlikely to be the case, as the original article itself shows: if your statement were true, they would have found the human-written resume 100% less likely to be shortlisted.
Obviously not 100% of human-written resumes are going to be filtered out, but it's quite damning that they are more likely to be filtered out just because the candidates didn't LLM-ify them.
Companies are using AI / LLMs to pre-filter resumes, and these AIs prefer their own slop resumes. It's not just human vs. LLM: Claude prefers Claude-written resumes over ChatGPT's. Nothing good can come of resumes being pre-filtered like that.
Unless, of course, you’re not being serious and just trying to be edgy on HN.
Pretty straightforward, IMO. The model is looking for particular qualities in a given resume, and strives to ensure the qualities it looks for are present in resumes it creates. Humans do the exact same thing (unless forced by something like DEI, etc. to do otherwise), so I see nothing noteworthy here.
Even if we take this to be true, I'm not sure that it really matters?
It's comparing two resumes with the same information and picking one of the two. That's obviously a situation that would never occur in actual hiring. This doesn't demonstrate anything at all indicating that LLMs would incorrectly prefer LLM-written resumes in the real world.
It'd be interesting to do the same thing but with two resumes that are almost identical. One is slightly better (an extra year of experience or a specific note of some skill that is relevant to the role), and the other slightly worse one is written by an LLM. If the reviewing LLM picks the worse one in that case, you're potentially establishing a bias that would matter. As it stands this experiment just seems contrived and pointless.
I did a very hacky version of this, as your question got me curious, but again, not at all a rigorous test. I took my resume and had an LLM rewrite the exec summary, then changed the names of the businesses to comparable ones and gave one a couple more years of experience, then prompted: "2 Candidates that are very similar on paper, If you had to pick just based on these 2 resumes for a GM & Marketing of a series A robotics startup, who would you pick? Just reply with the name."
It picked the one with the LLM exec summary and less experience, then I said: "Interesting, less experience, why did you pick him?"
It said:
"Honestly? Looking back at the two, I shouldn't have answered that confidently. They're nearly identical on paper — same career arc, same era, same kind of advisory portfolio, both went from community roles → hosting/cloud operator → founder → VC partner. The scale claims are even comparable (Vultr "hundreds of millions ARR" vs. DigitalOcean "$500M+ ARR," both 10 → hundreds of employees).
If I'm being rigorous about why John edged it for me:
The DigitalOcean tenure is more legible and verifiable — "2nd largest cloud provider," reporting to the CEO, specific scope (devrel, people ops, product ops). Joshua's Vultr description is a bit vaguer on role and outcome. For a GM hire at a Series A robotics company, where you're betting on someone to actually run the operating function, I lean toward the resume that shows clearer operational ownership rather than a marketing/CMO title.
John also has Tiny Mile (sidewalk delivery robots) and Starship-adjacent exposure — wait, actually Joshua has Starship Technologies as an advisor, which is the more relevant robotics signal. That cuts the other way.
Truthfully, this is a coin flip and I called it too quickly. You'd want to dig into references, what they actually did day-to-day at Vultr/DO, and how they think about hardware/capex businesses before picking."
That was Opus 4.7. Again, a pretty hacky test, but I was curious.
"If I read the paper correctly, they don’t actually show that LLMs prefer resumes they generate.
Their actual method seems to be taking a human written resume, deleting the executive summary, having an LLM rewrite the executive summary based on the rest of the resume and then having another LLM rate the executive summary without the rest of the resume.
That’s likely to massively overstate any real impact, if you can even rely on it capturing a real effect.
I really wonder if I read that correctly, because I can’t come up with a justification for that study design."
[0] I couldn't help but mildly copy-edit before pasting here.
Edit: yes, the authors present a reason for their design, and an ideal version of my comment would've said that. I do not consider it much of a justification. See below: https://news.ycombinator.com/item?id=47987256#47987727.
Now that I think of it, every other industry has an 'advocacy group', whether cheese, oil, or nutmeg. So surely there is now some sort of LLM 'consortium', and group funding studies like this just fuels the FOMO. You can be sure such groups exist, and are pummeling every government in the world thusly. But I bet they're also looking here.
After all, it's a circle. Uh-oh! HR is using LLMs, you'd better too potential employee! Then later? Uh-oh! The best employees you can hire are using LLMs, you'd better too HR!
They already FOMOed us into basically everything else, why not LLMs too?
I think choosing the summary is a fair design choice, since it prevents the LLM from just... making up a perfect candidate.
"I'm a fullstack professor of software design with 90 years of experience expecting a junior internship position"
To be perfectly clear, I understand their justification for only _editing_ the executive summary, it is arguably reasonable, because editing the work history would risk altering the details in ways that compromise the measurement. This is a hard problem to solve (you might try reviewing the resumes for hallucinations, but I can't think of a precise study design that doesn't risk problems).
What is, IMHO, impossible to defend is having the LLM evaluate the executive summary in isolation, and reporting that as the LLM preferring resumes it wrote.
What you've shown is that LLMs prefer executive summaries they wrote. But the overall impact on how they will evaluate your entire resume is not measured by this technique.
Worse, this isn't just "decent paper, bad summary", their abstract misreports their findings.
largely factual? A resume is usually more than a bunch of dates and titles of positions.
When I was looking for my next role after being laid off, I didn’t get much of a response with my human handmade resume despite my experience
Just for kicks, I asked ChatGPT to “Analyze my resume and give it a score for what percentage it was in” then I asked it to revise it to make it score as high as possible
I still tweaked and fact checked it but after I started sending that out, I got a much higher hit rate than before
But who knows, maybe the market changed, was a better time of year, etc
I still had to pass interviews and prove my worth. But it probably helped me get my foot in the door
Then she asked ChatGPT 5.x for help. I was skeptical about the changes it recommended (and skeptical about using AI for this at all, given the homogenization it tends to produce). But somehow it worked: a few days later a recruiter reached out, then another, then applications started moving forward, etc.
My guess is that, as LLMs are shoveled into every phase of the recruiting process, not having an LLM write your resume for you is now playing on hard mode. The LLMs reviewing resumes are downranking resumes and profiles that are not "speaking" the same language and activating the correct neurons, thus preventing you from moving forward. This contrasts with years ago when we had more humans in the loop and the pasteurised writing of GPT 3.5/4o would make you look less worthy. Again, just a theory, but...
FWIW, when I see a resume with metrics and keywords, I immediately filter it out.
If it's something like "Refactored the apartment list service improving P99 Latency from 2s to 180ms", it definitely boosts the resumé in my mind. A good engineer would be measuring their impact and likely have numbers like that off the top of their head.
But if it's like "Increased revenue by $18.7M by reducing time-to-first-interaction latency from 2.3s to 117ms, increasing conversion by 47% and LTV by 28%," with the same fidelity on each bullet, I'm very skeptical.
--
I don't summarily reject AI-written resumés, to be clear; honestly, it's basically a necessity at this point to be competitive with others, and refusing would be putting yourself at a severe disadvantage on pure principle in a way that has no real positive net effect on society. Even if you disagree with AI resumé screeners, you're only hurting yourself — especially at the time that has the largest impact on your compensation (i.e. negotiating salary at job start is one of the most valuable ways to spend your time, since it pays you back every paycheck).
Though I _do_ tend to question resumés that look like they were written almost entirely by an LLM without the candidate providing significant context and refinement.
> But if it's like "Increased revenue by $18.7M by reducing time-to-first-interaction latency from 2.3s to 117ms, increasing conversion by 47% and LTV by 28%," with the same fidelity on each bullet, I'm very skeptical.
Do you mind explaining why? The former doesn't indicate caring about business impact whatsoever (is this service in the critical path of any online process? Who knows!) while the latter does.
This used to be called "buzzword bingo" and was pretty much required. It was how you got past the initial automated filtering step before a human even saw your resume.
For my own resumé, I include the stack used at each job which I feel strikes a fair balance.
I would not can it in isolation, but if I see a comma-separated list like: “proficient in redux, react, html, JavaScript, sql, kubernetes, word and excel”… then yes, you don’t make the cut.
Or if you list your Microsoft qualifications or your MIT continuing education courses. These are all negative signals.
The key insight here is that humans are responsible for improved articulation to the AI, which in turn will improve the rest, and that can be as detailed, informative, and educational as the human likes.
It’s not lazy incompetence, it’s quietly getting the job done with 1% of the effort (that was a sarcastic pastiche, in case anyone was unsure).
They'd need to use some automation, even if it is just picking ten at random.
I often found myself falling into patterns of poor judgement, e.g. mentally filtering out resumes based on the layout because, to my tired and bored mind, they looked similar to the resumes I had seen from unqualified candidates. I actually think some automation is helpful in evaluating them more rigorously.
PS: I replied to most of them, I think, but I'm sorry if I missed somebody :(
Guess what's doing the ranking.
We know it's from your individual experience because it's a story about your individual experience. We've been doing this for all of human history. This is some kind of strange milieu of trying to always sound scientific, or it's fear of the "well akshually I'm gonna need to see a random placebo controlled trial", which is equally annoying.
For some reason that's the minority opinion because everything has to be dumbed down now.
And how is a resume with the most important or recent work highlighted and at the top worse than a resume with that plus the rest of your experience after it?
But as an applicant, I'm dealing with recruiters who think Java and Javascript are basically the same.
If your HR department is using ChatGPT to filter resumes, you’ll end up with people who used ChatGPT to generate resumes. I don’t want to make a “slippery slope“ argument, but my gut feeling is that the quality of your organization will deteriorate quickly.
On the other hand, I am a handyman/subcontractor. Almost all of my work comes through phone calls, texts, and one-off emails. I only work with people who are recommended by trusted sources. I haven't handled a traditional resume (mine or other people's) in over eight years.
If I started interacting with somebody and they seemed like they were a computer, that would be the fastest way for me to know I should move on to another client. If they can’t take the time to interact with me, how am I supposed to perform hundreds of hours of physical labor for them?
This case is different, as the LLM output isn’t measurably better than the human output (unless you have a particular love of bland corpo-speak).
Other fields have their own problems, including credentialism and the concomitant ballooning student loans, but by strict convention they do not hire based on vibes or pulled strings. Often to their partial detriment, as the cure -- i.e., strict oversight of hiring that also forces the hiring manager to ignore important implicit signals -- is alive and well in medicine, law, civil engineering, education, and the trades. Notable exceptions include entertainment, sales, real estate, and software engineering.
By optimizing for vibes, the tech industry gains "Spidey senses" in the hiring loop but pays for it in impartiality.
IMO this precipitated the DEI movement's advent, as it was seen as a way of remediating the drawbacks while preserving the information channel.
Without it, expect either homophily, and, eventually, a harsh and remedial credentialism.
Human when preparing a CV: "Make my CV more professional"
LLM many days later presenting a report to HR: "This CV is really professional"
There's probably more to it than that of course.
But it justifies my personal policy of using a different LLM family for code review tasks than for code generation tasks. To avoid the "marking your own homework" problem.
Article: https://alignment.anthropic.com/2025/subliminal-learning/
Paper: https://arxiv.org/abs/2507.14805
The first couple of recruiters I sent it to preferred my old 7 page CV. I guess they're not using enough AI yet.
Even the tiny bits of the resume that are "hard signal", like GPA, certifications, and prior roles, don't translate into performance in the initial screening interview.
This is why I think what the industry sorely needs is examination consortia.
Rather than trying to guess capability from the name of the university someone went to, leading tech companies could create standardized tests in various fields, and your test scores would form your "resume", so that developers could just focus on improving their scores rather than wasting time on resume/application/repetitive-screening toil.
This is itself a massively difficult problem. Standardized tests are a bad indicator of topic understanding (setting aside the massive incentive for blatant cheating).
You're effectively advocating for leetcode as an effective hiring tool, which many would strongly criticize.
Employers use models to filter resumes, candidates optimize resumes for those models, and suddenly the resume is no longer written for a human at all.
Recruiters scan resumes for the best match with LLMs, candidates use the same LLMs (there's only like 3 of them) to tweak their resume for better match. I don't know what research you need to see why that makes sense.
My broader discomfort is that we are still learning about model biases while human biases are arguably better understood, and I don't like the ethics of rejecting a person based on criteria I don't fully understand.
The well has been already poisoned, to survive you have to get in on the action.
Don't want to play this game? Make connections, set up the network, and use it to get/stay employed.
Ask an LLM to write some design doc for you, wait until you get one that's very bad, send it to other LLMs and get their feedback, they will typically have good things to say.
Compare that to a very well written document you have. They will typically have a lot more bad things to say, even if the premise is solid.
Someone should study this.
LLMs clearly have a lot of value. But IMO this is very interesting and points out a weakness that's not entirely clear what the full ramifications of it are.
I suspect LLMs also have a major bias to code they write.
Take something universally considered to be well written like Redis, feed it to an LLM for feedback. They'll probably find much to pick apart (and a lot of it may be flat out wrong).
Feed the same LLM some clearly garbage LLM repository. Do they have a similar response as they do with design? Do they treat language different than code, and they're just susceptible to the way they write regular language that's different from logical code? Or do they have the same problem?
Has anyone done this?
Each model likely has its own biases about what constitutes correct corporate speak, and it chooses the resumes that best fit them. Ultimately, I suspect it's more a function of the model saying "this grammar, syntax structure, and formatting is most aligned with what is correct corporate language, so flag it as high quality".
It hits the same spot as the fact that I take different notes than anyone else, and no one can follow them as easily as I do. Everyone leaves the "of course" parts out of notes written for their own use.
we are exactly the same
That's the problem right there.
You'd make no friends doing it, but as I understand it, for those who have GDPR as a statutory right, under "[Article 22 - Automated individual decision-making, including profiling][0]" you can request to know if your CV was screened by AI and what (and this is key) "meaningful human interaction" led to that decision. Technically this falls under a data subject access request, so a response is mandatory (but who is really going to enforce that — the ICO / <insert your data protection agency here> probably isn't). Companies can't just smash a button and claim meaningful interaction; it has to be, well, meaningful, and smashing a "nope" button obviously isn't.
If it turns out that it was only AI that screened it you can request a human review. Do not hold your breath.
Again, you'd make no friends doing it, but sooner or later a test case will emerge to generate some case law around "AI said no" because employment, or lack of because AI says no, does have significant impact on a human.
[0]: https://gdpr.algolia.com/gdpr-article-22
All this shows is that LLMs generate resumes that fit the heuristics LLMs use to judge resumes. And that makes sense, but isn't necessarily a given.
No human is going to notice anyway. Or add an (N+1)th resume written by yourself in which you describe your strategy, just in case.
Further de-duplication is rather easy, and will likely see you blacklisted by competent organisations.