Hey, wait – is employee performance Gaussian distributed?

(timdellinger.substack.com)

162 points | by timdellinger 3 hours ago

127 comments

  • sangnoir an hour ago

    > Performance management, as practiced in many large corporations in 2024, is an outdated technology that is in need of an update

    Author made a couple of fundamental mistakes: the first is they assume employees are (or should be) paid according to how much they "individually" earned the company. Employers strive to pay employees the minimum they can bear, on employers' terms. Those terms are information asymmetry and a Gaussian distribution. Fairness is the last thing one should expect from employers, but being honest about this is not good for morale, so instead, they rely on keeping employees uninformed, while the employers collude to gather everyone's remuneration history via the Work Number.

    The second mistake they made is to assume that companies would prioritize being lean and trimming the mediocre & bottom 5%. There are other considerations: combined productivity is more important than having individual superstars working on the shiniest features. How much revenue do you think a janitor or café staffer generates? Close to zero. The same goes for engineering. Someone has to do the unglamorous stuff, or you end up with a dysfunctional company, with amazing talent (on paper).

    Edit: there's an infamous graph that shows aggregate worker productivity and average income. The two tracked closely, rising in tandem until the 1970s, when they decoupled, with income becoming much flatter and productivity continuing to rise. That's how the world has been for the past 50 years on the macro and the micro.

    • ec109685 20 minutes ago

      Employers want to pay the minimum, clearly, but until a person’s salary exceeds the value they bring to a firm, there will be other firms willing to pay more and attract that talent. That provides some upward pressure on wages, which the author addresses:

      > Economists will teach you something called the Marginal Productivity Theory of Wages, the idea being that the amount of money that a company is willing to spend on an employee is essentially the value that the company expects to get out of their work. This strikes me as mostly true, most of the time, and likely to be the case in the corporate world that we’re considering here.

    • hemloc_io 15 minutes ago

      When I first heard of the Work Number, I thought there's no way they stay in business given the Real Page suit.

    • diggan an hour ago

      > the first is they assume employees are (or should be) paid according to how much they earned the company

      From the perspective of an employee and/or human, that does seem like the most fair way of distributing what the company earns, sans the money that gets reinvested straight back into the business itself. But I'd guess that'd be more of a co-operative, and less like the typical for-profit company most companies are today.

      • stoperaticless 40 minutes ago

        There is no way to unambiguously decide who is responsible for which earnings.

        Take a hypothetical two-person cooperative that produces simple hammers. One specializes in the wooden part, the other in the metal part. How much did each of them earn for the company? (Or one producing and one selling; or one spending his life savings to buy pricey hammer-making equipment while the other presses buttons on said equipment.)

      • no_wizard 11 minutes ago

        Even with sales based around commission, the most objective sort of salary determination, businesses still find ways to undercut payouts if they don’t think it’ll hurt the bottom line or that employees will notice.

      • nkrisc 30 minutes ago

        Hopefully you don’t get assigned to fixing bugs, because then you may not earn any money.

      • dingnuts 44 minutes ago

        Did you even finish reading the comment you're replying to? It explicitly explains why employees who do not generate revenue are still valuable.

        What you're describing, that money would go to whoever brings in revenue directly, is the myopic viewpoint of Sales with an emphasis on closing deals with nothing else. If it wasn't for the rest of the work, there'd be nothing to sell!

    • deepnet 26 minutes ago

      My takeaway ( and an indication of who actually needs a performance review [ e.g. the manager ])

      “ It’s my opinion that the biggest factor in an employee's performance – perhaps bigger than the employee’s abilities and level of effort – is whether their manager set them up for success “

      • kozikow 18 minutes ago

        Or the other way around - in bigcorp (or in a startup) choosing what to work on has a much bigger impact than the work you do.

        At a very low level it's up to your manager. As time goes on, even as an IC you have a lot of agency. It's not just company selection and team selection, but also which part of the project you are working on and how you approach solving it.

        Of course, "if everyone does this, who will fix the bugs?" However, the most quickly promoted people I've seen are the ones who were excellent at politicking (and sometimes foresight) to get the best work assigned to them.

  • ianbicking an hour ago

    "IQ is Gaussian" – it was pointed out somewhere, and only then became obvious to me, that IQ is not Gaussian. The distribution is manufactured.

    If you have 1000 possible IQ questions, you can ask a bunch of people those questions, and then pick out 100 questions that form a Gaussian distribution. This is how IQ tests are created.

    This is not unreasonable... if you picked out 100 super easy questions you wouldn't get much information, everyone would be in the "knows quite a lot" category. But you could try to create a uniform distribution, for instance, and still have a test that is usefully sensitive. But if you worry about the accuracy of the test then a Gaussian distribution is kind of convenient... there's this expectation that 50th percentile is not that different than 55th percentile, and people mostly care about that 5% difference only with 90th vs 95th. (But I don't think people care much about the difference between 10th percentile and 5th... which might imply an actual Pareto distribution, though I think it probably reflects more on societal attention)

    Anyway, kind of an aside, but also similar to what the article itself is talking about

    • FredPret an hour ago

      This is a subtle aspect of intelligence measurement that not many people think about.

      To go from an IQ of 100 to 130 might require an increase in brainpower of x, and from 130 to 170 might require 3x for example, and from 170-171 might be 9x compared to 100.

      We have to have a relative scale and contrive a Gaussian from the scores because we don’t have an absolute measure of intelligence.

      It would be a monumental achievement if computer science ever advances to the point where we have a mathematical way of determining the minimum absolute intelligence required to solve a given problem.

      • silvestrov 19 minutes ago

        I wonder how a graph looks for "how many seconds does it take people to run 100 meters".

        Might be a mix because quite a number of older or overweight people run very slowly and some can't at all.

      • logicchains 30 minutes ago

        >It would be a monumental achievement if computer science ever advances to the point where we have a mathematical way of determining the minimum absolute intelligence required to solve a given problem

        For a huge number of problems (including many on IQ tests) computer science does in fact have a mathematical way of determining the minimum absolute amount of compute necessary to solve the problem. That's what complexity theory is. Then it's just a matter of estimating someone's "compute" from how fast they solve a given class of problems relative to some reference computer.

    • jppope 30 minutes ago

      Correct. IQ isn't an effective measurement of intelligence as is typically stated. It is (at best) a measurement of learning disabilities.

      • liontwist 26 minutes ago

        It’s a pretty good measurement of your ability to play logic games and fast pattern match.

        I’m sure we agree that doesn’t constitute “intelligence”, but it’s more than disability.

    • marcosdumay 35 minutes ago

      It's worse, because every test is obviously bounded, and it's absurd to not expect some noise to be there.

      Join those two, and the test only becomes reasonable near the middle. But the middle is exactly where the pick of questions makes the most difference.

      All said, this means that IQ is kinda useful for sociological studies with large samples. But if you use it you are adding error, it's not reasonable to expect that error not to correlate with whatever you are looking at (since nobody understands it well), and it's not reasonable to expect the results to be stable. And it's really useless to make decisions based on small sample sizes.

    • fnordlord an hour ago

      I didn't know that about how IQ tests are formed. Would that mean that there could be some sliver of the population that could score in the top %'s on the 1000 question test but due to the selection of questions, scored average on the final IQ exam? If so, that'll be my excuse next time I have to take an IQ exam. I just got the wrong distribution.

    • sapiogram 27 minutes ago

      > and then pick out 100 questions that form a Gaussian distribution. This is how IQ tests are created.

      You missed an extremely important final step. People's scores on those 100 questions still aren't going to form a Gaussian distribution. You have to rank-order everyone's scores, then you assign the final IQ scores based on each person's ranking, not their raw score.
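
The rank-ordering step described above can be sketched in a few lines. This is only an illustration, not how any particular IQ test is normed: the function name `rank_to_iq`, the exponential raw scores, and the mid-rank percentile formula are all assumptions, and the mean of 100 with a standard deviation of 15 is just the common scoring convention.

```python
import random
from statistics import NormalDist

def rank_to_iq(raw_scores, mean=100.0, sd=15.0):
    """Map raw scores to IQ-style scores using only each person's rank."""
    n = len(raw_scores)
    # Indices sorted by raw score. Ties fall back to input order, which is
    # a simplification made for this sketch.
    order = sorted(range(n), key=lambda i: raw_scores[i])
    target = NormalDist(mean, sd)
    iq = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-rank percentile keeps the quantile strictly inside (0, 1).
        percentile = (rank + 0.5) / n
        iq[i] = target.inv_cdf(percentile)
    return iq

random.seed(0)
# Deliberately skewed raw scores (exponential), nothing like a bell curve.
raw = [random.expovariate(1.0) for _ in range(1001)]
iq = rank_to_iq(raw)

print(f"raw scores: min {min(raw):.2f}, max {max(raw):.2f}")
print(f"IQ scores:  min {min(iq):.1f}, max {max(iq):.1f}")
```

Whatever shape the raw scores have, the output is Gaussian by construction, because only the ordering of people survives the transformation.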

  • bhouston 2 hours ago

    I would feel better if this was derived from empirical data rather than just rhetoric. This seems super testable, no? There is probably a ton of data already in different industries with regards to productivity.

    Even if human talent has a Pareto distribution (which is not clear), the people employed by a company are a selected sub-set of that population, which would likely have a different distribution depending on how they are selected and the task at hand.

    I think that any of these simplified distributions are likely not generalizable across companies and industries (e.g. productivity of AWS or Google employees is likely not distributed like that of employees of McDonald's or Wal*Mart because of the difference in hiring procedures and the nature of the tasks.)

    Get hard data within the companies and industry you are in and then you can make some arguments. Otherwise, I feel it is too easy to just be talking up a sand castle that has no solid footing.

    • Miraltar 2 hours ago

      To me it says that our system is built on a reasonable but untested assumption (performance is a gaussian) and by replacing it with an equally reasonable assumption (performance is a pareto), suddenly our system looks stupid. It isn't really offering a solution but a new perspective

    • pama an hour ago

      I thought that Bonus Content #1 and the references down the article were reasonably convincing. It would be great if large companies disclosed such details but it is unlikely.

    • wavemode an hour ago

      > I would feel better if this was derived from empirical data rather than just rhetoric.

      This exact statement applies to the practice of Gaussian performance ranking. It is pure corporate politics, it isn't founded in sound statistics.

      The present author at least provides multiple sources of statistical evidence for their beliefs, if you read the footnotes.

    • KK7NIL 2 hours ago

      The problem is that intellectual productivity is generally not possible to measure directly, so you instead end up with indirect measurements that assume a Gaussian distribution.

      IQ is famously Gaussian distributed... mainly because it's defined that way, not because human "intelligence" (good luck defining that) is Gaussian.

      If you look at board game Elo ratings (poor test for intelligence but we'll ignore that), they do not follow a Gaussian distribution, even though Elo assumes a Gaussian distribution for game outcomes (but not the population). So that's good evidence that aptitude/skill in intellectual subjects isn't Gaussian (but it's also not Pareto iirc).
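
To make the distinction concrete: the outcome model is a statement about a single game between two rated players, not about how ratings are spread over the population. A rough sketch follows, assuming the commonly cited form of Elo's original model (each player's performance normally distributed around their rating, with a standard deviation of 200 points) next to the logistic curve that many rating systems use instead. The function names and the 200-point figure are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

def expected_score_normal(r_a, r_b, sd=200.0):
    """Expected score of A vs B if each performance is Normal(rating, sd)."""
    # The difference of two independent normals has std dev sd * sqrt(2).
    return NormalDist(0.0, sd * sqrt(2)).cdf(r_a - r_b)

def expected_score_logistic(r_a, r_b):
    """The logistic expected-score curve with the usual 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

for gap in (0, 100, 200, 400):
    normal = expected_score_normal(1500 + gap, 1500)
    logistic = expected_score_logistic(1500 + gap, 1500)
    print(f"rating gap {gap:>3}: normal {normal:.3f}  logistic {logistic:.3f}")
```

Neither function says anything about the shape of the rating population, which is why a Gaussian outcome model is compatible with a skewed rating distribution.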

      • EnergyAmy an hour ago

        Do you have a reference for Elo ratings not being Gaussian? A casual search shows lots of graphs and discussions saying it is.

        • KK7NIL 40 minutes ago

          Look at my reply to bhouston.

          Elo ratings for active players are close to Gaussian, but not quite, they show a very clear asymmetry, especially for OTB old school Elo (compared to online Glicko-2).

          The active players restriction is a big one and one I didn't assume in my original statement.

      • bhouston an hour ago

        > so you instead end up with indirect measurements that assume a Gaussian distribution.

        100%. I was going to write something similar.

        > If you look at board game Elo ratings (poor test for intelligence but we'll ignore that), they do not follow a Gaussian distribution, even though Elo assumes a Gaussian distribution for game outcomes (but not the population). So that's good evidence that aptitude/skill in intellectual subjects isn't Gaussian (but it's also not Pareto iirc).

        Interesting, yeah, Elo is quite interesting. And one can view hiring in a company as something like selecting people for Elo above a certain score, but with some type of error distribution on top of that, probably Gaussian error. So what does a one sided Elo distribution look like with gaussian error in picking people above that Elo limit?
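
The question above is easy to poke at with a toy simulation. Everything here is made up for illustration: a normally distributed applicant pool, a Gaussian error on the hiring signal, and a fixed bar. It is a sketch of the thought experiment, not a model of any real hiring process.

```python
import random
from statistics import mean

random.seed(1)

# All numbers below are hypothetical, chosen only for illustration.
POP_MEAN, POP_SD = 1500.0, 300.0   # true skill of the applicant pool
NOISE_SD = 150.0                   # Gaussian error in the hiring signal
BAR = 1900.0                       # hire anyone whose noisy signal clears this

applicants = [random.gauss(POP_MEAN, POP_SD) for _ in range(200_000)]

# Keep the TRUE skill of everyone whose OBSERVED skill clears the bar.
hired = [skill for skill in applicants
         if skill + random.gauss(0.0, NOISE_SD) > BAR]

hire_rate = len(hired) / len(applicants)
below_bar = sum(skill < BAR for skill in hired) / len(hired)

print(f"hired {hire_rate:.1%} of applicants")
print(f"mean true skill of hires: {mean(hired):.0f}")
print(f"hires whose true skill is under the bar: {below_bar:.1%}")
```

With error in the selection step, the hired group has no hard left edge at the bar: a sizeable share of hires sit below it, so the result is a soft ramp on the left rather than a cleanly truncated distribution.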

        • KK7NIL an hour ago

          Lichess has public population data (they use a modified version of Glicko-2 which is basically an updated version of Elo's system): https://lichess.org/stat/rating/distribution/blitz

          It's basically a Gaussian with a very long right tail.

          Big caveat here is that these are the ratings of weekly active players. If we instead include casual players, I suspect we'd have something resembling a pareto distribution.

          • JackFr 42 minutes ago

            Good question - do the bad players play less because they are bad, or are they bad because they play less?

            • bhouston 37 minutes ago

              > Good question - do the bad players play less because they are bad, or are they bad because they play less?

              Both for sure. If you don't practice you will never rise much above bad. But if you are bad and not progressing you won't play much because it isn't rewarding to lose.

              For those with low ELO ratings, one would almost need to look at their history compared to the number of games played and see if they were following an expected ELO progression.

              I wonder if you can estimate with any accuracy where a player will eventually plateau given just a small-ish sampling of their first games. Basically estimate the trajectory based on how they start and progress. This would be interesting. Given how studied Chess is, I expect this is already done to some extent somewhere.

      • jlawson an hour ago

        All polygenic traits would be Gaussian by default under the simplest assumptions.

        E.g. if there are N loci, and each locus has X alleles, and some of those alleles increase the trait more than others, the trait will ultimately present in a Gaussian distribution.

        i.e. if there are lots of genes that affect IQ, IQ will be a Gaussian curve across population.
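
A quick way to see the "simplest assumptions" at work is to simulate them. The sketch below assumes the most stripped-down version: every locus has two allele copies, each copy is a fair coin flip, and every "plus" allele adds the same amount. The locus count and population size are arbitrary.

```python
import random
from statistics import mean, pstdev

random.seed(2)

N_LOCI = 500        # hypothetical number of loci affecting the trait
N_PEOPLE = 10_000

def trait_value():
    # Two allele copies per locus, each a fair coin flip ("plus" or not),
    # all with equal additive effect. Counting the set bits of a random
    # integer is a fast way to sum 2 * N_LOCI independent coin flips.
    return bin(random.getrandbits(2 * N_LOCI)).count("1")

traits = [trait_value() for _ in range(N_PEOPLE)]
mu, sigma = mean(traits), pstdev(traits)

# A Gaussian puts roughly 68% of the mass within 1 sd and 95% within 2 sd.
within_1sd = sum(abs(t - mu) <= sigma for t in traits) / N_PEOPLE
within_2sd = sum(abs(t - mu) <= 2 * sigma for t in traits) / N_PEOPLE

print(f"mean {mu:.1f}, sd {sigma:.1f}")
print(f"within 1 sd: {within_1sd:.1%}, within 2 sd: {within_2sd:.1%}")
```

The bell shape here comes entirely from the independence and additivity that were assumed; it says nothing about whether real traits satisfy them.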

        • KK7NIL an hour ago

          Very interesting point, this is a close corollary to the central limit theorem, no?

          Doesn't this assume a linear relationship between relevant alleles and the given trait though?

          • boothby 6 minutes ago

            The missing assumptions are that the number of genes is large, independently distributed (i.e. no correlations among different genes), and identically distributed. And the whopper: that nurture has no impact.

            You can weaken some of those assumptions, but there are strong correlations amongst various genes, and between genes and nurture. And, one "nurture" variable is overwhelmingly correlated to many others: wealth.

            Unpacking wealth a little, for the sake of a counterexample: one can consider it to be the sum of a huge number of random variables. If the central limit theorem applied to any sum of random variables, it should be Gaussian, right? Nope, it's much closer to a Pareto distribution.

            In summary: the conclusion of the central limit theorem is very appealing to apply everywhere. But like any theorem, you need to pay close attention to the preconditions before you make that leap.

          • Bootvis 31 minutes ago

            It does. A lognormal distribution would model that better which gives a nice right tail so maybe it is a useful toy model.
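
The linear-versus-nonlinear point can be illustrated with the same random factors combined two ways. This is a toy comparison under invented parameters (50 independent factors, each uniform between 0.5 and 1.5): summing them gives something symmetric, while multiplying them gives a long right tail of the lognormal kind.

```python
import random
from statistics import mean, median

random.seed(3)

N_FACTORS = 50      # hypothetical number of independent contributing factors
N_PEOPLE = 10_000

def additive():
    # Effects add up: the central limit theorem applies to the sum.
    return sum(random.uniform(0.5, 1.5) for _ in range(N_FACTORS))

def multiplicative():
    # Effects compound: the central limit theorem applies to the LOG of
    # the product, so the product itself is approximately lognormal.
    value = 1.0
    for _ in range(N_FACTORS):
        value *= random.uniform(0.5, 1.5)
    return value

add = [additive() for _ in range(N_PEOPLE)]
mult = [multiplicative() for _ in range(N_PEOPLE)]

ratio_add = mean(add) / median(add)
ratio_mult = mean(mult) / median(mult)

print(f"additive:       mean/median = {ratio_add:.3f}")
print(f"multiplicative: mean/median = {ratio_mult:.3f}")
```

A mean far above the median is the signature of a heavy right tail: a few very large values pull the average up while the typical value stays small.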

    • drcwpl 2 hours ago

      Agree with you - although, rhetorically speaking, I have come across many instances of what the author refers to as "low performers are 3x as common as high performers." This is unfortunate, as I always think you should do your best, and as Tyler Cowen states - Average is Over. So agreed, it would have been way better to use empirical data to back up this claim especially.

  • jedberg an hour ago

    One of the things I loved about working at Netflix was that the base assumption was that everyone was a top performer. If you weren't a top performer, you were given a severance check.

    The analogy we used was a sports team. Pro sports teams have really good players and great players. Some people are superstars, but unless you're at least really really good you're not on the team.

    Performance and compensation were completely separate, which was also nice. Performance evals were 360 peer reviews, and compensation was determined mostly by HR based on what it was costing to bring in new hires, and then bumping everyone up to that level.

    So at least at Netflix 10 years ago, performance wasn't really distributed at all. Everyone was top 10% industrywide.

    • brabel an hour ago

      It's really difficult for me to believe that they really got 10% top performers. For one, knowing the cut-throat nature of employment there, I would expect only a minority of developers would be willing to try working there, despite the awesome rewards.

      Another reason I really don't trust that to be true is that I've never seen a good way to measure who is a top performer and who is not. I don't think there's one, people are good in different things, even within the same job... for one assignment, Joe may be the best, but for another, Mary is the winner (but again, to measure this reliably and objectively is nearly impossible IMHO for anything related to knowledge work - and I've read lots of research in this area!).

      Finally, just as a cheap shot at Netflix, sorry I can't resist as a customer: they absolutely suck at the most basic stuff in their business, which is to produce good content in the first place, and very importantly, NOT FREAKING CANCEL the best content! I won't even mention how horrible their latest big live stream was... oh well, I just did :D.

      • kube-system an hour ago

        > the most basic stuff in their business, which is to produce good content in the first place, and very importantly, NOT FREAKING CANCEL the best content!

        It isn't that simple. Making money from content is not 1-to-1 related with the quality of the content. There are many examples of great content that doesn't make money, and many examples of content that makes a lot of money that isn't great. Also there are many differing opinions on what 'great content' even is.

        • echelon an hour ago

          It's an increasingly bad business to be in.

          Netflix burns customers when they cancel beloved shows, and they constantly have to experiment.

          They now have a bazillion competitors who are ramping up comparable businesses. There's no moat or secret sauce competitive advantage. Customers are free to switch at no cost.

          Bigger tech companies are using media content as simply a fringe benefit or commodity to enhance their platform offerings.

          YouTube, on the other hand, is already starting to eclipse the entire Netflix business model. YouTube is a monster with a huge and enviable moat, and it's only going to continue growing. It's a much stronger business model and they have a sticky and growing user base.

      • creer 35 minutes ago

        > difficult for me to believe that they really got 10% top performers

        It's difficult to achieve, but it's not an unreasonable objective to have. After that there is a question of measurement. How do you measure that? Did they? What was their score? - and yes, until the evidence is released, they probably didn't. (But I would also cut slack on the measurement - it IS difficult to measure so a decent attempt - a top 10% attempt? - will do.)

        Where the "top performers" meme obviously fails is when every new business and their sister claims the same thing. We are all winners here and all that.

      • exe34 an hour ago

        I think it's safe to assume gp has drunk the koolaid. I spoke to somebody from the army once, and they too had the top 10% and it's difficult to imagine that every employer employs the top 10%. it's a cultural meme really, like everybody tells themselves they are good people really.

        • jajko 16 minutes ago

          At some point, people invest into their work/employment so heavily and tie it to their identity tad too much, they internally need to feel this is the right and best choice, which for many top talents may mean working with "top 10%", whatever that means. So otherwise smart folks will start parroting official company policies and become a 'good boy'. Suffice to say I don't look kindly on this, but it highly depends on the business.

          I've heard similar claims many times before, albeit mostly not from places paying so much. E.g. at university, there was a promotion seminar from the Accenture branch in our country; the guy was some higher manager and stated the same, how they want only the best of the best and work hard at getting and maintaining this. Then maybe 10 years later I had 20 of them as contractors and reality was not that rosy, with huge variation from good to terrible.

          • exe34 3 minutes ago

            I love my job, but I'm careful not to give the impression at work. Best to keep them on their toes. I'm also good at weaving the corpospeak into conversations, but very few can hear the sarcasm.

    • Eikon an hour ago

      How are 'top performers' and 'low performers' being defined in this context?

      In my experience, these labels in corporate environments often correlate more with social dynamics and political acumen than actual work output. People who are less socially connected or don't engage in office politics may find themselves labeled as 'low performers' regardless of their actual contributions, while those who excel at workplace networking might be deemed 'top performers'.

      The interview process of these kinds of companies also often falls into a problematic pattern where interviewers pose esoteric questions they've recently researched or that happen to align with their narrow specialization from years in the same role. This turns technical interviews into more of a game of matching specific knowledge rather than evaluating problem-solving abilities, broader engineering competence or any notion of 'performance'.

      Let's be honest: how many people can truly separate personal feelings from performance evaluation? Even with structured review processes in place, would most evaluators give high marks to someone they personally dislike, even if that person consistently delivers excellent work?

      • efitz 29 minutes ago

        > problematic pattern where interviewers pose esoteric questions they've recently researched

        The days of the “brain teaser” interview question are gone, at least from the “magnificent 7” and similar big tech companies. Nowadays it’s coding, behavioral, and design, at least for engineers.

        I concur with the sentiment that performance ranking has a very significant social component. If you have a bad relationship with your manager, watch out. But also, if your manager has a bad relationship with THEIR manager, or is not adept at representing their employees, you can get screwed too.

    • yreg an hour ago

      A bit offtopic, but I've been curious about this.

      Could you please describe how the unlimited vacation policy worked? How did people feel about it and whether they were anxious regarding using it (afraid that it will reflect on them badly when they take "too much" time off)?

    • dangus 43 minutes ago

      In summary, Netflix told all their employees that they are so amazing at their job, they are the top 10% of the whole world, they are like NFL athletes. If they don't perform to top tier levels, they'll be shown the door.

      Here's a thought experiment: pretend that Netflix is lying and that their employees are not actually made up of the top 10% of talent industrywide. Let's for this thought experiment assume the reality is that they have slightly above average talent because Netflix pays slightly above industry average.

      But now they've convinced those employees that they're not just slightly above average, they are like elite NFL players. And that means they have to work like elite NFL players. Netflix convinces their employees to work XX% harder with longer hours than the rest of the industry because they think they are elite.

      "Only amazing pro athlete geniuses can work here" is way more motivating than "You have to work yourself to death with extra hours to make quota or you're fired!" because it's a manipulation of the ego.

      I think this thought experiment is closer to reality than Netflix or their kool-aid-drunk employees will admit, and that Netflix's "pro athlete" culture is worker-harming psychological manipulation.

      • vineyardlabs 25 minutes ago

        The interesting thing about this thought experiment is that you assume Netflix would have slightly above average employees if they have slightly above average compensation. Now what happens to the experiment if Netflix has ridiculously above average, end of the bell curve compensation (as they do)? Serious question, I do not and have not worked for Netflix.

      • jonas21 39 minutes ago

        Most Netflix employees have worked at other places and can make the comparison for themselves. They don't have to take Netflix' word for it.

        Also, since when is telling people they're good at what they do "worker-harming psychological manipulation?"

        • dangus a minute ago

          It’s psychological manipulation when it’s being used as an excuse to fire and replace reasonably productive people.

        • MilanTodorovic 26 minutes ago

          My guess would be that it nurtures the imposter syndrome once the "top performer" starts struggling with something they shouldn't if they truly were a top performer.

    • thifhi an hour ago

      > Performance and compensation were completely separate, which was also nice.

      Huh? How is that nice? Do performance and compensation not correlate in your ideal world, or am I misunderstanding it?

    • haolez an hour ago

      How can 360 peer performance reviews ever work? The incentives are against a fair evaluation: the reviewers have the incentive to overly criticize others so that they can stand out more.

      I'm not saying that everyone on a 360 review process does that. But the incentive is there and it's working against fair reviews.

      • stonemetal12 10 minutes ago

        >The incentives are against a fair evaluation: the reviewers have the incentive to overly criticize others so that they can stand out more.

        Wouldn't that(how you view and fit in with your team) be part of your review? If I was Bob's manager and all reviews he gave of his teammates were "Teammate M is a dumbass and the only reason they are productive is because I do 80% of their job for them", wouldn't leave me thinking Bob is great. It would leave me thinking Bob is a jerk who doesn't work well with others.

  • hemloc_io 2 hours ago

    Cool data/idea, and anecdotally lines up with my experience at BigCos from a coworker perspective.

    But in my experience employee perf evals are more political than data based.

    At the end of the day a lot of mgmt at BigCo, esp these days, wants that 10% quota for firing as a weapon/soft layoff and the "data" is a fig leaf to make that happen. More generously it's considered a forcing function for managers to actually find underperformers in their orgs, even if they don't exist. Either way it's not really based on anything other than their own confirmation bias.

    IME the scrutiny of perf evaluation is basically tied to the trajectory of the company and labor market conditions. Even companies with harder perf expectations during the good times of ~2021 relaxed their requirements.

  • riazrizvi 2 hours ago

    This is a well constructed empty argument because it glosses over the central concern, ‘employee performance’. Without defining that we have no idea what the graph represents.

    • bhickey 2 hours ago

      For analyses like this it just doesn't matter. Pick a metric and measure it over your workforce. Across the universe of salient metrics of interest you won't see a gaussian across your workforce.

      In a previous job I modelled this and concluded that due to measurement error and year-over-year enrichment, Welchian rank-and-yank results in firing people at random.
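
For readers who want to play with the idea, here is a toy version of that kind of model. It is not the commenter's analysis: the headcount, the noise level, the 10% cut, and the assumption that replacements come from the same pool as the original staff are all invented for this sketch. It tracks two things each year: how many of the people fired were really in the bottom 10% by true skill, and how the spread of true skill in the workforce changes.

```python
import random
from statistics import pstdev

random.seed(4)

# All parameters are hypothetical.
N = 5_000           # headcount
CUT = N // 10       # fire the bottom 10% by rating each year
NOISE_SD = 1.0      # rating error, on the same scale as true skill
YEARS = 8

def hire():
    # True skill of a new hire, drawn from a standard normal pool.
    return random.gauss(0.0, 1.0)

staff = [hire() for _ in range(N)]
spread = [pstdev(staff)]     # spread of true skill, year by year
hit_rate = []                # share of fired who were truly bottom 10%

for _ in range(YEARS):
    # Rank by noisy rating; the key is evaluated once per employee.
    rated = sorted(staff, key=lambda skill: skill + random.gauss(0.0, NOISE_SD))
    fired, kept = rated[:CUT], rated[CUT:]

    # True-skill cutoff for the actual bottom 10% of the current staff.
    cutoff = sorted(staff)[CUT - 1]
    hit_rate.append(sum(skill <= cutoff for skill in fired) / CUT)

    # Enrichment: survivors stay, replacements come from the outside pool.
    staff = kept + [hire() for _ in range(CUT)]
    spread.append(pstdev(staff))

for year, (hits, sd) in enumerate(zip(hit_rate, spread[1:]), start=1):
    print(f"year {year}: correctly fired {hits:.0%}, true-skill sd {sd:.2f}")
```

Picking 10% of people purely at random would hit the true bottom 10% about 10% of the time, so that is the baseline to compare against. How close the simulated process gets to it depends heavily on `NOISE_SD`, which is an assumption here, and on how much the workforce narrows as low scorers are removed.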

      • pembrook 2 hours ago

        All of Jack Welch’s management tactics should be considered suspect now.

        His performance at GE was 100% fueled by financial leveraging that blew up in 2009, basically killing the company. Nobody should be taking management lessons from this guy.

        • lotsofpulp an hour ago

          > Nobody should be taking management lessons from this guy.

          Rank and yank is simply about lowering labor costs, once the business has achieved a significant moat and no longer needs to focus solely on growing revenues. A negotiating tool for the labor buyer, due to the continuous threat of termination.

      • bhouston 2 hours ago

        Stack ranking will tell you when something isn't working, but the solution isn't always to fire, but rather use that data to fix things in a more general solution.

        I found that team composition and role assignment matter a lot, at least if you hire people who are above a certain bar. Match a brilliant non-assertive coder with someone who is outgoing, good at getting along, and at least a decent coder, and the two together generally outperform either of them individually.

        You can bring out the best of your employees or you can set them up against each other. This either brings everyone up or brings everyone down.

        • dataflow an hour ago

          Wholeheartedly agree with you on team composition mattering a ton, but how often do you have such an abundance of engineers and tasks that you can match them up the right way?

          • bhouston an hour ago

            I think if you get to know your engineers, you can figure out the right pairings to bring out the best. But this requires intimate knowledge and probably subjective based on how good the manager is at managing coders. So I guess from up high, stack ranking-based firing is easier.

            But I think it is also cheaper to make great teams rather than just doing brutal firings all the time. But it may be a micro-optimization?

      • Cheer2171 2 hours ago

        So you're saying that if you don't think about construct validity and just pick any given metric that can spit out a comparable number across all your different positions and teams, that these metrics have weird distributions? Hmm, I wonder why.

        • munk-a an hour ago

          I think it's more charitable to interpret their statement as "for all metrics" rather than "run this experiment once and arbitrarily just chose a single metric". Their statement is a lot more actionable because, as much as we've tried over the decades, finding an accurate metric that represents performance seems to be an impossible task.

          A researcher friend at a previous job once mentioned that in grad school he and several other students were assisting a professor on an experiment. Each grad student was given a specific molecule to evaluate in depth for fitness for a need (I forget what at this point). One student's molecule was a good fit while the others' were not; that student was credited on a major research paper and had an instant advantage in seeking employment as a researcher, while the other students did not. That friend of mine was an excellent science communicator and so fell into a hybrid role as a highly technical salesperson. But tell me: what metrics of this scenario would best evaluate the researchers' relative performance? The outcome has a clear-cut answer, but it was entirely luck-based (in a perfect world). A lot of highly technical fields can have very smart people stuck on very hard, low-margin problems while other people luck into a low-difficulty solution that earns a company millions.

          • withinboredom an hour ago

            Most of the world is ruled by luck. Where you are born, who your parents are, how rich they are, who you know, whether or not someone “better” than you applies for the same position, etc. etc.

            Ignoring luck or trying to control for it would be a mistake.

    • timdellinger 2 hours ago

      Oh, the answer to that is apparent enough, but frustratingly circular:

      Performance is "visibly doing the things that the company rewards during the performance review process".

      Theoretically, each role at a company should have a set of articulated accomplishments that are expected. (This is sadly often not the case.)

      But you're right that the subjective nature of "performance", and the lack of a clear numerical scale, are a difficulty of the entire process!

    • alphazard 2 hours ago

      You could replace "employee performance" with "value to the company" and the same argument would hold. Performance is difficult to measure, but we get a good estimate of value to the company any time someone receives a competing offer and drags their manager to the negotiating table.

      The amount of money the manager is willing to match is the perceived value to the company. This is how the company actually behaves (we know for sure whether they match the offer or not) and that behavior implies a value to the company, regardless of what anyone says in performance review season.

      • dataflow an hour ago

        > The amount of money the manager is willing to match is the perceived value to the company.

        This assumes the manager is irrelevant here. But we all know that different managers (or non-managers) can communicate value differently for the same employee. So this metric can't be solely measuring the value of the employee.

        • alphazard an hour ago

          You are talking about value as some intrinsic quality. I'm talking about value as a belief that is subjectively assigned, and that we can infer from actions. We can all agree on the actions, and we can agree on the possible beliefs that an action can imply.

          The action to not match an offer implies that the company believes the employee adds less value than their new offer. If the company believed the employee was adding more value than their new offer, they would match the offer to keep the employee.

          A company isn't a single rational agent. It's made up of people performing different functions. But behaving irrationally is a categorically bad thing for the company to do, and the leadership has a fiduciary duty to prevent the company from acting irrationally or otherwise not in its own self interest.

          The manager may matter here, but the leadership is supposed to be creating a management structure such that the company acts rationally to make progress towards set goals.

    • mitthrowaway2 2 hours ago

      The article does briefly caution about measuring difficulties. But given that the main conclusion is an argument against stack-ranking-and-firing, the question of "what is performance" passes forward to whatever metric the stack-ranking manager was going to use when they were planning to fire the "bottom" 10% of their payroll.

    • michaelmior 2 hours ago

      I'm not sure this is the argument the author is making, but you could claim that the rest of the argument is true for any (or most) reasonable measure of employee performance that a company actually cares about.

      • nradov 2 hours ago

        You could claim anything, but is there hard quantitative data to support such a claim? Or are we just guessing?

        • michaelmior 9 minutes ago

          The author presents some data in the article. Also, the absence of hard quantitative data doesn't necessarily make it a complete guess. (At least not any more than starting with the assumption of a Gaussian distribution.)

    • BiteCode_dev an hour ago

      Yes, is performance Pareto, or perception of performance Pareto?

    • SideburnsOfDoom 2 hours ago

      It also assumes that "productivity" is something that is meaningful at all at the level of individuals, not teams or larger. IMHO, it is not.

  • _vaporwave_ 4 minutes ago

    > a helpful order of magnitude estimate is that the hiring process all told costs the company approximately a year’s salary

    It feels weird to gloss over this since transaction costs this high have a huge impact on how the system should be designed.

  • crazygringo an hour ago

    This is very unconvincing. The author already admits one reason why:

    > But there are low-performing employees at large corporations; we’ve all seen them. My perspective is that they’re hiring errors. Yes, hiring errors should be addressed, but it’s not clear that there’s an obvious specific percentage of the workforce that is the result of hiring errors.

    I think it is clear that we expect a certain percentage of hiring "errors". And that they are not binary but rather a continuum. And that there are lots of other factors like employees who were great when they were hired but stopped caring and are "coasting" or just burnt out, who got promoted or transferred when they shouldn't have been and are bad at their new level/role, and so forth.

    The Pareto distribution isn't particularly relevant here, because a hiring process isn't trying to get a whole slice of the overall labor market with clear cutoffs. For any position, it's trying to maximize the performance it can get at a given salary, and we have no reason to expect the errors it makes in under- and over-estimating performance to be anything but relatively symmetric.

    So a Gaussian distribution is a far more reasonable assumption than a slice of the Pareto distribution, when you look at the multiplicity of factors involved.

    • wavemode an hour ago

      > So a Gaussian distribution is a far more reasonable assumption than a slice of the Pareto distribution

      It's not an assumption. See the evidence referenced in the footnotes.

    • dheera an hour ago

      Personally I think manager/report mismatches are far greater than hiring errors.

      When A doesn't like B it doesn't mean A or B are necessarily unfit to work at the company, but it generally results in the subordinate being framed as underperforming or not being given the resources to perform.

  • wavemode 18 minutes ago

    This concept is not new - see [0].

    There's ample research that Welchian stack ranking, and assuming a Gaussian distribution of employee performance, is not well-founded. Even its original pioneers (General Electric) have abandoned the practice (see [1]).

    Not sure why there are so many commenters here defending the Gaussian model. Most researchers at this point agree that a Pareto distribution is more realistic.

    [0]: https://hbr.org/2022/01/we-need-to-let-go-of-the-bell-curve

    [1]: https://qz.com/428813/ge-performance-review-strategy-shift

  • iambateman 2 hours ago

    As employees, our expectations for performance management come from the system of giving grades in school.

    What's interesting is that school grades often don't follow a normal distribution, especially for easier classes. I suspect that getting an "A" was possible for 95%+ of students in my gym class and only 5-10% of the students in my organic chemistry class.

    In the same way, some jobs are much easier to do well than others.

    So we should expect that virtually all administrative positions will have "exceptional" performance, which is to say that they were successful at doing all of the tasks they were asked to do. But for people whose responsibility set is more consequential, even slightly above-average performance could be 10x more meaningful to the company.

    • atoav 2 hours ago

      One place where this analogy stops working is that, more so than in school, your performance in a company can be highly dependent on how well and/or timely others do their job. Your manager's performance metric may or may not catch that. E.g. imagine you are assigned a project where you have to interact a lot with department X, and department X is running at/over capacity, so you are performing worse because their part isn't done in time and each back and forth takes half a week. Now you spend half your time not being productive through no fault of your own, and the others are 110% productive while setting the whole shop on fire. Based on that metric they should fire you and hire more people for department X, when in fact they should probably just hire more for them (or reorganize the department).

      Another example where this analogy stops working is that in school the students usually get the same/comparable assignments; that is somewhat the point of those. As the go-to hard-problem person at my current workplace, I am pretty sure that it is absolutely impossible to compare my work to the work of my colleague who just deals with the bread-and-butter problems; it isn't even the same sport. How would you even start doing a productivity comparison here, especially if you understand 0 about the problem space?

      • iambateman 18 minutes ago

        Great perspective and I agree. This is the basic reason that performance management in an organization is so difficult and fraught.

        A significant percentage of people in an organization create the problems they solve.

    • nightski 2 hours ago

      Having a shifted mean doesn't mean they aren't a normal distribution. Not saying they are necessarily, but the anecdote you are providing isn't convincing.

      • kurthr 2 hours ago

        Perhaps, but due to the sampling of the distribution you would likely never know. If 95% of your samples fit in the top 3 bins, you can’t say much at all with certainty. Poisson, Gaussian, binomial, Boltzmann, gamma…

      • marian_ivanco 2 hours ago

        That is not IMHO what he is trying to say: you don't shift the distribution, you measure whether somebody passed a test. If the test is "passable" then one side of the "distribution" is at least cut off. E.g. it's normal (and sometimes expected) that the whole class will pass without issues.
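
        A minimal sketch of that cut-off effect, under made-up numbers: underlying ability is assumed Gaussian with a mean near the top of the scale (mean 95, standard deviation 10), and the test cannot record anything above 100. The grade boundaries and the function name are hypothetical.

```python
import random

def capped_grade_counts(n=10_000, mean=95.0, sd=10.0, cap=100.0, seed=0):
    """Draw Gaussian scores, clip them at `cap`, and count letter grades.

    Returns (counts_by_grade, fraction_of_scores_sitting_exactly_at_the_cap).
    All parameters are illustrative assumptions, not data.
    """
    rng = random.Random(seed)
    counts = {"A": 0, "B": 0, "C": 0, "D/F": 0}
    at_cap = 0
    for _ in range(n):
        raw = rng.gauss(mean, sd)
        score = min(cap, raw)  # the test cannot distinguish anything above the cap
        if raw >= cap:
            at_cap += 1
        if score >= 90:
            counts["A"] += 1
        elif score >= 80:
            counts["B"] += 1
        elif score >= 70:
            counts["C"] += 1
        else:
            counts["D/F"] += 1
    return counts, at_cap / n

if __name__ == "__main__":
    counts, frac_at_cap = capped_grade_counts()
    print(counts, f"{frac_at_cap:.0%} of scores hit the ceiling")
```

        The underlying draw is a perfectly ordinary normal distribution, yet the recorded grades pile up in the top bin and the upper tail is invisible, so the observed histogram alone says little about the shape above the ceiling.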

      • dowager_dan99 2 hours ago

        if your scale doesn't have the atomic values at the top end to differentiate the data it's not a normal, it's Pareto or Zipf or some other power law.

    • sokoloff 2 hours ago

      Would “doing all of the tasks they were asked to do” really be “exceptional”? What could be exceptional about that? I would think it would be “meets expectations” at most.

      • iambateman 19 minutes ago

        I have an issue with this thinking, but I don't mean to pick on you...it's common within organizational politics.

        Managers suggest that an employee must "go above and beyond" their ordinary duties to get an exceptional rating.

        But that just means that "going above and beyond" is, in fact, a duty. The problem is it's an ill-defined duty which is even more susceptible to the whims of what the manager thinks counts as "above and beyond." Good managers give clear rubrics of performance.

        To me, "meets expectations" says that the employee's error rate was at acceptable levels and "exceptional" means they had almost no errors whatsoever.

      • dowager_dan99 2 hours ago

        You don't really need a distribution to measure tasks that are binary in nature though, why bother with a Likert scale when you can just use a yes/no checklist? I suspect there's also a high correlation between the jobs/roles and the likelihood of being displaced by machine or otherwise, as measuring success is a key problem to be solved when "dehumaning" these jobs.

  • doctorpangloss 2 hours ago

    This article: "Wouldn't it be cool if when you measure employee performance, it turned out to fit a Pareto distribution better than a Gaussian?"

    Would that be cool? We could posit the implications of all sorts of improbabilities. But I feel more strongly about how cool it would be that P = NP.

    All this aside, being laid off sucks - being pushed out, even when you're a high performer, sucks even more. The truth is that "data science" does not help you process grief the way reading Dostoevsky does, so maybe getting an A in your liberal arts education is valuable even when you are working as a software developer.

  • dogleash 2 hours ago

    To me the biggest insight here is that no matter what data science you're trying to do on a group of employees, the people you already have decided should be fired or promoted from that group are outliers and should be removed from the sample.

    There are certainly times that you would want them included, but those can be classified under "budgeting," not gaining insight on a workforce.

    • ses1984 2 hours ago

      Doesn’t the inclusion or exclusion of these people heavily depend on what type of insight you’re trying to get out of the data?

  • jampa 2 hours ago

    Going through some performance reviews as a manager, I always try to push back a bit against the bell curve. It kinda reminds me of "stack ranking." There are also some factors to be considered:

    If you are in a hiring freeze or not promoting, most of the curve should shift right, assuming you are hiring great people. They will probably perform better quarter after quarter. Some might counter-argue that if everyone performs better, this should be the "new expectation," but I disagree: the market sets expectations.

    If you have someone at a senior level with expectations of staff, for example, they won't be in the company for long. I hired many great engineers who later said they only looked for a new job because they were never promoted despite being overperformers.

  • seiferteric an hour ago

    A lot of focus on employee performance, but relatively little on management performance. I always wonder how a once great company can slowly decline into irrelevance. Take Yahoo for example: it could only be due to management failure over several decades, right? How can companies optimize for management performance?

    • bornfreddy 24 minutes ago

      Firing 10% each year would be a great start in many companies. ;)

  • TrainedMonkey an hour ago

    Employee performance MEASUREMENT appears to be Gaussian distributed. To a first, simple, and let's be real, probably somewhat wrong approximation, there are roughly 3 things that go into it.

    1. There is a certain skill in communicating all the important things you've done, we shall lump likability + politicking into this one for convenience.

    2. There is a premium that is placed on shiny new features and saving the day heroics. A lot less priority is placed on refactoring and solving the problems before they require heroics.

    3. Finally there are the individual's technical and self-management skills, i.e. it's important to work on important things and be good at it.

  • throwaway48476 2 hours ago

    Setting aside the issue of defining a function for 'employee performance', this glosses over the invisible interactions. An employee in a dysfunctional organization will perform worse than if they were in a well-functioning one, because in the well-functioning one they don't have to waste time dealing with people and processes that are a hindrance.

  • philipov 44 minutes ago

    > How much revenue do you think a janitor or café staffer generates? Close to zero. The same goes for engineering. Someone has to do the unglamorous staff, or you end up with a dysfunctional company, with amazing talent (on paper).

    If the company would be dysfunctional without that janitor or software engineer, and not bring in as much revenue as a result, it sounds like the model that attributes close to zero revenue to them is already dysfunctional. If the company can't function without the janitor, then a significant portion of the revenue of the company should be attributed to them.

    • sangnoir 38 minutes ago

      Sounds like you're expecting employers to strive for fairness. Instead, they are striving for profits for the capital class. The labor class gets the minimum possible amount to reach the shareholders' primary goal.

      • philipov 34 minutes ago

        It sounds like you're confusing what they do currently and what the system should be set up to encourage instead. That things are broken right now is not a valid argument in favor of the status quo. The point you make only proves why it is so important that unions should have as much economic power as corporations do, so that the buy and sell sides of the labor market have negotiation parity.

        • sangnoir 31 minutes ago

          I'm being descriptive, not prescriptive: I'm stating what the priorities are under a capitalist system without the rose-colored glasses offered by the Just-world fallacy.

          • philipov 28 minutes ago

            In a well-functioning capitalistic system, the sell side of the labor market has equal power with the buy side. When the buy and sell sides of a market have a huge power imbalance, this leads to market failure, which is contrary to the goals of a capitalist system, as it results in inefficient allocation of capital.

            • sangnoir 15 minutes ago

              Where can one find examples of such a well-functioning capitalistic system? Or is it a thought experiment?

  • wing-_-nuts 2 hours ago

    One reason I'd never work for a company with a 'bottom 10% gets PIP'd' mentality is that it directly conflicts with my goal of self development. Of course I want to be on a great team where everyone performs better than I do. That's how I hone my craft! It just seems really wasteful to have to cull the bottom 10% of every team, even if that team is performing well. I wish there was a list of companies that subscribed to that mentality, so I could avoid them.

  • warrentr 2 hours ago

    In Work Rules!, the book about Google, Bock claims (apparently using a lot of real data from Google) that employee performance follows a power law distribution.

  • irrational 41 minutes ago

    Is it Q4 at a lot of companies? How many companies align their fiscal calendar with the yearly calendar? Our Q4 is March-May.

  • spyckie2 27 minutes ago

    So…

    1) treat poor performers as bad hires and ignore them in your dataset

    2) treat 10x performers as needing to be promoted and also ignore them in your data

    3) treat everyone else as relatively equal

    …and use “Pareto distribution” and “no one has mentioned this before” to write a blog post?

    Is the point of the article to get people who disagree with 10% corporate culling a pseudo intellectual economic buzzword argument to stroke their hatred of an inefficient hr practice? If so:

    1) 10% culling in performance review is a mechanism to cull “bad hires”. I find it difficult to understand how the author can argue it’s a bad practice and then state that you cull bad hires from your dataset without thinking that they are the same thing or at least largely overlapping.

    2) If the author is proposing to separate performance review, culling bad hires, and promotions, into 3 separate systems and assume no overlap, he should think through the structural issues more. While it’s possible to design a management structure where the organization is at a constant state of no bad hires, all 10xers promoted, that is putting a lot of responsibility on individual managers to run review, culling and promotion by themselves at a very high level. It’s brittle - a few bad managers not running the system can easily leave your organization bloated with bad hires and no fallback (fallback = performance review process).

    3) The system of performance review is equally about risk management to the business as it is about rewarding your employees. IMO, the author’s framing simplifies the problem too much and pushes the complexity out for other people to deal with. It’s the kind of thinking that is damaging to organizations… I wonder if there is a process to cull this kind of thinking from your org… wait what time of year is it??

  • uoaei 9 minutes ago

    I suspect you can dig into any metric here and find that they are explicitly determined in terms of an assumption of underlying normality.

  • xmly an hour ago

    Well, managers are trying to make it Gaussian, but the underlying distribution is actually a power law.

  • dmurray 2 hours ago

    > For what it’s worth, human height is also Gaussian, and that’s correlated with workplace success.

    Height is generally not considered to be Gaussian and this is exactly the kind of statistics mistake the author seems to be accusing employers of. Adult height is somewhere between Gaussian and bimodal.

    • timdellinger 2 hours ago

      Fair enough.

      Perhaps better stated as "adult human height is approximately Gaussian for a given biological sex", with an asterisk that environmental factors stretch the distribution.

      I love the anecdote that people born in the American colonies came back to England to visit family, and were remarkably taller compared to their cousins due to environmental factors.

  • nonameiguess 2 hours ago

    It's worth hammering on this point as much as possible hoping a few people listen, but there is at least one other important point about employee performance. If you're allocating bonuses, a single year's performance is probably a good way to do that, assuming you can accurately measure it. When you're talking retention and promotion, though, you're making a prediction of future performance, possibly at a variety of different jobs. That is even harder to do and more poorly reflected in the last year's results. You have some analogies to sports performance in this article, and you see this kind of thing all the time there. Guy does great in a single year, gets a huge, possibly long-term contract, then tanks. On the other hand, one of the better dynasties of the past decade was accomplished by the Golden State Warriors in the US NBA thanks to underpaying one of the all-time great players in NBA history because he suffered a series of ankle injuries early in his career and scared off other suitors. Single-year performance isn't necessarily reflective of a person's true mean abilities, and their place in the Pareto distribution won't be the same at all levels of advancement and responsibility, either.

    The problem, from a company's perspective, is you probably need to retain everyone at least five years, and actually give them a wide variety of assignments in that time, to really get any usable data about their long-term prospects.

    • stego-tech 2 hours ago

      Literally this. I’ve been banging on about this my entire career, not that corporate leaders tend to listen to the riff-raff. Especially in tech companies, they tend to only evaluate promotions and raises based on the past half-year of work, rather than a repeated pattern of successes across a diverse array of tasks and backgrounds over a significant period of time (years); even then, you only get the promotion if you’re on the right team, doing the right work, at the right time, and for the right leader. This leads to otherwise stellar performers going elsewhere, because the janitors, maintainers, and firefighters in an organization never get properly rewarded, respected, or recognized by leaders. Said leaders pass this off as “bad performers”, failing to realize the importance of superb talent working on less-than-stellar projects that keep the company running efficiently.

      The only people who benefit from performance reviews are shareholders whose price pops when layoffs happen, and those who game the system for their own political ends. Top talent never really thrives in these, because they’re too busy doing actually meaningful and important work.

    • hermanradtke 2 hours ago

      > On the other hand, one of the better dynasties of the past decade was accomplished by the Golden State Warriors in the US NBA thanks to underpaying one of the all-time great players in NBA history because he suffered a series of ankle injuries early in his career and scared off other suitors.

      In case people want to read more about this:

      https://www.essentiallysports.com/nba-active-basketball-news...

    • timdellinger 2 hours ago

      Interestingly enough, sports salaries are Pareto-distributed, which says something about how valuable (as assessed by the marketplace) each player is

      https://marginalrevolution.com/marginalrevolution/2024/08/go...

  • datadrivenangel 2 hours ago

    If you assume that people are promoted to their level of incompetence -- terminal responsibility level, then you would expect that level adjusted performance should approach a Gaussian?

    • riehwvfbk 2 hours ago

      No, because there simply aren't enough high-level employees at the top in any given company for a meaningful sample. You'd have to compare across companies; I guess the stock market does that indirectly.

  • 29athrowaway 24 minutes ago

    I guess developers should have a pay structure similar to sales, where you make part of your money from bonuses tied to results. But those results are hard to evaluate because something shipped fast can have bugs found after the reward date.

  • Joel_Mckay 2 hours ago

    "Hey wait - is [arbitrary metrics] Gaussian distributed?"

    =3

  • bparsons 2 hours ago

    Unless you are measuring the output of people on simple assembly lines, it is very difficult to define "performance".

    In a properly functioning team, people perform different, discrete roles which are probably not entirely understood by other team members or management.

  • AtlasBarfed an hour ago

    1) performance reviews are never aligned with employee value, because companies are strongly invested to take excess production from employees and transfer it to management, secondarily shareholders

    2) they are also not aligned with the replacement cost of employees because the religion of management is that labor is effortlessly replaceable and low value

    3) employee retention is not aligned with corporate performance in Machiavellian middle management, it is aligned with manager promotion for things like loyalty and maintaining fiefdom power, budgetary size, headcount, etc

    4) there are no absolute or even directly derived metrics in software development that have ever worked, to say nothing of other positions

    Those are off the top of my head.

  • morkalork 2 hours ago

    If you ever look at traditional human-driven sales data, you'll often see a small percentage of top performers absolutely dominating the total sales volume. So yes, employee performance is not Gaussian at all.
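
    To make the "small percentage dominating the total" idea concrete, here is a toy comparison on synthetic numbers (no real sales data). The Pareto shape `alpha=1.16` and the Gaussian mean/spread are arbitrary assumptions, and the function names are hypothetical.

```python
import random

def top_share(values, top_fraction=0.10):
    """Fraction of the total contributed by the top `top_fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)

def compare_shares(n=10_000, alpha=1.16, seed=0):
    """Top-decile share of total 'sales' under a Pareto vs. a Gaussian model.

    Both samples are synthetic; the parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    pareto_sales = [rng.paretovariate(alpha) for _ in range(n)]
    # Gaussian sales clipped at zero so nobody has negative volume.
    gaussian_sales = [max(0.0, rng.gauss(100.0, 15.0)) for _ in range(n)]
    return top_share(pareto_sales), top_share(gaussian_sales)

if __name__ == "__main__":
    pareto, gaussian = compare_shares()
    print(f"top 10% share, Pareto model:   {pareto:.0%}")
    print(f"top 10% share, Gaussian model: {gaussian:.0%}")
```

    Running `top_share` on a real per-rep sales column would be one quick way to see which of the two pictures the data resembles more.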

  • xphilter 2 hours ago

    Yeah good luck. I don’t think any hr decisions have ever been about data; it’s about following norms. If you can get the rand corp or heritage foundation to adopt this policy then maybe corporations would look into it.

    • timdellinger 2 hours ago

      Interestingly enough, I remember in my younger days being inspired by Rand Corp's 1950's era game theory work on e.g. mutually assured destruction. It later occurred to me that I don't need to be employed by a think tank to write think pieces!

      That being said, I like to think that startups growing into large corporations have an opportunity to be better when it comes to things like performance management.

      • hobs 2 hours ago

        As soon as the market actually incentivizes it, which it almost never does, it will get better.

        Most of the big companies just throw endless interviews, high pressure firings, and a lot of money at the problem and make the people below them solve the rest of the problems.

        They see how much they are paying for the mess, but any medium term effort is torpedoed because of all the other things the business focuses on (lack of resources for the process and training), and other powerful individuals who want to put their own brand on hiring and firing who have significantly more ego than sense.

    • thrance 2 hours ago

      The Heritage Foundation would probably fire every competent employee and replace them with partisan sycophants, like they plan to do with America in Project 2025.