Pre-2022 Books

(notes.lorenzogravina.com)

123 points | by trms 2 hours ago

60 comments

It's one of the reasons I don't want to update my free book about Ruby: https://leanpub.com/rubyisforfun - written by a human, and will stay the same forever I think. The moment you touch it to update it, the date immediately changes from 2022-05-26 to 2026 and all the value is gone.

zerobees an hour ago

I've been consciously doing that for reference books for the past three years because Amazon is absolutely littered with AI-generated non-fiction. I have my own ideological reasons too, but the main problem is that most of that AI-generated reference stuff is just of incredibly poor quality. It's meant to saturate the platform as cheaply as possible, so no one actually does any fact-checking, editing, layout, and so on. They're not even using frontier models for that.

For example, there are multiple evidently AI-generated titles that come up on the front page if you search for "Rust programming", "cybersecurity book", etc. I guess I can't rule out that "Winston Knowles" is a real person, but I'm not gonna bet money on that: https://www.amazon.com/Cybersecurity-Career-Manual-Interview...

adamddev1 an hour ago

And there might be no way to prove you really wrote something Post-2022. I wrote a long article, all by hand. I never used any LLMs, even for searching. I checked it with a couple of AI detection tools and they confidently said that 60% of the article was written by AI.

algoth1 an hour ago

The thing is, LLM token frequencies were derived from human writing like yours, and RLHF rewarded good writing practices, like the em dash. So getting 'detected' on good writing is unfortunately to be expected. My broken ESL English is much safer for now.

timacles 37 minutes ago

Our only hope is to start communicating like DevOps Borat

mannycalavera42 30 minutes ago

hats off

John7878781 an hour ago

Interestingly, you're actually more likely to be flagged as AI if English is your second language.

heffer an hour ago

That tracks with reality, as the majority of people don't have English as their first language. Depending on data sources used for training that could well reflect into AI detection tools.

mohamedkoubaa an hour ago

I track every change in the fiction I write in git. It's not hard proof, but it at least shows how the prose evolved over time and is something like a proof of work.

arkaic 32 minutes ago

It's extremely easy to disprove that lack of hard proof too. One could've individually chatgpt'd every addition before commit for proofreading. The infection of AI just gets into every nook and cranny of the process because it's so easy to reach for it

mohamedkoubaa 30 minutes ago

Right, it isn't proof and I won't claim it is, but it's more diligence than I think is normal. I hope that counts for something.

raincole 44 minutes ago

Every time I attempted to convince people to not use Pangram on HN I got downvoted.

altmanaltman an hour ago

Most "AI detection" tools are BS and report AI writing for all writing. The current issue with AI writing is that it has a very generic, easy-to-spot style if you spend even a bit of time working with it. If everyone in the world spoke in the same manner, used the same punctuation, spoke in exactly the same catchphrases, the world would lose its richness, and that is the problem with AI writing and communications in general - it has 0 personality, and humans by nature engage with strong and unique personalities. Over time, people will realize the futility of using AI in creative writing, or AI will get really, really good at not just sounding human but being human.

Yet, from what I can see, AI writing is mostly used by people who don't know a thing about writing, and because they have bad taste, they do not see what's wrong with AI writing and put it out there.

At the end, you write for a purpose: for marketing copy, etc., you would require a different type of writing talent than something like writing a fiction book. But AI doesn't understand this nuance; it has only a default type of communication, which is highly optimized for being a chatbot. It is possible to write a very good text using AI if you have taste and you know what you're doing, but most people don't.

Similarly, a lot of vibe-coded apps are garbage, but because the people creating them lack software domain knowledge and don't even know what they don't know, they think it's good and put it out.

We have a massive problem here that's not just limited to writing - the promise of AI for the mainstream market is that you can replace domain-specific knowledge and have world-class execution in any vertical with just AI, but that's very overhyped imo and doesn't stop the people who don't have domain experience from trying out stuff with AI and not realizing that what they made is a steaming pile of shit in reality.

dspillett an hour ago

Not just books. When searching for information online, for anything where things haven't changed significantly in the last few years, I definitely favour a post on SO/SE/HN/reddit/etc dated before 2023 over those that are later. And where there are no good looking references before that, the earlier the better.

Of course there are no doubt people out there realising that a fair few of us do this, and are starting to edit posts to pre-date them as a sort of SEO trick…

golem14 an hour ago

In this case, archiv.* might be your friend, since they have time stamped copies.

YesBox an hour ago

You are not alone. I like to read Harry Potter fan fiction [1] and I have started checking the publication date when I'm searching for something new to read. I started doing this passively and realized it after the fact.

Have you ever met someone who could say and do all the right things but never made you feel anything, or your gut was sensing an ulterior motive? It's a magic trick we are all bewitched by at some point in our lives. I suppose I filter by published year because I don't want to think about whether I am being tricked or not.

[1] There are some very talented writers[A] out there who (I assume) cannot do the world building part.

[A] Recent Favorite: https://archiveofourown.org/works/1134255/chapters/2292768

idleplant 16 minutes ago

With Harry Potter (and Star Trek, I believe) fanfiction, I think decline in quality has been typical for the last decade; the fanfiction from the 90s/early 00s is often of much better quality on average just due to the age + other factors affecting who was able to type up and post fanfiction online back then.

tapland 28 minutes ago

There are people churning out 40k word novels daily on AO3, and they get the eyes and feedback that up and coming writers desperately want. Real content is being drowned out and ppl are hurting because of it.

julianeon 7 minutes ago

This will work for a little while but this can't go on indefinitely: no one is going to want to stick to 30+ year old books.

We should probably work on developing standards for what we want in a book instead of clinging to a losing position.

mvkel 2 minutes ago

The glasses are a little rose-colored here. As if anything written <2022 were on stone tablets. LinkedIn slop was human generated before it was ai generated, but it was still slop.

The issue with content >2022 isn't that it's ai-generated per se, it's that it's still slop.

The day ai generates non-slop (it's coming), we won't complain. Nor should we.

Avicebron an hour ago

Props to the author for not mentioning low-background steel.

rzzzt an hour ago

Sounds like a pink elephant exercise. Low-background steel has now crept back into our collective consciousness.

gritspants 11 minutes ago

Some on my bookshelf:

> Crafting Interpreters, by Robert Nystrom

> Re[Coding] America, by Jennifer Pahlka

> Systems Performance, by Brendan Gregg

cryo32 an hour ago

It’s not just 2022 and earlier books. There’s a supply chain problem as well. I’ve seen two older books so far from Amazon which were AI generated copy text with a genuine looking cover on it. Amazon just took the return and probably restocked it for the next victim.

I tend to buy books from second hand book shops and eBay now and usually older or well used copies. A good sign of their authenticity.

andy99 an hour ago

I’m pretty conservative with books and usually only read things based on recommendations anyway. I would rarely read a book published in the last few years just because news of it hasn’t travelled to me yet. I think worrying about AI generated books would really only matter if you’re at the bleeding edge of reading and looking for brand new stuff.

fallat an hour ago

There have always been shit books - let that sink in.

There will be _more_ shit books now, but that's the only difference.

There will probably be a constant rate of "good" books.

vibcdingenjoyer 16 minutes ago

The same happened to music when digital music production at home became more accessible. The hard part is separating the wheat from the ever-growing share of chaff.

mrandish 15 minutes ago

I hate slop as much as anyone else and was, until recently, absolute in my zero tolerance of any LLM use in personal posts and correspondence. Now I've adopted a slightly more nuanced view because I sometimes use an LLM when writing but only in a couple very limited, narrowly focused ways. The first is trying to remember a specific turn of phrase I can't recall at the moment (when you get past 50 this happens a bit more often). The other is breaking up overly long sentences, which is a bad writing habit I've struggled with since high school.

I never let an LLM write or rewrite a post, or even a paragraph, for me. I want to write it myself and I want it to be in MY voice. I think I'm a pretty good writer and I like my writing. However, I suspect those who may be less confident in their writing use an LLM to "check" their rough draft but then succumb to the temptation of just pasting the LLM's output because it "sounds better", it's already finished and... writing is hard. This is always a mistake and no one should do it in a forum like HN. It's rude and we'd much rather hear your words and ideas as you express them.

The sad part is this ends up in an all or nothing between "Never use an LLM when writing a post" and "Have LLMs write posts for you."

unknownian 9 minutes ago

I’m vehemently anti-genAI in creative fields but even I think this metric is stupid and unfair to younger generations and generations to come, even if genAI writing gets better.

If you can’t trust a contemporary writer to not use genAI, then find another interest with true zero trust creative verification (like improv) because copying and cheating on writing has existed for decades.

However, my unpopular opinion is extremist in another way: societies should probably not tolerate AI in creative fields like creative writing at all. If Sora can be shut down, so can other useless and toxic parts of LLMs.

drchaim an hour ago

This will happen with social accounts, news articles... I set the date at pre-2023, but we all have some date in mind. I don’t like it, but it is what it is.

raincole an hour ago

I feel that too. But the reasonable part of me knows that it's just that one generation can't "get" the entertainment of the next generation. It has always been like that.

There are mobile game ads on TV here. My father asked me what the players actually get from paying the game companies money. He still doesn't get it after I tried to explain how it works twice.

coldtea 23 minutes ago

>one generation can't "get" the entertainment of the next generation. It has always been like that.

Given the quality of 2026's entertainment, looks like they had a point. And likely they had one in 2006 and 1986 and 1966 too.

bashmelek 39 minutes ago

I’m in my mid-30s, and have never written a book, but I still sometimes think of it. I know it isn’t too late. I still want to create my own applications, but I once used the Google AI result in a utility function. Is it all tainted? I still want people around me to try in earnest.

atrus 23 minutes ago

It's replies like this that make me wonder which is more demotivating to artists. LLMs, or people screeching "it's ai" regardless of proof.

It has to be absolutely demoralizing to make something on your own, and have it immediately labelled ai by someone who can barely spell it.

wenbin an hour ago

If contents are generated instantly via llm and packaged as books, videos, podcasts, pull requests etc, then they don’t deserve human attention.

bonoboTP an hour ago

In good hands, it can be a great tool, but you usually don't notice that. The issue is that AI allows for a superficial appearance of quality and it takes time to discover that the content is void of deeper insight.

tyre an hour ago

Hot take: I think it’ll be pretty much the same as it was. If anything it will get better.

You will still have gatekeepers and taste makers. Publishing houses will screen fiction for well-written and interesting fiction. Word-of-mouth, personal recommendations, and endorsements from people you respect will continue to outweigh algorithms, if you care.

For cheap reads, how much of a difference is there between James Patterson’s 734th beach read thriller and what an LLM with a 50m token context window can produce? Does it matter that it’s not written by six ghostwriters? Probably not to the median Hudson News buyer.

For non-fiction, it’s easier to gather research and related materials. If you were cherry-picking facts to make a narrative, yeah, that’s easier, but it’s not like we haven’t gotten really good at that anyway. Again, there will be cooling off periods for scholarship to be debated and coälesce.

What will get better is people asking questions and getting well-researched pieces on a specific niche or confluence of topics. AI is just-good-enough-to-be-dangerous now. It will get better. We’ll learn to harness it (literally) to iteratively fact check and cite sources. We will build repositories with heavily sourced facts for it to build upon. It will be pulling together “truths” that can be traced, then incrementally adding inference across those, which can then be verified and are a new fact.

I read a lot. I love, love, love new and original authorship. I deeply value writing as a craft. There will be a lot of garbage. More than there is now, at an incredible rate.

And we’ll figure it out.

tyre an hour ago

My worry is less about scholarship than the next generation of readers and authors. It is too easy to be lazy right now. Too easy to skip the difficult work of struggling with ideas. Yapping with Claude probably (?) doesn’t have the same rate of retention and reinforced learning _in humans_ as digging through source material and writing by hand.

Growing critical thought, in my experience, has always been the much harder problem. Not sure we’re in for a good time on that front.

mikgp an hour ago

The James Patterson point is spot on and, to expand on your point, the internet arguably took the tastemakers / gatekeepers down a peg; AI could be what brings them back.

zeroonetwothree an hour ago

I don’t find this at all. Not that many fiction books use a lot of AI prose it seems. Maybe nonfiction is worse?

seliopou an hour ago

What facts are you hanging that hat on?

bbg2401 an hour ago

I'm certainly observing AI smells from a high proportion of the books I read from O'Reilly and Packt since 2023. Authors don't attempt to hide it and some publish the work as if we didn't have a back catalog to distinguish the genuine article from a lazy prompt-driven manuscript.

I'm not seeing the same from the translated fiction works I've picked up in the same time period, thankfully.

cat-snatcher 27 minutes ago

No, just you

psadri 19 minutes ago

We are witnessing the birth of the term "pre-times" you often see in dystopian sci-fi.

torben-friis an hour ago

I haven't seen any minimal sign that any of the fiction books I read lately was LLM helped. Writers seem like a particularly anti AI crowd too.

Has anyone? Now I'm curious if it's just my particular bubble.

bonoboTP an hour ago

What makes you think you'd recognize it? Do you work a lot with LLMs for fiction writing?

torben-friis an hour ago

More than you can imagine. OKRs, brag documents, quarter reviews....

Kidding aside, I would be surprised if something larger than using it as a thesaurus/corrector is slipping by. Literature is genuinely hard.

casey2 an hour ago

C-c C-v existed well before 2022. Most of history until the Renaissance consisted of the bulk of scholars copying out of "the book", whatever the book happened to be (Euclid, The Bible, Aristotle’s Logica Vetus, Cicero's Orations and De Officiis, 四書五經, 史記, 文選).

The liberal concept that the everyman should have their own original thoughts that others should consider is historically a very new concept. And we start getting things that look a lot like C-c C-v quickly after the Renaissance.

See, humans have the tendency to romanticize the past, and if this is allowed to compound they elevate really quite dismal people to the realm of literal godhood in some cases. If you asked someone a thousand years ago what they thought life was like thousands of years in the past and what it would be like thousands of years in the future, most would have said the past was better in all regards, including health, strength, morals, even technology, while the future would be viewed as a continual circling of the drain. Put yourself in their shoes: you go look at a Roman Colosseum, you can't build that, nobody you know can build that. If you asked Vitruvius during the construction of the Aqueducts, he would tell you that he's maintaining the knowledge of his ancestors, who could have built such structures if they needed them or had the manpower, and that the technical problems are just a trifle. If you pushed him, he might invoke Providentia and say that if the gods stopped blessing you we'd fall even faster.

This kind of discovered-then-lost narrative fits the human psyche better than the unintuitive idea that truth is a constructed social conversation, one that can be semi-formal and rigorous (the scientific method) or lax (common sense) depending on the setting.

ares623 an hour ago

Same with open source projects (or other software projects in general).

Pre-2022, when someone posts a Show HN, even if it's not something you would normally be interested in, there's a baseline understanding that _someone_ cared enough to spend time and effort to build it. So in a hypothetical future scenario if you do find yourself looking for that particular tool, there was value in you seeing that Show HN so you can revisit it.

Now, I just ignore all Show HN posts.

viccis an hour ago

api 2 hours ago

I honestly find this a little deranged. I’ve read some AI generated prose before and it’s… boring. It tends to be the mathematical average of all stories, with plots that are heavy on cliche and tropes played straight. If I read a book like that it’s probably just going to be a bad book that I don’t finish. Humans write lots of boring bad books too.

Eventually artists will figure out how to use AI to make real art that is actually good, just like photographers did with photography, and that will be its own new thing. I don’t see much of that yet but with photography it took a while.

bonoboTP an hour ago

Controversial, but I think photography is still nowhere close in artistic value to paintings. Yes, I've seen the award winning ones etc. Not impressed. It's fine I guess, but not more. Same with laptop music vs instrument music even before AI.

adamddev1 an hour ago

We can't say "just like photographers did with photography" or "just like programmers did with higher-level languages." These developments are not analogous to LLMs. The jump into probabilistic text-guessing machines is a fundamentally different thing.

actionfromafar 2 hours ago

Sure, but AI is about more than the arts. It's about high fructose corn syrup slop everywhere.

api an hour ago

That predates AI and has more to do with the incentives baked into media, especially social media that’s all about “time on app” and “time on site” and therefore infinite scroll brain rot.

actionfromafar an hour ago

It does predate AI. AI makes it much faster though and can close the rot-loop.

api an hour ago

Get off social media. It’s trash. It was trash before AI and it’s trash now.

bonoboTP an hour ago

HN is social media. And yes, let's get off it too. It's true.

phendrenad2 an hour ago

Maybe we'll return to curation and talent scouts. Like the pre-internet days of music: everyone had a demo tape, but nobody wanted to listen to hundreds of tapes of slop.

ghaff an hour ago

Well, with writing it's more editors and lunches. That's how I got a book contract.