Some of the expanded scope that I’ve done almost for free is usually around UX polish and accessibility. I even completely redid the --help for a few CLI tools I have when I would never have invested over an hour on each before agents.
I agree that the efficiency and quality are very hard to measure. I’m extremely confident that when used well, agents are a huge gain for both though. When used poorly, it is just slop you can make really fast.
Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got taken for a ride. We don’t really (or I don’t anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to both completely overestimate complexity and underestimate it too. Sometimes the LLM makes a seemingly daunting refactor be super simple and sometimes something seemingly not complex can take it forever… and there really isn’t, for me, a good “gut sense” of how something will go.
So lately I’ve just decided that I’ll time box things instead of set defined endpoints. And by “endpoint” I really mean “I’m done for the day” and honestly maybe thinking about it… “I’m done with this project”.
I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that started with just pulling in some document and ends with you completely repartitioning your file system. lol.
The biggest positive I have seen is not so much in the new tools, but in new ways to convince the higher ups to do sensible things.
We always find that small teams of locals can do much much more than a team with an unlimited number of low cost "developers". Not just because the competence of low cost devs is poor, but also the structure of how you work changes for the worse with a bigger team, for the worse with a distributed team, and for the worse with a skill-diverse team.
Thats before you get into the cultural flaws of favored destinations like India.
So we have been able to argue things like add one local + ai is better than about 20-100 Indians, depending on role and business structure needed to manage low-competence low-trust "developers". So we are planning to completely on-shore in the near future.
The bean counters are happy, and the quality of the work is improving.
> I’ve had the idea that from a social perspective it’d be regarded like plastic surgery, in that it only looks weird when it's over-done, or done badly.
An important aspect of comparison is that nobody is going to tell you that your surgery is noticeable or looks bad.
Your friends, family, partners, coworkers, aren't going to say anything, neither are people you meet casually, certainly not service workers, strangers aren't going to pull you aside to tell you the truth about your nose job, etc.
I hope the same social taboo doesn't transfer over to AI content. We should honestly critique AI generated content, used either in-whole or in-part with human creations. If the inclusion of AI content botched your article, saying so should be socially acceptable.
We saw some of this here on HN. It used to be that when AI content would be submitted here, it was a social faux pas to even mention it was LLM generated, same thing with LLM generated comments, no matter how obvious it was. Mentioning a comment was AI was socially verboten and you'd be finger-wagged at.
Eventually, AI fatigue caused the community to discount Show HN entries, submissions and comments, and the signal to noise ratio could no longer be ignored.
Now, turn on showdead. Those same comments, that users were expected to interact with as if they were made in good faith by real people, litter every submission's comment section. These comments objectively hurt discussion and it's a good thing they're shadowbanned.
Culturally, I hope we can reach a point where critique of AI content, including code, doesn't brand critics as haters, Luddites, or worse, and stifle conversation about what our communities really value and want.
> Now, turn on showdead. Those same comments, that users were expected to interact with as if they were made in good faith by real people, litter every submission's comment section.
One big issue I've found is that HN seems to automatically shadowban comments from all new users, no matter the content. I used to try to change handles every so often because HN doesn't allow people to delete their comments after the first hour, which becomes a bigger and bigger privacy issue over time (and frankly, extremely hostile to users). Especially for those of us who don't use AI, our individual writing styles are likely identifiable over a long enough period of time.
But the last few times I tried it, all of my comments were immediately shadowbanned. No notification or any indication on the new account, but if I checked with an older account, the comments were all "dead." I try to put effort into my comments, reading through the entirety of the comment I'm replying to (often multiple times), proofreading them myself (I never use AI), and linking to any claims I'm making. All of this takes considerable time. It's extremely frustrating to put that kind of effort into a comment and have it autobanned. It's even more frustrating when the system deceives you and makes you believe it's been posted, and you have to check with another account to learn that it was actually set to dead.
Supposedly there's a desire for comments that people put effort into and aren't written by AI. But why would new users bother putting in that work when their comments get automatically and secretly killed, without them having any way of knowing?
I'm starting to think that the best solution is to move away from these types of online communities in general.
It's the same way with writing as with video. There are some videos now where it's actually hard to tell. You can only tell it's AI when it's bad. When it's done well, you don't even know it's AI.
So it creates this selection effect where people only associate AI with fake and bad. The good stuff, they don't associate with AI at all.
But there is also the case where you see polished apps that are AI-generated.
It's like those AI websites: they look "sleek" but all look the same, versus a crappy site that doesn't look as pretty but looks very human. I don't know quite how to put it.
It's funny you mention that. The only difference is sometimes you need a functionality without doing the plumbing. At the end of the day if you're getting the output you need, the process doesn't matter. It's an interesting analogy but only works if the inspector is another expert dev.
When I have such a moment and I take a step back, there’s usually a strong hint that there’s a meta problem behind those instances. And while you have to choose when to take the time to solve such a problem, it’s usually worth it.
I would agree with the utility of Claude and Claude Code. Claude feels like your own executive assistant, sales team and IT department. Combine that with Claude Code and you can build some incredible things. Myself as an example, I used Claude to advise me on starting a business and building an MVP. After a few weeks of refinement I was able to create something I never could have done without Claude. It is a game changer for sure.
Several of my friends who don't know any programming are creating video games and music software with AI agents.
Much of what they are doing is incomprehensible to me. I often find that being a programmer is actually holding me back in this regard, because I feel the need to understand everything the code is doing, as well as the specialized knowledge (e.g. the math involved in audio processing and sound effects). Whereas my friends can just say... yeah add a phaser effect to the synth and it just does it.
Have they shipped anything that people are using? The concerns are different and creating something usable by people is why software engineering exists.
I think that’s where software engineering is not quite getting what’s happening right now. People keep asking: where are the apps? Where is all this great code? And the answer is becoming that people aren’t building apps to sell to other people. They’re building the apps that they themselves want to use. I’ve made dozens of apps that I have no interest in distributing or using outside of friends and family. The AI coding revolution is already here and it’s not in production software so much as it is in bespoke small group applications.
I'm an early-mid career SDET/SRE. Currently building a full web app with REST API integrations, looks great and works great. Lots of functionality, useful and being hardened, all b/c of AI. It's going to be live with customers soon I hope.
AI is not a feature of the product. GTM will be interesting, have some good ideas.
It's really up to you to be clever. I've never used Js/Ts/node/these apis etc. I started programming as a non-cs engineer to automate stuff and then got into SWE. This is truly an amazing time.
If it was any good at sales I'm pretty sure a company I did a contract for would be thriving by now. Instead they have a product that is ~500 times faster than the competition, with better UX for the most common activity in that field and much better built-in analysis tools for end results, run in real-time (which competitor software cannot do). Sure, it's not a massive market in terms of demographics, but I'd expect real sales people to succeed with what they have. Something very real has gone wrong with sales and it's not something they've been able to solve using LLMs.
I know this company uses LLMs, because I'm working on another project for them where one of the co-founders is relentlessly spamming the repo with overwrought Claude Code output like there is no tomorrow. This shit sucks at code generation and it most likely sucks at everything else too, except people often assume it's better at things they don't know about.
> (The) Output was coherent but its ‘style’ was very boring and overtly inoffensive, which was (and still is) a clear limitation of the technology.
The style isn’t a limit of the technology, it’s a limit of the lobotomized models from OpenAI and Anthropic. The open source community has lots of models that are great at creative writing.
Generating AI Content sucks, Consuming AI Content sucks, but combine them in the same loop and it's really addicting. AI Content Prosuming rocks.
Since LLMs, if I see a video I think is interesting, I take the transcript, feed it into an LLM, I summarize it and ask it a couple of questions.
I've turned 12 minute videos back into the 5 phrases news it was based on.
I suppose that when you're the one generating the request, it feels more personal. It is also very interesting that most LLMs respond like a normal person when you talk to them directly, but suddenly adopt the more annoying blogger speech patterns when you tell them 'create content'.
> I've turned 12 minute videos back into the 5 phrases news it was based on.
Why not read the original news?
Okay, there are many reasons why you might not want to do that, such as ads, tracking, having to pay for a subscription if you only want one article, and just plain boredom. I wasn't trying to call you out, it was more of a question for society at large.
Why has it become more appealing to have a "content creator" turn 5 phrases of news into a 12 minute video and then have an LLM convert it back, rather than reading the 5 phrases?
It's not that it's appealing. For example, I wanted to learn how to bend notes on harmonica, but it wasn't working. That's not something you can really understand without video, yet most tutorials are 5-15 minutes long and only show the actual technique at some random point in ~30 seconds (just search 'how to bend on harmonica' and see). So I take the transcript to check whether it's a method I've already tried or something new worth watching, and I also get an extra explainer of the technique in text.
Also, with videos like "what X said about situation Y in discourse Z". Sometimes you're just curious, and you can't realistically extract that efficiently from a full one-hour speech on a geolocked, untranscribed mass-media website, so it's easier to summarize the transcript of the 12 min video directly.
As for why everything is 12 minutes long, it's most likely because content creation isn't optimized to teach you anything or be useful, it's optimized to maximize watch time so platforms can serve more ads to you. The pattern is: I got you intrigued in something; you want the answer? pay me your time.
the 'claude creep' framing is real but there's a flip side worth naming. the synchronous pair programming case - AI as a faster version of you - is what most productivity debates focus on. the more interesting shift is genuinely async delegation: you define the task, set acceptance criteria, kick it off, come back to results. that's a different relationship to the tool entirely, and it forces you to get better at specifying what 'done' looks like upfront.
that's actually a good forcing function. most productivity loss from AI-assisted work comes from underspecifying the task, not from the AI being bad.
The section about being "glazed" into action resonates. Hidden within this concept I think is something profound about human motivation, innuendo and all.
> AI generated prose is at best boring, and at worst genuinely unappealing. I’m continually tempted, because in theory it should work well. The AI has perfect spelling and grammar, has more than enough context to produce article-length content, and can do in seconds what takes me hours.
I have a thesis in mind...that there is something fundamental to the human spirit that relishes a sort of friction that LLMs cannot observe or reproduce on their own.
> I remember the first time I vibe-coded a small project. It was an app that generated placeholder cards for my MTG collection. I prompted the bot (now Claude, not ChatGPT).....
I would be interested to know what date this was. If it was recent, I am surprised that Claude didn't one-shot this.
The Gartner hype cycle has 5 phases: tech trigger (6 months - 2 years), peak of inflated expectations (6 months - 2 years), the slope of enlightenment (2 - 5 years), the plateau of productivity (5+ years), and the slope of decline (obsolescence, which no one talks about). If we are in fact at the 40th month then we are either approaching the peak of inflated expectations, the slope of enlightenment, or the plateau of productivity. I would say we are probably approaching the peak of inflated expectations. We are constantly hearing the symptoms of the 'This Time is Different' Syndrome from people saying the old rules don’t apply, which is the classic sign the peak is approaching. The average financial bubble bursts after 3 years; however, the dot-com bubble burst 5 years after peak and the housing bubble took 3-4 years. We are probably in the “bubble mania” phase right now because of all the irrational exuberance. Ride the Lightning!
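For anyone who wants to check the month-40 arithmetic, here is a rough sketch in Python. It assumes the phase durations exactly as quoted in the comment above (they are the commenter's figures, not official ones), leaves out the "slope of decline" because no duration was given for it, and uses a made-up helper name, `possible_phases`.

```python
# Phase durations in months, taken from the comment above (assumed figures).
# None means "no upper bound" (the "5+ years" plateau).
PHASES = [
    ("tech trigger", 6, 24),
    ("peak of inflated expectations", 6, 24),
    ("slope of enlightenment", 24, 60),
    ("plateau of productivity", 60, None),
]


def possible_phases(month):
    """Return every phase that could contain `month` for some choice of durations."""
    candidates = []
    earliest_start = 0  # start if all earlier phases were as short as possible
    latest_start = 0    # start if all earlier phases were as long as possible
    for name, shortest, longest in PHASES:
        latest_end = None if longest is None else latest_start + longest
        if month >= earliest_start and (latest_end is None or month < latest_end):
            candidates.append(name)
        earliest_start += shortest
        if longest is not None:
            latest_start += longest
    return candidates


print(possible_phases(40))
```

Under those assumptions, month 40 is compatible with the peak, the slope of enlightenment, and the plateau, which matches the three options listed in the comment.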
A big part of the benefit of AI has nothing to do with AI and everything to do with leading pointy-haired bosses around. They won't approve needed refactorings, but promise to integrate AI and suddenly budget is no problem. Just add an easily removable chatbot afterward and you're golden.
The stupid thing is that instead of using AI to give ourselves 1 hour work days, we’re just cramming more work into the same amount of time we’ve always worked.
Yeah, I think this is always the case. We get more powerful tools that do the same thing in a fifth of the time, and now we are asked to do 5x more. Capitalism.
Do you regularly find text content that you know is AI written (but is not marked as such)? Because honestly I don't, and it must exist in decent quantity by now. Or perhaps it's still sparse?
Have a look here [1] and here [2] - I think they are good resources, but fallible in the long run. I think yes, I do, often confirmed by communication with people I know (i.e. I suspect they have used AI to make something -> I ask). This falls victim to confirmation bias, though. I suspect a nontrivial amount of writing I read is AI generated without me realising, and I'm wary also of falsely flagging content as AI-generated when it is actually from humans.
- Other source-to-text integrity issues; for example, the WWF source says very little about Malaysia specifically, only mentions Sunda tigers (Panthera tigris sondaica), and does not mention tapirs at all
- Very short yet consistent paragraph length
- Generic "see also" links, one of which is redlinked
This is not the sort of thing that I pay attention to unless I'm doing detailed research. And even then I'd probably have a bot check these for me, ironically, since it's such a mechanical job. At the very least detecting AI like this requires conscious effort.
I think the second resource that you linked to is valuable. The first is useless unless you're a Wikipedia editor, the significance of verifying citations notwithstanding.
The gap between LLM-generated writing and the composite style of the average Wikipedia page is more narrow than most people may believe.
You will start to recognize it over time. The major AI models each have their own voice and patterns that they overuse.
The more you see those patterns the more you start recognizing them. By now I can recognize quickly if a blog post or README.md was generated by Claude or ChatGPT because the signs are so obvious.
Even Hacker News comments that are AI written are easy to spot if they weren't edited. I know I'm not alone because when I recognize an AI comment I check their comment history and find other people calling out their AI-generated submissions, too.
Learning how to recognize the output of the popular AI models is becoming a critical business skill, too. You need to be able to separate out the content from someone who was doing real work that you should take seriously as opposed to the output of someone who is having ChatGPT produce volumes of text that they don't review. The people who do that will waste your time.
I don't see how to interpret your claims. How do you yourself know that you're right when you "recognize" Claude or ChatGPT? How do you know how much of the text you don't recognize as any LLM is actually LLM-generated? My recollection is whenever I've seen data on this--the educators who think they can spot students cheating--the conclusion is people are really bad at identifying LLM-generated content.
It’s very obvious if you leave the default tone. If you specifically ask it to hide its ai voice and make it appear human, it does a really good job. Even better if you give it an example of the writing style.
Ask it to write in the style of patio11 or someone else with a distinctive tone, and it will do a remarkable job.
It will pass pretty consistently. Not sure I love it.
This is a temporary problem. Look at how fast things are progressing. Things will improve until none of this matters because the output is indistinguishable.
Yes, often, and often here on HN or Substack if I point it out, it doesn't lead to anything good. Many don't recognize it, many do, the author gets defensive etc.
This article doesn't have the tells, it looks human written.
I found that many people don't have a radar for this. They may know about delve, emdashes, tapestry, multifaceted or "not just X but y" and if these are not there they don't see it.
Bro, but... you now have a business that is planned by a paid chatbot. They can shut it down anytime or make it more expensive. Also, it is impossible to get something new: you are copying from somewhere else, and maybe what Claude is copying has a copyright on it, like leaked code, etc. Also, your brain will slowly shut down from thinking about 'business', so you will rely heavily on Claude in the future :)
My friend is trying to do the same, the Docker stack he made for his SaaS is really amazing, it is following the standards from the ancient age.
I suspect you'll (a small-medium business) be able to buy a Claude 4.6-class rack mount device for $6000 by 2030 that does 100 t/s with 1 million token context, which honestly, is probably adequate for an office (front office, back office, executive tier etc) of 10-300 unless you've got more than 4 engineers on staff. That kind of offline device is going to push everyone to provide that kind of cloud-enabled baseline service at very low cost. The Qwen 3.5 series is already showing you can almost (but not quite) squeeze that kind of performance out of consumer hardware. 256/512gb consumer video cards will get us there, eventually, if capacity ever catches up with demand.
You would need to go back to McCulloch, Pitts and von Neumann in the 1940s if you wanted to talk about where it really got started.
The 1970s were a long, dark AI winter, thanks to FUD spread by Minsky and Papert. A lot of recent work could have been done back then despite the lack of good hardware, as seen in the other HN story where the guy trained a transformer on a PDP-11. But the whole field was radioactive after their book came out.
> While I’m certain that this technology is producing some productivity improvements, I’m still genuinely (and frustratingly) unsure just how much of an improvement it is actually creating.
I often wonder how much more productive I'd be if just a fraction of the effort and money poured into LLMs was spent on better API documentation and conventional coding tools. A lot of the time, I'm resorting to using an AI because I can't get information on how the current API of something works into my brain fast enough, because the docs are nonexistent, outdated, or scattered and hard to collate.
This is facts. All of this talk about putting agent skills directly into repos (as Markdown!) is maddening. "Where were LITERALLY ALL OF YOU whenever the topic of docs as code came up?"
This is doubly maddening with NotebookLMs. They are becoming single sources of knowledge for large domains, which is great (except you can't just read the sources, which is very "We will read the Bible to you" energy), but, in the past, this knowledge would've been all over SharePoint, Slack, Google Drive, Confluence, etc.
> I often wonder how much more productive I'd be if just a fraction of the effort and money poured into LLMs was spent on better API documentation and conventional coding tools.
Probably negligible. It's not a problem you can solve by pouring more money in. Evidence: configuration file formats. I've never seen programmers who enjoy writing YAML. And pure JSON (without comments) is simply not a format that should be written by humans. But as far as I know, even in the richest companies these formats are still common. And the bad thing they were supposed to replace, XML config, was popularized by rich companies too...!
JSON is not designed as a configuration file format.
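To make the "no comments" complaint concrete, here is a small sketch using Python's standard `json` module. The config content and the helper name `parses_as_json` are made up for illustration; the point is only that strict JSON has no comment syntax, so an annotated config fails to parse.

```python
import json

# Hypothetical config a human might want to write, with an explanatory comment.
CONFIG_WITH_COMMENT = """
{
  // retry a few times before giving up
  "retries": 3
}
"""


def parses_as_json(text: str) -> bool:
    """Return True if `text` is accepted by the strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(CONFIG_WITH_COMMENT))  # the comment makes this invalid JSON
print(parses_as_json('{"retries": 3}'))     # same data without the comment
```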
I feel like Google search results have gotten tremendously worse over the past 2 years too. It's almost like you have to use AI search to find anything useful now.
Which of course reduces traffic to sites and thus the incentives to create the content you're looking for in the first place :(
There’s many groups that “win” by making search results worse. It’s an ongoing battle between them, and if someone’s blaming solely Google for it, they’re way oversimplifying.
My favorite thing is when some projects now have better documentation in their Claude skills or MCPs than they ever did for users.
There is a natural incentive for engineers working on a project to keep Claude skills up to date. I cannot say the same for general documentation.
But that documentation itself is likely AI-generated
At least it saves me from having to generate the docs myself!
Why continue involvement with a project that clearly devalues their “customers” or “users” who care about documentation?
Projects that spend time on documentation for my robots have shown me they care about my use case!
As someone who does broad activities, it supercharges a lot of things. Having a critical eye is required though. I estimate 40%-60% improvements on basic coding tasks.
I don't bring huge codebases to it.
Yeah I get this impression too. AI feels like it's papering over overwrought and badly designed frameworks, tech stacks with far too many things in them, and also the decline of people creating or advocating for really expressive languages.
Pragmatic sure, but we're building a tower of chairs here rather than building ladders like a real engineering field.
then you should be delighted we have LLMs. one of the use cases they are best suited to is writing documentation, which they do much better than humans can.
Good is debatable. The docs I want point out the weird shit in the system. The AI docs I've read are all basically "the get user endpoint can be called with HTTP to get a user, given a valid auth token". Thanks, it would have been faster to read the code.
They write good _looking_ documentation. How good those docs are is entirely on the person/people who prompted them into existence.
Please don't inflict LLM docs on people
> To what degree did I expand scope because I knew I could do more using the AI?
Someone at work recently termed this “Claude Creep”. It’s so easy to generate things that it pushes you towards going further, but the reality is that you’re setting yourself up for more and more work to get them over the line.
If you’re an employee who can finish their work 25% faster but you’re not getting a 4-day work week, what are the incentives for not introducing creep?
Just sitting around and thinking of solutions to your own problems beats giving yourself work. As refactor_nietzsche would put it, resource slack is what lets you be a refactor_master instead of a refactor_slave. If you feel pressured into self-imposed creep, it's probably because you've internalized the idea that having too much slack makes you look dangerous to your superiors, so you default to playing the worker bee.
the flip side of claude creep is that the easy parts are now genuinely free, which means all your time goes to the 30% that was already hard. ai doesn't save you time on the hard bits, it just eliminates the excuse to not have done the easy bits first.
what's helped: think in postconditions, not tasks. instead of 'add feature X', define 'the tests pass and the user can do Y'. the agent figures out what X means. without that anchor there's nothing to mark as done, so scope drifts indefinitely.
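To make the postcondition idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `Postcondition` class, the `unmet` helper, and the toy `app_state` dict standing in for a real project); it only illustrates writing down "done" as checkable facts rather than as a task description.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Postcondition:
    """One observable fact that must hold before the task counts as done."""
    description: str
    check: Callable[[], bool]


def unmet(postconditions: List[Postcondition]) -> List[str]:
    """Return the descriptions of every postcondition that does not hold yet."""
    return [p.description for p in postconditions if not p.check()]


# Hypothetical stand-in for the real system; in practice the checks would
# run the test suite or probe the running app instead of reading a dict.
app_state = {"tests_passed": True, "export_formats": ["csv"]}

done_when = [
    Postcondition("the test suite passes", lambda: app_state["tests_passed"]),
    Postcondition("the user can export as PDF",
                  lambda: "pdf" in app_state["export_formats"]),
]

remaining = unmet(done_when)
print(remaining)
```

An empty list from `unmet` is the anchor for "done". Anything the agent proposes beyond that is scope creep by definition.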
100%. Over the years I've amassed hundreds of code boilerplate snippets/templates that I would copy and paste and then modify, and now they're all just sitting in Obsidian gathering dust. Why would I waste my time copying and pasting when I can just have Claude generate basic Ansible playbooks for me on the spot in 30 seconds?
An idea is to have the AI ingest your templates, it might be useful
Cognitive overload? For me, it’s easier to construct a mental model of the thing than have a full (sometimes complex) example which may not necessarily be valid and/or on point. And as I’ve been exploring some foundational ideas of computing, you can do away with a lot of complexity in modern development.
basically readme driven development at that point.
It's a matter of whether you are just writing more regular quality things, or whether you are improving the quality of what you write. There's many things that increase quality, but are time consuming, which Claude Code can do for you.
One thing I recently did was run a pass over some unit test and functional test suites, asking for standardization on initialization, and creating reasonable helper methods to minimize boilerplate. Any dev can do that, if they have a week, and it'll make future code changes more pleasant. For Claude, an hour was a -8000 line PR that kept all the tests, with all the assertions.
It's what people need to figure out about a codebase. Our normal quality practices have an embedded max safe speed for changes without losing stability. If you use LLMs to try to change things faster, the quality practices have to improve if one wants to keep the number of issues per week constant. Whether it's improving testing, or sending the LLM to look at logs and find the bugs faster, one needs to increase the quality budget.
Some of the expanded scope that I’ve done almost for free is usually around UX polish and accessibility. I even completely redid the --help for a few CLI tools I have when I would never have invested over an hour on each before agents.
I agree that the efficiency and quality are very hard to measure. I’m extremely confident that when used well, agents are a huge gain for both though. When used poorly, it is just slop you can make really fast.
Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got taken for a ride. We don’t really (or I don’t anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to both overestimate complexity and underestimate it too. Sometimes the LLM makes a seemingly daunting refactor super simple, and sometimes something seemingly not complex can take it forever… and there really isn’t, for me, a good “gut sense” of how something will go.
So lately I’ve just decided that I’ll time box things instead of set defined endpoints. And by “endpoint” I really mean “I’m done for the day” and honestly maybe thinking about it… “I’m done with this project”.
I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that started with just pulling in some document and ends with you completely repartitioning your file system. lol.
And just like that, a new term has been coined.
The biggest positive I have seen is not so much in the new tools, but in new ways to convince the higher ups to do sensible things.
We always find that small teams of locals can do much much more than a team with an unlimited number of low cost "developers". Not just because the competence of low cost devs is poor, but also the structure of how you work changes for the worse with a bigger team, for the worse with a distributed team, and for the worse with a skill-diverse team.
That's before you get into the cultural flaws of favored destinations like India.
So we have been able to argue things like: adding one local + AI is better than about 20-100 Indians, depending on role and the business structure needed to manage low-competence, low-trust "developers". So we are planning to completely on-shore in the near future.
The bean counters are happy, and the quality of the work is improving.
Nice observation about AI-generated content:
> I’ve had the idea that from a social perspective it’d be regarded like plastic surgery, in that it only looks weird when it's over-done, or done badly.
An important aspect of comparison is that nobody is going to tell you that your surgery is noticeable or looks bad.
Your friends, family, partners, coworkers, aren't going to say anything, neither are people you meet casually, certainly not service workers, strangers aren't going to pull you aside to tell you the truth about your nose job, etc.
I hope the same social taboo doesn't transfer over to AI content. We should honestly critique AI generated content, used either in-whole or in-part with human creations. If the inclusion of AI content botched your article, saying so should be socially acceptable.
We saw some of this here on HN. It used to be that when AI content would be submitted here, it was a social faux pas to even mention it was LLM generated, same thing with LLM generated comments, no matter how obvious it was. Mentioning a comment was AI was socially verboten and you'd be finger-wagged at.
Eventually, AI fatigue caused the community to discount Show HN entries, submissions and comments, and the signal to noise ratio could no longer be ignored.
Now, turn on showdead. Those same comments, that users were expected to interact with as if they were made in good faith by real people, litter every submission's comment section. These comments objectively hurt discussion and it's a good thing they're shadowbanned.
Culturally, I hope we can reach a point where critique of AI content, including code, doesn't brand critics as haters, Luddites, or worse, and stifle conversation about what our communities really value and want.
> Now, turn on showdead. Those same comments, that users were expected to interact with as if they were made in good faith by real people, litter every submission's comment section.
One big issue I've found is that HN seems to automatically kill comments from all new users, no matter the content. I used to try to change handles every so often because HN doesn't allow people to delete their comments after the first hour, which becomes a bigger and bigger privacy issue over time (and frankly, extremely hostile to users). Especially for those of us who don't use AI, our individual writing styles are likely identifiable over a long enough period of time.
But the last few times I tried it, all of my comments were immediately shadowbanned. No notification or any indication on the new account, but if I checked with an older account, the comments were all "dead." I try to put effort into my comments, reading through the entirety of the comment I'm replying to (often multiple times), proofreading them myself (I never use AI), and linking to any claims I'm making. All of this takes considerable time. It's extremely frustrating to put that kind of effort into a comment and have it autobanned. It's even more frustrating when the system deceives you and makes you believe it's been posted, and you have to check with another account to learn that it was actually set to dead.
Supposedly there's a desire for comments that people put effort into and aren't written by AI. But why would new users bother putting in that work when their comments get automatically and secretly killed, without them having any way of knowing?
I'm starting to think that the best solution is to move away from these types of online communities in general.
Your friends, family, partners, coworkers, aren't going to say anything about your natural appearance either. Unless they're super rude.
People will tell you if you're good looking
It's the same way with writing as with video. There are some videos now where it's actually hard to tell. You can only tell it's AI when it's bad. When it's done well, you don't even know it's AI.
So it creates this selection effect where people only associate AI with fake and bad. The good stuff, they don't associate with AI at all.
But there is also the case where you see polished apps that are AI generated. It's like those AI websites: they look "sleek" but all look the same, versus a crappy site that doesn't look as pretty but looks very human. I don't know quite how to put it.
It's funny you mention that. The only difference is sometimes you need a functionality without doing the plumbing. At the end of the day if you're getting the output you need, the process doesn't matter. It's an interesting analogy but only works if the inspector is another expert dev.
When I have such a moment and I take a step back, there’s usually a strong hint that there’s a meta problem behind those instances. And while you have to choose when to take the time to solve such a problem, it’s usually worth it.
I would agree with the utility of Claude and Claude Code. Claude feels like your own executive assistant, sales team and IT department. Combine that with Claude Code and you can build some incredible things. Myself as an example: I used Claude to advise me on starting a business and building an MVP. After a few weeks of refinement I was able to create something I never could have done without Claude. It is a game changer for sure.
Several of my friends who don't know any programming are creating video games and music software with AI agents.
Much of what they are doing is incomprehensible to me. I often find that being a programmer is actually holding me back in this regard, because I feel the need to understand everything the code is doing, as well as the specialized knowledge (e.g. the math involved in audio processing and sound effects). Whereas my friends can just say... yeah add a phaser effect to the synth and it just does it.
Have they shipped anything that people are using? The concerns are different and creating something usable by people is why software engineering exists.
I think that’s where software engineering is not quite getting what’s happening right now. People keep asking: where are the apps? Where is all this great code? And the answer is becoming that people aren’t building apps to sell to other people. They’re building the apps that they themselves want to use. I’ve made dozens of apps that I have no interest in distributing or using outside of friends and family. The AI coding revolution is already here and it’s not in production software so much as it is in bespoke small group applications.
I'm an early-mid career SDET/SRE. Currently building a full web app with REST API integrations; it looks great and works great. Lots of functionality, useful and being hardened, all b/c of AI. It's going to be live with customers soon, I hope.
AI is not a feature of the product. GTM will be interesting, have some good ideas.
It's really up to you to be clever. I've never used Js/Ts/node/these apis etc. I started programming as a non-cs engineer to automate stuff and then got into SWE. This is truly an amazing time.
If it was any good at sales I'm pretty sure a company I did a contract for would be thriving by now. Instead they have a product that is ~500 times faster than the competition, with better UX for the most common activity in that field and much better built-in analysis tools for end results, run in real-time (which competitor software cannot do). Sure, it's not a massive market in terms of demographics, but I'd expect real sales people to succeed with what they have. Something very real has gone wrong with sales and it's not something they've been able to solve using LLMs.
I know this company uses LLMs, because I'm working on another project for them where one of the co-founders is relentlessly spamming the repo with overwrought Claude Code output like there is no tomorrow. This shit sucks at code generation and it most likely sucks at everything else too, except people often assume it's better at things they don't know about.
> (The) Output was coherent but its ‘style’ was very boring and overtly inoffensive, which was (and still is) a clear limitation of the technology.
The style isn’t a limit of the technology, it’s a limit of the lobotomized models from OpenAI and Anthropic. The open source community has lots of models that are great at creative writing.
Generating AI Content sucks, Consuming AI Content sucks, but combine them in the same loop and it's really addicting. AI Content Prosuming rocks.
Since LLMs, if I see a video I think is interesting, I take the transcript, feed it into an LLM, I summarize it and ask it a couple of questions. I've turned 12 minute videos back into the 5 phrases news it was based on. I suppose that when you're the one generating the request, it feels more personal. It is also very interesting that most LLMs respond like a normal person when you talk to them directly, but suddenly adopt the more annoying blogger speech patterns when you tell them 'create content'.
> I've turned 12 minute videos back into the 5 phrases news it was based on.
Why not read the original news?
Okay, there are many reasons why you might not want to do that, such as ads, tracking, having to pay for a subscription if you only want one article, and just plain boredom. I wasn't trying to call you out, it was more of a question for society at large.
Why has it become more appealing to have a "content creator" turn 5 phrases of news into a 12 minute video and then have an LLM convert it back, rather than reading the 5 phrases?
It's not that it's appealing. For example, I wanted to learn how to bend notes on harmonica, but it wasn't working. That's not something you can really understand without video, yet most tutorials are 5-15 minutes long and only show the actual technique at some random point in ~30 seconds (just search 'how to bend on harmonica' and see). So I take the transcript to check whether it's a method I've already tried or something new worth watching, and I also get an extra explainer of the technique in text.
Also, with videos like "what X said about situation Y in discourse Z". Sometimes you're just curious, and you can't realistically extract that efficiently from a full one-hour speech on a geolocked, untranscribed mass-media website, so it's easier to summarize the transcript of the 12 min video directly.
As for why everything is 12 minutes long, it's most likely because content creation isn't optimized to teach you anything or be useful, it's optimized to maximize watch time so platforms can serve more ads to you. The pattern is: I got you intrigued in something; you want the answer? pay me your time.
the 'claude creep' framing is real but there's a flip side worth naming. the synchronous pair programming case - AI as a faster version of you - is what most productivity debates focus on. the more interesting shift is genuinely async delegation: you define the task, set acceptance criteria, kick it off, come back to results. that's a different relationship to the tool entirely, and it forces you to get better at specifying what 'done' looks like upfront.
that's actually a good forcing function. most productivity loss from AI-assisted work comes from underspecifying the task, not from the AI being bad.
This is a sound personal assessment.
The section about being "glazed" into action resonates. Hidden within this concept I think is something profound about human motivation, innuendo and all.
> AI generated prose is at best boring, and at worst genuinely unappealing. I’m continually tempted, because in theory it should work well. The AI has perfect spelling and grammar, has more than enough context to produce article-length content, and can do in seconds what takes me hours.
I have a thesis in mind...that there is something fundamental to the human spirit that relishes a sort of friction that LLMs cannot observe or reproduce on their own.
> that there is something fundamental to the human spirit that relishes a sort of friction that LLMs cannot observe or reproduce on their own.
> I remember the first time I vibe-coded a small project. It was an app that generated placeholder cards for my MTG collection. I prompted the bot (now Claude, not ChatGPT).....
I would be interested to know what date this was. If it was recent, I'm surprised that Claude didn't one-shot this.
A solution looking for a problem
Marketers present a list of potential problems
The smallest success stories are marketed as indicators of future success, but to verify this, one must wait patiently for the future to arrive
The Gartner hype cycle has 5 phases: tech trigger (6 months - 2 years), peak of inflated expectations (6 months - 2 years), the slope of enlightenment (2 - 5 years), the plateau of productivity (5+ years), and the slope of decline (obsolescence, which no one talks about). If we are in fact at the 40th month then we are either approaching the peak of inflated expectations, the slope of enlightenment, or the plateau of productivity. I would say we are probably approaching the peak of inflated expectations. We are constantly hearing the symptoms of the 'This Time is Different' Syndrome from people saying the old rules don’t apply, which is the classic sign the peak is approaching. The average financial bubble bursts after 3 years; however, the dot-com bubble burst 5 years after peak and the housing bubble took 3-4 years. We are probably in the “bubble mania” phase right now because of all the irrational exuberance. Ride the Lightning!
A big part of the benefit of AI has nothing to do with AI and everything to do with leading pointy-haired bosses around. They won't approve needed refactorings, but promise to integrate AI and suddenly budget is no problem. Just add an easily removable chatbot afterward and you're golden.
I think we'll find that for most AI stuff.
The stupid thing is that instead of using AI to give ourselves 1 hour work days, we’re just cramming more work into the same amount of time we’ve always worked.
Yeah, I think this is always the case. We get more powerful tools that do the same thing in a fifth of the time, and now we are asked to do 5x more. Capitalism.
*LLM
Do you regularly find text content that you know is AI written (but is not marked as such)? Because honestly I don't, and it must exist in decent quantity by now. Or perhaps it's still sparse?
Have a look here [1] and here [2] - I think they are good resources, but fallible in the long run. I think yes, I do, often confirmed by communication with people I know (i.e. I suspect they have used AI to make something -> I ask). This falls victim to confirmation bias, though. I suspect a nontrivial amount of writing I read is AI generated without me realising, and I'm wary also of falsely flagging as AI-generated content that is actually from humans.
[1] https://en.wikipedia.org/wiki/Wikipedia%3AAI_or_not_quiz [2] https://en.wikipedia.org/wiki/Wikipedia%3ASigns_of_AI_writin...
Okay, but the answers in [1] look something like:
AI generated. Some of the clues include:
- Most obviously, a failed ISBN checksum
- Other source-to-text integrity issues; for example, the WWF source says very little about Malaysia specifically, only mentions Sunda tigers (Panthera tigris sondaica), and does not mention tapirs at all
- Very short yet consistent paragraph length
- Generic "see also" links, one of which is redlinked
This is not the sort of thing that I pay attention to unless I'm doing detailed research. And even then I'd probably have a bot check these for me, ironically, since it's such a mechanical job. At the very least detecting AI like this requires conscious effort.
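For what it's worth, the ISBN clue really is mechanical. Below is a minimal Python sketch of the standard ISBN-10 and ISBN-13 check-digit rules. The function name `isbn_checksum_ok` is made up for illustration, and this is not what the Wikipedia quiz itself uses, just the arithmetic a bot (or a bored human) could run.

```python
DIGITS = set("0123456789")


def isbn_checksum_ok(isbn: str) -> bool:
    """Return True if the string is an ISBN-10 or ISBN-13 with a valid check digit."""
    # Hyphens and spaces are only separators, so drop them first.
    chars = [c for c in isbn.upper() if c not in "- "]

    if len(chars) == 13 and all(c in DIGITS for c in chars):
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; the total must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0

    if (len(chars) == 10
            and all(c in DIGITS for c in chars[:9])
            and (chars[9] in DIGITS or chars[9] == "X")):
        # ISBN-10: weights run 10 down to 1; a final "X" stands for the value 10.
        values = [int(c) for c in chars[:9]]
        values.append(10 if chars[9] == "X" else int(chars[9]))
        total = sum((10 - i) * v for i, v in enumerate(values))
        return total % 11 == 0

    # Wrong length or stray characters: not a well-formed ISBN at all.
    return False


print(isbn_checksum_ok("978-0-306-40615-7"))
print(isbn_checksum_ok("978-0-306-40615-8"))
```

A passing checksum only shows the number is well-formed, not that the book exists or says what the citation claims, so the source-to-text clues still need a human (or at least a lookup).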
Ok, but like, what about [2]?
I can easily tell AI writing. I'm sure plenty goes under the radar, but I can still catch a lot.
I think the second resource that you linked to is valuable. The first is useless unless you're a Wikipedia editor, the significance of verifying citations notwithstanding.
The gap between LLM-generated writing and the composite style of the average Wikipedia page is more narrow than most people may believe.
Yes, here, reddit, X, at work in people's emails and status reports.
You will start to recognize it over time. The major AI models each have their own voice and patterns that they overuse.
The more you see those patterns the more you start recognizing them. By now I can recognize quickly if a blog post or README.md was generated by Claude or ChatGPT because the signs are so obvious.
Even Hacker News comments that are AI written are easy to spot if they weren't edited. I know I'm not alone because when I recognize an AI comment I check their comment history and find other people calling out their AI-generated submissions, too.
Learning how to recognize the output of the popular AI models is becoming a critical business skill, too. You need to be able to separate out the content from someone who was doing real work that you should take seriously as opposed to the output of someone who is having ChatGPT produce volumes of text that they don't review. The people who do that will waste your time.
I don't see how to interpret your claims. How do you yourself know that you're right when you "recognize" Claude or ChatGPT? How do you know how much of the text you don't recognize as any LLM is actually LLM-generated? My recollection is whenever I've seen data on this--the educators who think they can spot students cheating--the conclusion is people are really bad at identifying LLM-generated content.
It’s very obvious if you leave the default tone. If you specifically ask it to hide its ai voice and make it appear human, it does a really good job. Even better if you give it an example of the writing style.
Ask it to write in the style of patio11 or someone else with a distinctive tone, and it will do a remarkable job.
It will pass pretty consistently. Not sure I love it.
This is a temporary problem. Look at how fast things are progressing. Things will improve until none of this matters because the output is indistinguishable.
I wish I could be this confident about the future.
Yes, often, and often here on HN or Substack if I point it out, it doesn't lead to anything good. Many don't recognize it, many do, the author gets defensive etc.
This article doesn't have the tells, it looks human written.
Literally every day from green accounts on Hacker News, and in many, many TFAs.
My comment history is months of me pointing it out about articles here. You're just not noticing it, it's everywhere and is extremely obvious to me.
It's possible I should envy you, I'm not sure.
I see it all the time in basically every form of text communication. What makes you think you are not seeing it?
I found that many people don't have a radar for this. They may know about delve, emdashes, tapestry, multifaceted or "not just X but y" and if these are not there they don't see it.
There's at least two comments in this submission from green accounts if you enable showdead.
All the time, especially on LinkedIn.
Yes, all the time.
HN and YouTube are the worst offenders for me.
I'm pretty sure this was written or heavily edited by an LLM.
https://www.seriouseats.com/eggplant-grilling-tips-11759622
Bro but... you now are having a business is planned by a paid chatbot, they can shutdown anytime or make it more expensive. Also, it is impossible to get something new: you are copying from somewhere else, and maybe what Claude is copying has a copyright on it, like leaked code, etc. Also, your brain will slowly shut down from thinking about 'business', so you will rely heavily on Claude in the future :)
My friend is trying to do the same, the Docker stack he made for his SaaS is really amazing, it is following the standards from the ancient age.
> you now are having a business is planned by a paid chatbot, they can shutdown anytime or make it more expensive
Local models are about 25 months behind the current SOTA. If that holds, businesses won't need the paid models for many things.
I suspect you'll (a small-medium business) be able to buy a Claude 4.6-class rack mount device for $6000 by 2030 that does 100 t/s with 1 million token context, which honestly, is probably adequate for an office (front office, back office, executive tier etc) of 10-300 unless you've got more than 4 engineers on staff. That kind of offline device is going to push everyone to provide that kind of cloud-enabled baseline service at very low cost. The Qwen 3.5 series is already showing you can almost (but not quite) squeeze that kind of performance out of consumer hardware. 256/512gb consumer video cards will get us there, eventually, if capacity ever catches up with demand.
> 40 months
Not counting from 1971's DARPA? Sorry, I'm allergic to LLMs being called AI like nothing existed before them.
Could the "LLM" of 1971 DARPA produce working code that it translated from a legacy codebase to Java and this within a short timeframe? ;-)
Doesn’t it all look like child’s play though?
You would need to go back to McCulloch, Pitts and von Neumann in the 1940s if you wanted to talk about where it really got started.
The 1970s were a long, dark AI winter, thanks to FUD spread by Minsky and Papert. A lot of recent work could have been done back then despite the lack of good hardware, as seen in the other HN story where the guy trained a transformer on a PDP-11. But the whole field was radioactive after their book came out.