“If man chooses oblivion, he can go right on leaving his fate to his political leaders. If he chooses Utopia, he must initiate an enormous education program - immediately, if not sooner.”
-R B Fuller

  • 0 Posts
  • 8 Comments
Joined 5 months ago
Cake day: August 25th, 2025

  • Is your app as efficient as what an experienced developer would create?

    One of the earliest uses we had for LLMs was literally just asking it to optimize several large codebases. Lots of pointless changes suggested; several huge performance wins we had overlooked.

    And all done – implemented, tested, and human-reviewed – in about a person-week, compared to at least half a dozen person-months to go through all that by hand.

    I mean, sometimes the LLMs generate slow algos. But less often than human coders.

    If you released the source code, would it have security vulnerabilities?

    You’re not gonna believe this, but another of the first things we did was ask the LLMs to review the codebase for security issues (and review any new PRs).

    OFC the code also gets reviewed for security vulns like it always has, by old-school automation (e.g. valgrind, fortify, yadda), human review, and red-teaming exercises. I don’t think I’ve seen enough data yet to say whether LLM-generated code has more or worse security issues than human-generated code (which, need I remind you, is often highly insecure).

    These are just a couple of the more hidden issues that fly under the radar when shipping LLM-generated code.

    Ummm… those would be issues if you didn’t use good orchestration, didn’t have good tools and docs for the LLMs to use, didn’t follow good software engineering practices to begin with…



  • There’s no such thing as “agents”

    Up until ~6 months ago I would have agreed with you, and elaborated that “Agents are just LLMs in a loop with a text file scratchpad”

    That’s… still true in a way, but honestly so many people have put so much cleverness into managing that process, that I have to say, yes, Cline or Codex with GPT or Claude Code behind them are absolutely “agentic”.

    I can point them to a problem report and our company documentation and… an ever-increasing percentage of the time, I wind up with a problem description, a patch that fixes it, unit, coverage, and stress tests, and (if relevant) updated docs.
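
For readers who want the "LLMs in a loop with a text file scratchpad" description made concrete, here is a minimal sketch. It is not how Cline, Codex, or Claude Code actually work; `run_agent`, `fake_model`, and the `PLAN:`/`DONE:` markers are all hypothetical names invented for illustration, and the model is a stub rather than a real LLM call.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_agent(task: str, model: Callable[[str], str],
              scratchpad: Path, max_steps: int = 5) -> str:
    """Call `model` in a loop, feeding it everything written to the scratchpad so far."""
    scratchpad.write_text(f"TASK: {task}\n")
    for _ in range(max_steps):
        notes = scratchpad.read_text()       # the model's only "memory"
        reply = model(notes)                 # one model call per loop iteration
        with scratchpad.open("a") as f:      # append the reply to the scratchpad
            f.write(reply + "\n")
        if reply.startswith("DONE:"):        # hypothetical stop marker
            return reply[len("DONE:"):].strip()
    return "gave up"


def fake_model(notes: str) -> str:
    """Hypothetical stand-in for a real LLM: plans once, then declares it is done."""
    if "PLAN:" not in notes:
        return "PLAN: read the report, then write a patch"
    return "DONE: patch written"


with tempfile.TemporaryDirectory() as tmp:
    pad = Path(tmp) / "scratchpad.txt"
    result = run_agent("fix the bug", fake_model, pad)
    log = pad.read_text()
```

The "cleverness" the comment mentions would live in everything this sketch leaves out: tool calls, context management, retries, and checking the model's output before trusting it.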


  • This reminds me so much of the late 80s, when everyone was installing PCs in their offices and everyone was asking if this is actually better than a typewriter and Rolodex, because people spend all day “futzing” on the computer.

    5 years later we had networking, emerging interoperability standards, office productivity suites. 10 years later there was basically no company left that didn’t have PCs and much better productivity.

    I see the same thing playing out here. A year ago we had Copilot and it sucked; I didn’t see the utility. But now coding agents with skills can easily read and understand specs, create test suites, etc. These are right now revolutionizing my team’s work.

    You see this pattern over & over with AI capability on a given task: It’s pathetic at 5%, then it merely sucks at 40%, then it takes a lot of futzing to fix up at 70%, then suddenly it’s at 95% and does as well as most professionals.

    Downvote me to hell, this is my honest assessment.