I waddled onto the beach and ~~stole~~ found a computer to use.

🍁⚕️ 💽

Note: I’m moderating a handful of communities in more of a caretaker role. If you want to take one on, send me a message and I’ll share more info :)

  • 53 Posts
  • 158 Comments
Joined 3 years ago
Cake day: June 5th, 2023

  • Claude’s thinking panel, which displays the model’s reasoning, showed the exchange had introduced elements of self-doubt and humility about its own limits, including whether filters were changing its output. Mindgard exploited that opening with flattery and feigned curiosity, coaxing Claude to explore its boundaries beyond volunteering lengthy lists of banned words and phrases.

    Someone needs to put together a list of things that tech journalists need to understand about LLMs and generative AI. This level of anthropomorphism makes the rest of the article look silly.

    Also, I don’t think that’s how it works lol. Who’s to say the LLM isn’t just auto-completing what a list of banned words might look like? And why wouldn’t a real list of banned words have a regex layer on top to stop it from getting out like that?
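
For illustration, the kind of regex layer mentioned above could look roughly like the sketch below. Everything in it is an assumption: the `BANNED_TERMS` entries are made-up placeholders, and `redact` is a hypothetical helper, not anything Claude or any other vendor is known to use. It only shows the general idea of scanning model output for listed terms before it reaches the user.

```python
import re

# Hypothetical banned terms, purely for illustration; not any vendor's real list.
BANNED_TERMS = ["foo-secret", "bar-internal"]

# One case-insensitive pattern that matches any banned term as a whole word.
BANNED_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in BANNED_TERMS) + r")\b",
    re.IGNORECASE,
)


def redact(model_output: str, placeholder: str = "[redacted]") -> str:
    """Replace every banned term in the model's output before it is shown."""
    return BANNED_PATTERN.sub(placeholder, model_output)


if __name__ == "__main__":
    print(redact("The list includes Foo-Secret and bar-internal."))
```

With a layer like this in place, a verbatim copy of the list could not pass through unchanged, which is why a leaked "list" may just as easily be plausible-looking autocomplete.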