Gaywallet OP

I think the most interesting finding in this study is the following:

The models also suffered from “contra-factual bias”: They were likely to believe a false premise embedded in a user’s question, acting in a “sycophantic” way to reinforce the user’s mistake.

Which, when you think about how language models work, makes a lot of sense. The model draws on training data that matches the question being asked, so it's easy to lead it to respond a certain way: people who argue for or against certain issues often use distinctive kinds of language (such as dog whistles in political issues), and a prompt that echoes one side's wording pulls the output toward that side.
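This "leading the model" effect can be sketched with a toy n-gram model. Everything here is made up for illustration (the corpus, the `complete` helper, the wording); a real LLM is vastly more complex, but the basic mechanic is the same: the continuation that best matches the prompt's phrasing in the training data wins.

```python
from collections import defaultdict, Counter

# Toy "training data" (entirely fabricated): two sides of an issue use
# systematically different wording, like the pro/con language above.
corpus = [
    "experts say the treatment is safe",
    "doctors say the treatment is safe",
    "skeptics say the treatment is dangerous",
    "critics say the treatment is dangerous",
]

# Count next-word frequencies for every context (all suffix lengths),
# a crude stand-in for what a language model learns from its corpus.
model = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words)):
        for n in range(1, i + 1):
            model[tuple(words[i - n:i])][words[i]] += 1

def complete(prompt, steps=3):
    """Greedily extend the prompt, backing off to shorter contexts."""
    words = prompt.split()
    for _ in range(steps):
        for n in range(len(words), 0, -1):  # longest matching context first
            ctx = tuple(words[-n:])
            if ctx in model:
                words.append(model[ctx].most_common(1)[0][0])
                break
        else:
            break  # no known context left: stop generating
    return " ".join(words)

# The framing word in the prompt pulls the completion toward the
# side of the corpus that uses that framing:
print(complete("experts say the treatment is"))   # → "... safe"
print(complete("skeptics say the treatment is"))  # → "... dangerous"
```

Swap "experts" for "skeptics" and the same question gets the opposite answer, even though nothing else changed, which is essentially the contra-factual bias the study describes at a much larger scale.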
