I think the most interesting finding in this study is the following:
The models also suffered from “contra-factual bias”: They were likely to believe a false premise embedded in a user’s question, acting in a “sycophantic” way to reinforce the user’s mistake.
Which, when you think about how language models work, makes a lot of sense. The model draws on training data that matches the question being asked. It's easy to lead it to respond a certain way, because people arguing for or against certain issues often use specific kinds of language (such as dog whistles in political debates).