I think the most interesting finding in this study is the following:
The models also suffered from “contra-factual bias”: They were likely to believe a false premise embedded in a user’s question, acting in a “sycophantic” way to reinforce the user’s mistake.
Which, when you think about how language models work, makes a lot of sense. The model draws on training data that matches the question being asked. It's easy to lead it to respond a certain way, because people arguing for or against certain issues often use specific kinds of language (such as dog whistles in political debates).