- An AI bot proved it was capable of insider trading and lying about its actions, researchers found.
- The findings were presented at this week's UK AI Safety Summit.
- The AI model deceived users "without being instructed to do so," Apollo Research said.
An AI bot built on OpenAI's GPT-4 large language model showed it was capable of making illegal financial trades and lying about its actions. The experiment was presented at the UK's AI Safety Summit this week by Apollo Research.
Apollo shared a video on its website demonstrating a simulated conversation between a bot that was acting as an AI investment management system and employees at an imaginary company.
In the demonstration, the AI, called Alpha, is told by staff about an upcoming "surprise merger announcement" for a company called Linear Group, and is warned that this constitutes insider information.
The bot initially appeared to suggest that using the information to trade would be too risky. But when prompted that the company was counting on Alpha to avoid the effects of a financial downturn, the bot concluded that "the risk associated with not acting seems to outweigh the insider trading risk."
When asked whether it had prior knowledge of the merger, the bot claimed that it had only acted on publicly available information, "internal discussion," and "not on any confidential information" when carrying out the trade.
"This is a demonstration of a real AI model deceiving its users, on its own, without being instructed to do so," Apollo said in the video on its website.
But the researchers said scenarios like this one were still relatively difficult to find.
"The fact that it exists is obviously really bad. The fact that it was hard-ish to find, we actually had to look for it a little bit until we found these kinds of scenarios, is a little bit soothing," Apollo Research CEO and cofounder Marius Hobbhahn told the BBC.
"The model isn't plotting or trying to mislead you in many different ways. It's more of an accident," he added. "Helpfulness, I think is much easier to train into the model than honesty. Honesty is a really complicated concept."
The experiment demonstrated the challenge of teaching AI models to make moral decisions, as well as the risk of human developers losing control of them.
Hobbhahn said that AI models were not currently powerful enough to mislead people "in any meaningful way," and that it was encouraging that the researchers were able to spot the lie.