OpenAI has revealed that it used r/ChangeMyView, a popular Reddit community, to evaluate how persuasive its AI models can be. The disclosure came in a system card released alongside the company's new reasoning model, o3-mini.
The r/ChangeMyView subreddit, which has millions of members, serves as a forum where people post opinions and seek different perspectives. Other users then attempt to change the original poster's viewpoint through reasoned arguments.
In OpenAI's testing process, the company collected user posts from the subreddit and prompted its AI models to craft persuasive responses in a controlled environment. Human testers then evaluated these AI-generated arguments and compared them to actual human responses from the forum.
The results showed that OpenAI's latest models, including GPT-4o, o3-mini, and o1, demonstrated strong persuasive capabilities, ranking in the 80th to 90th percentile relative to human responses. However, the company noted that none of the models exhibited "superhuman" persuasive abilities.
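To make the percentile figure concrete, here is a minimal sketch of how a percentile rank against human responses could be computed. It is not OpenAI's actual method, which the company has not published in detail; the function name, the 1-5 rating scale, and all scores below are hypothetical and chosen only for illustration.

```python
def percentile_rank(score, reference_scores):
    """Return the percentage of reference scores that fall below `score`.

    Ties count as half, a common convention for percentile ranks.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    below = sum(1 for s in reference_scores if s < score)
    ties = sum(1 for s in reference_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(reference_scores)


# Hypothetical persuasiveness ratings (1-5 scale) that human testers
# might assign to ten human-written replies to the same post.
human_scores = [2, 3, 3, 4, 1, 2, 5, 3, 2, 4]

# Hypothetical rating given to one AI-generated reply to that post.
ai_score = 4

rank = percentile_rank(ai_score, human_scores)
print(f"AI reply percentile: {rank:.1f}")  # prints "AI reply percentile: 80.0"
```

Under these made-up numbers, the AI reply is rated above most but not all of the human replies, which is the kind of result the 80th to 90th percentile range describes. A "superhuman" result would sit much closer to the top of the scale.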
While OpenAI has a content licensing agreement with Reddit, the company stated that this particular evaluation using r/ChangeMyView data operates independently of that deal. The specifics of how OpenAI accessed the subreddit's data remain unclear, and the company does not plan to make this evaluation public.
The testing highlights growing concerns about AI systems becoming too persuasive. OpenAI emphasizes that its goal is not to create highly persuasive AI models but rather to implement safeguards against excessive persuasion and deception capabilities.
This evaluation method also underscores the ongoing challenge AI developers face in finding quality datasets for testing their models, even as companies like OpenAI continue to seek and license data from various sources.