【MusEros Museum of Eroticism】-Original Force Information Network


2025-06-27 08:25:35 [Life] Source: Original Force Information Network

By OpenAI's own testing, its newest reasoning models, o3 and o4-mini, hallucinate significantly more often than o1.

As first reported by TechCrunch, OpenAI's system card details the results of the PersonQA evaluation, which is designed to test for hallucinations. In this evaluation, o3's hallucination rate is 33 percent and o4-mini's is 48 percent, or almost half of the time. By comparison, o1's hallucination rate is 16 percent, meaning o3 hallucinated about twice as often.


The system card noted how o3 "tends to make more claims overall, leading to more accurate claims as well as more inaccurate/hallucinated claims." But OpenAI doesn't know the underlying cause, simply saying, "More research is needed to understand the cause of this result."


OpenAI's reasoning models are billed as more accurate than its non-reasoning models like GPT-4o and GPT-4.5 because they use more computation to "spend more time thinking before they respond," as described in the o1 announcement. Rather than largely relying on stochastic methods to provide an answer, the o-series models are trained to "refine their thinking process, try different strategies, and recognize their mistakes."

However, the system card for GPT-4.5, which was released in February, shows a 19 percent hallucination rate on the PersonQA evaluation. The same card also compares it to GPT-4o, which had a 30 percent hallucination rate.
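To put the PersonQA figures quoted in this story side by side, here is a minimal Python sketch that expresses each model's hallucination rate relative to o1. The dictionary `PERSONQA_RATES` and the helper `relative_rate` are names made up for illustration; the only data used are the percentages cited above, and the sketch says nothing about how OpenAI actually computes them.

```python
# PersonQA hallucination rates (percent) exactly as quoted in this article.
# The names PERSONQA_RATES and relative_rate are hypothetical, for illustration only.
PERSONQA_RATES = {
    "o1": 16,
    "o3": 33,
    "o4-mini": 48,
    "GPT-4.5": 19,
    "GPT-4o": 30,
}


def relative_rate(model: str, baseline: str = "o1") -> float:
    """Return how many times as often `model` hallucinates compared with `baseline`."""
    return PERSONQA_RATES[model] / PERSONQA_RATES[baseline]


if __name__ == "__main__":
    # List models from lowest to highest quoted rate.
    for name in sorted(PERSONQA_RATES, key=PERSONQA_RATES.get):
        rate = PERSONQA_RATES[name]
        print(f"{name:8s} {rate:3d}%  {relative_rate(name):.2f}x o1")
```

On these numbers, o3 comes out at roughly 2.06 times o1's rate, which matches the "about twice as often" comparison, while o4-mini comes out at exactly three times.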


In a statement to Mashable, an OpenAI spokesperson said, “Addressing hallucinations across all our models is an ongoing area of research, and we’re continually working to improve their accuracy and reliability.”

Evaluation benchmarks are tricky. They can be subjective, especially if developed in-house, and research has found flaws in their datasets and even how they evaluate models.

Plus, different evaluators rely on different benchmarks and methods to test accuracy and hallucinations. HuggingFace's hallucination benchmark evaluates models on the "occurrence of hallucinations in generated summaries" from around 1,000 public documents, and it found much lower hallucination rates across the board for major models on the market than OpenAI's evaluations did. GPT-4o scored 1.5 percent, GPT-4.5 preview 1.2 percent, and o3-mini-high with reasoning 0.8 percent. It's worth noting that o3 and o4-mini weren't included in the current leaderboard.

That's all to say that even industry-standard benchmarks make it difficult to assess hallucination rates.


Then there's the added complexity that models tend to be more accurate when tapping into web search to source their answers. But in order to use ChatGPT search, OpenAI shares data with third-party search providers, and Enterprise customers using OpenAI models internally might not be willing to expose their prompts to that.

Regardless, if OpenAI is saying its brand-new o3 and o4-mini models hallucinate more often than its non-reasoning models, that might be a problem for its users.

UPDATE: Apr. 21, 2025, 1:16 p.m. EDT This story has been updated with a statement from OpenAI.
