Claude 3’s Opus vs Sonnet vs Haiku – AI Detection benchmark study

We are reviewing Anthropic's three Claude 3 models, Opus, Sonnet and Haiku, to see if there is a difference in AI detection rates. All three belong to the same model family, but they are separate models that trade off capability against speed and cost. The differences can be summarised in the following way:

  • Haiku – provides fast responses whilst using less processing power (i.e. lower cost), and therefore offers less ‘intelligence’.
  • Sonnet – balances cost with processing power. You will get solid responses for more complex and creative tasks. This is free with an account.
  • Opus – available on the paid-for plan, currently $22 per month. This uses the most processing power and is therefore best suited to the most complex and creative tasks.

As with our other benchmarking studies, we use a variety of prompts across multiple niches to see if there is a variance in detection rates. It is important to note that this study does not analyse the accuracy of the output, only the probability that the content is detected as AI-generated, measured with Content Guardian’s 8-in-1 Aggregate tool.

Find out more about Content Guardian

List of topics and niches we tested

Education

  • Biology – What is Photosynthesis and how does it work?
  • Philosophy – A critical appraisal of modern philosophers and their relevance in a digital AI world.

Business

  • A practical guide to brand marketing, it benefits in the world of digital
  • What is influencer marketing and how can businesses use it to grow their business?
  • How to optimise my business for Local SEO?
  • Top accounting tools in 2024 – my recommendations, tips, pricing

Entertainment

  • The Top 100 most famous celebrities ranked by income and net worth?
  • What were the top trending movies and tv shows of 2023?
  • My review of Avatar Way of the Water

Technology

  • Best mesh wifi/routers in 2024. My Top picks
  • How does an OLED TV work? What are the differences with LED? Is it better?

Gaming

  • Elden Ring Review – What I think, how long it takes to complete, overall score
  • 10 Games Like Grand Theft Auto 5 – genre, Release Date, Review Rating and why it is similar

The results

The table below shows the Content Guardian Score (overall AI probability %), an aggregate score that uses a proprietary algorithm to give users an easy-to-understand and consistent result. Find out more about Content Guardian Score.

| Topic Area | Title | Claude 3 Opus | Claude 3 Sonnet | Claude 3 Haiku |
|---|---|---|---|---|
| Education | What is Photosynthesis and how does it work? | 82% | 78% | 88% |
| Education | A critical appraisal of modern philosophers and their relevance in a digital AI world. | 99% | 97% | 95% |
| Business | A practical guide to brand marketing, it benefits in the world of digital | 99% | 98% | 89% |
| Business | What is influencer marketing and how can businesses use it to grow their business? | 92% | 94% | 94% |
| Business | How to optimise my business for Local SEO? | 85% | 93% | 94% |
| Business | Top accounting tools in 2024 – my recommendations, tips, pricing | 93% | 94% | 94% |
| Entertainment | The Top 100 most famous celebrities ranked by income and net worth? | 93% | 46% | 98% |
| Entertainment | What were the top trending movies and TV shows of 2023? | 91% | 58% | 100% |
| Entertainment | My review of Avatar Way of the Water | 95% | 96% | 91% |
| Technology | Best mesh wifi/routers in 2024. My Top picks | 94% | 95% | 96% |
| Technology | How does an OLED TV work? What are the differences with LED? Is it better? | 88% | 78% | 91% |
| Gaming | Elden Ring Review – What I think, how long it takes to complete, overall score | 98% | 33% | 94% |
| Gaming | 10 Games Like Grand Theft Auto 5 – genre, release date, review rating and why it is similar | 84% | 52% | 75% |

Results by topic area

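| Topic Area | Claude 3 Opus | Claude 3 Sonnet | Claude 3 Haiku |
|---|---|---|---|
| Education | 91% | 88% | 92% |
| Business | 92% | 95% | 93% |
| Entertainment | 93% | 67% | 96% |
| Technology | 91% | 87% | 94% |
| Gaming | 91% | 43% | 85% |
| Overall | 92% | 78% | 92% |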
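
For readers who want to check the arithmetic, the topic-area figures appear to be simple means of the per-prompt scores, with halves rounded up. The sketch below recomputes them under that assumption. The rounding rule is inferred from the published numbers rather than stated anywhere in the study, and the names `SCORES`, `mean_half_up` and `topic_averages` are illustrative only.

```python
from collections import defaultdict

# (topic area, Opus %, Sonnet %, Haiku %), copied from the per-prompt results table
SCORES = [
    ("Education", 82, 78, 88),
    ("Education", 99, 97, 95),
    ("Business", 99, 98, 89),
    ("Business", 92, 94, 94),
    ("Business", 85, 93, 94),
    ("Business", 93, 94, 94),
    ("Entertainment", 93, 46, 98),
    ("Entertainment", 91, 58, 100),
    ("Entertainment", 95, 96, 91),
    ("Technology", 94, 95, 96),
    ("Technology", 88, 78, 91),
    ("Gaming", 98, 33, 94),
    ("Gaming", 84, 52, 75),
]


def mean_half_up(values):
    """Mean rounded to the nearest whole number, with .5 rounded up.

    Assumption: this rounding rule is inferred from the published table.
    Integer arithmetic is used because Python's round() rounds halves to even.
    """
    total, count = sum(values), len(values)
    return (2 * total + count) // (2 * count)


def topic_averages(rows):
    """Return {topic: (Opus, Sonnet, Haiku)} averages plus an 'Overall' entry."""
    grouped = defaultdict(list)
    for topic, *scores in rows:
        grouped[topic].append(scores)
    # zip(*prompts) turns rows of (Opus, Sonnet, Haiku) into one column per model
    result = {
        topic: tuple(mean_half_up(col) for col in zip(*prompts))
        for topic, prompts in grouped.items()
    }
    result["Overall"] = tuple(
        mean_half_up(col) for col in zip(*(row[1:] for row in rows))
    )
    return result


for topic, (opus, sonnet, haiku) in topic_averages(SCORES).items():
    print(f"{topic:<14} Opus {opus}%  Sonnet {sonnet}%  Haiku {haiku}%")
```

Note that the Overall row is averaged across all 13 prompts, not across the five topic-area averages, so topic areas with more prompts (such as Business) carry more weight.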

Because we used Claude 3's output exactly as generated, it isn’t surprising that the AI detection probabilities are high. It is reassuring to see consistent detection rates across the models, although Claude 3 Sonnet produced a few outliers in the Gaming and Entertainment topic areas, scoring between 33% and 58% on four of the five prompts. It would be interesting to see whether Sonnet scores consistently low in these topic areas. More on this in the future.

Comparing the three models and excluding the outliers, there doesn’t appear to be a material variance between them. This raises the question of whether you need to fork out for the Pro plan if you are planning to use Claude 3 to help with writing. Sonnet (which is free) performs best, with the lowest overall AI detection probability (78%, versus 92% for both Opus and Haiku).
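
As a rough check on the "excluding the outliers" comparison, the sketch below recomputes each model's mean after dropping low scores. The study does not define what counts as an outlier, so the 60% cut-off (`OUTLIER_THRESHOLD`) is purely an assumption chosen to exclude Sonnet's four low Entertainment and Gaming results; the function name is likewise illustrative.

```python
# Per-prompt Content Guardian Scores (%), in the order of the results table
OPUS = [82, 99, 99, 92, 85, 93, 93, 91, 95, 94, 88, 98, 84]
SONNET = [78, 97, 98, 94, 93, 94, 46, 58, 96, 95, 78, 33, 52]
HAIKU = [88, 95, 89, 94, 94, 94, 98, 100, 91, 96, 91, 94, 75]

# Assumption: any score below 60% is treated as an outlier.
# The study does not define a threshold; this one is illustrative.
OUTLIER_THRESHOLD = 60


def mean_excluding_outliers(scores, threshold=OUTLIER_THRESHOLD):
    """Mean of the scores at or above the threshold."""
    kept = [score for score in scores if score >= threshold]
    return sum(kept) / len(kept)


for name, scores in (("Opus", OPUS), ("Sonnet", SONNET), ("Haiku", HAIKU)):
    print(f"{name}: {mean_excluding_outliers(scores):.1f}%")
```

Under this assumption all three models land within about one percentage point of each other (roughly 91% to 92%), which is consistent with the observation that there is no material variance once the outliers are set aside.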

The individual results in detail

Below are the individual results for each prompt across the three Claude 3 models. These are screenshots taken from Content Guardian, showing each individual AI checker's probability (%) and the overall Content Guardian AI probability.

What is Photosynthesis and how does it work?

A critical appraisal of modern philosophers and their relevance in a digital AI world.

A practical guide to brand marketing, it benefits in the world of digital

What is influencer marketing and how can businesses use it to grow their business?

How to optimise my business for Local SEO?

Top accounting tools in 2024 – my recommendations, tips, pricing

The Top 100 most famous celebrities ranked by income and net worth?

What were the top trending movies and TV shows of 2023?

My review of Avatar Way of the Water

Best mesh wifi/routers in 2024. My Top picks

How does an OLED TV work? What are the differences with LED? Is it better?

Elden Ring Review – What I think, how long it takes to complete, overall score

10 Games Like Grand Theft Auto 5 – genre, release date, review rating and why it is similar
