We ran fifty queries through Perplexity Pro, ChatGPT Search, and Google's AI Mode over two weeks in April 2026. Five categories, ten queries each: research questions, how-to questions, breaking news, coding questions, and shopping. The goal was not to decide which is "best" in the abstract. It was to figure out which one we would actually open when we had a real question. Spoiler: all three won at something, and one of them lost a category so badly we stopped using it there.

The TL;DR is boring and useful. Perplexity wins research and how-to. ChatGPT Search wins coding and conversational follow-ups, where the answer depends on context built up over a thread. Google AI Mode wins breaking news and shopping. Paying for all three runs about $60/month, which is silly for a single person, but a lot of teams should probably be doing it.

How we tested

Fifty queries, the exact same wording in each tool, with each query run in all three tools on the same day. We scored each response on four axes: factual accuracy (did it get things right), citation quality (were the sources real, relevant, and followable), speed (first token and total), and usefulness (could we get our answer from the response itself, or would we have to click through and read the sources ourselves).
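
For concreteness, here is a minimal sketch of what one of our per-response scoring records looks like as a data structure. The field names and the 0-3 scale are illustrative assumptions for the sketch, not the exact rubric.

    # Sketch of a per-response scoring record. The 0-3 scale and field
    # names are illustrative assumptions, not the exact rubric we used.
    from dataclasses import dataclass

    @dataclass
    class Score:
        tool: str              # "perplexity" | "chatgpt-search" | "google-ai-mode"
        query_id: int          # 1-50
        category: str          # research | how-to | news | coding | shopping
        factual_accuracy: int  # 0-3: did it get things right
        citation_quality: int  # 0-3: sources real, relevant, followable
        first_token_s: float   # seconds to first token
        total_s: float         # seconds to full response
        usefulness: int        # 0-3: answer usable without clicking through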

We ran queries on a freshly cleared browser profile with no personalization signal. We also deliberately re-ran five queries 24 hours later to check how much variance there was. It was a lot. More on that below.
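
The check itself is easy to reproduce. A rough sketch: fetch the same answer on both days and compare the texts. We judged "materially different" by hand; the similarity threshold below is an arbitrary stand-in for that judgment.

    # Rough sketch of the 24-hour variance check. We judged "materially
    # different" by hand; the 0.7 threshold is an arbitrary stand-in for
    # that judgment, not a cutoff we actually used.
    from difflib import SequenceMatcher

    def materially_different(answer_day1: str, answer_day2: str,
                             threshold: float = 0.7) -> bool:
        """Flag two answers whose character-level similarity ratio
        falls below the threshold."""
        ratio = SequenceMatcher(None, answer_day1, answer_day2).ratio()
        return ratio < threshold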

Research questions: Perplexity still wins

This is what Perplexity was built for, and it shows. On queries like "what are the main differences between the proposed EU AI Act amendments from Q1 2026 and the original text," Perplexity produced the best response 8 out of 10 times. The citations are inline, they are mostly from primary sources when primary sources exist, and the response structure (summary, then breakdown, then sources) is the right shape for this kind of question.

ChatGPT Search took a strong second place. Its answers were often slightly more conversational and less structured, which is sometimes what you want and sometimes not. Citation quality was comparable, though ChatGPT tended to pull in fewer sources (typically 4-6 versus Perplexity's 8-12).

Google AI Mode came in third. The answers were fine. The citation experience, where it pulls from a mix of indexed pages and surfaces them as small cards, is more awkward to actually read than inline numbered citations. It also hallucinated a source twice in ten queries, which is worse than either competitor.

How-to questions: Perplexity again, but barely

We ran queries like "how do I set up a WireGuard VPN between a Hetzner server and my home network with a dynamic IP." These are the queries where you used to open five Stack Exchange tabs. Perplexity nailed 7 of 10. ChatGPT Search got 6 of 10. Google got 6 of 10. The gap was narrow.

Where Perplexity pulled ahead was in including the "gotcha" considerations from real forum posts. Its sources skew more toward Reddit, Hacker News, and Stack Overflow, which is exactly where the real answers to this kind of question live. ChatGPT Search tends to pull more from blog posts and documentation, which are often out of date.

Breaking news: Google wins, and it is not close

We tested this category hardest because it is where "AI search" has historically been worst. Ten queries about events that had happened within the preceding 24 hours, submitted over the course of a week.

Google AI Mode was right 9 out of 10 times. ChatGPT Search was right 6 out of 10. Perplexity was right 5 out of 10. Twice, Perplexity confidently cited a news story that had been updated or retracted, using the older version. ChatGPT did this once. Google did not do it at all in our sample.

This should not be surprising. Google has the fastest, widest news crawl on earth, and AI Mode is reading directly from that. For anything time-sensitive, open Google.

For news that happened today, there is still only one correct answer and it starts with a G.

Coding questions: ChatGPT Search takes the lead

We asked things like "what is the recommended pattern for streaming tool-use results from the Anthropic SDK in a Next.js 16 route handler" and "how do I configure Turborepo 3 with Bun 1.4 for a monorepo with a shared TypeScript config package." These are queries where the right answer is a working code snippet plus an explanation.

ChatGPT Search won 7 of 10. It tends to produce cleaner code blocks, explains the reasoning, and pulls the right patterns from recent GitHub repos and documentation. Perplexity's code was right slightly less often (6 of 10) and its explanations were drier. Google AI Mode did noticeably worse here (5 of 10): it kept surfacing Stack Overflow answers from 2022 as if they were current.

Shopping: Google AI Mode, for now

Shopping is the category where the tools diverge the most. Google AI Mode plugs into Google Shopping's catalog, which means it knows real current prices, real inventory, and real seller ratings. When we asked "what is the best sub-$300 mechanical keyboard with hot-swappable switches and a split layout," Google gave us five real products with current prices and links.

Perplexity Shopping (their dedicated mode) is serviceable but limited in inventory and often shows prices that are days out of date. ChatGPT Search's shopping experience is the weakest, often surfacing generic category pages instead of specific products. OpenAI is clearly still building this out.

Speed and the thing nobody talks about: variance

Typical response times in our tests:

Tool              | First token | Full response
------------------|-------------|----------------
Perplexity Pro    |   ~1.2s     |    5-9s
ChatGPT Search    |   ~1.8s     |    6-12s
Google AI Mode    |   ~0.9s     |    3-7s
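
For reference, the two numbers in the table are simple to measure in principle. Here is a sketch against a generic streaming HTTP endpoint; the URL and payload shape are hypothetical, since none of the three tools exposes exactly this interface, and this is not the exact instrumentation behind the table.

    # Sketch of timing first token vs. full response against a streaming
    # HTTP endpoint. The URL and payload shape are hypothetical; none of
    # the three tools exposes exactly this interface.
    import time
    import requests  # pip install requests

    def time_streamed_answer(url: str, query: str) -> tuple[float, float]:
        """Return (seconds to first chunk, seconds to full response)."""
        start = time.monotonic()
        first_chunk = None
        with requests.post(url, json={"query": query}, stream=True) as resp:
            resp.raise_for_status()
            for chunk in resp.iter_content(chunk_size=None):
                if chunk and first_chunk is None:
                    first_chunk = time.monotonic() - start
        total = time.monotonic() - start
        return (first_chunk if first_chunk is not None else total, total)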

Google is the fastest, which matters more than people admit. But the more interesting finding was variance. When we re-ran five queries 24 hours later, all three tools gave materially different answers to at least two of them. Different sources cited, different summaries, sometimes different conclusions. This is fine for "explain X" queries. It is a real problem for research you might cite in a decision.

Cost: $20 everywhere, kind of

Perplexity Pro is $20/month. ChatGPT Plus (which includes Search) is $20/month. Google AI Premium is $19.99/month and bundles Drive storage, which complicates the math depending on what you already pay Google. None of these are expensive relative to the value for a knowledge worker. The friction is not cost, it is context switching.

Citation quality, looked at closely

Beyond "did it cite a source," we checked whether each citation actually supported the claim it sat next to. This is where citation quality diverges most sharply. Perplexity's citations supported the adjacent claim 87% of the time in our sample; ChatGPT Search's hit 78%; Google AI Mode's hit 71%. The gap is partly structural: Perplexity ties each sentence to its sources tightly, while the other two often summarize across sources and then attribute a whole paragraph to all of them, which means the citation is technically correct but less useful when you are trying to verify a specific claim.

We also found "orphan citations," sources listed but never actually used, in all three tools. About 1 in 8 sources in Perplexity, 1 in 5 in ChatGPT Search, 1 in 4 in Google AI Mode. Not catastrophic, but worth knowing. If you are doing real research, click through and verify.
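
Orphan citations are the one failure mode you can check mechanically, at least for tools that use inline numbered markers. A sketch, assuming Perplexity-style [n] markers; ChatGPT Search and Google AI Mode format citations differently and would need their own parsing.

    # Sketch of an orphan-citation check for answers with inline [n]
    # markers (Perplexity's style; the other two tools format citations
    # differently and would need their own parsing).
    import re

    def orphan_sources(answer_text: str, num_sources: int) -> set[int]:
        """Return source numbers listed by the tool but never actually
        referenced inline in the answer."""
        cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer_text)}
        return set(range(1, num_sources + 1)) - cited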

Follow-up questions: the conversational dimension

Single-shot queries are only half the use case. The other half is following up: "tell me more about the second point," or "compare that to how the US handles it." ChatGPT Search is the best of the three at this, by a meaningful margin. It holds the conversation thread, remembers the earlier citations, and answers follow-ups coherently.

Perplexity is workable for follow-ups but sometimes forgets what it just told you and does a fresh search, which produces contradictions between answers. Google AI Mode's follow-up behavior is the weakest. It often treats each question as independent, which defeats the point of a conversational interface.

Where each one gets its data, and why it matters

The three tools are not drawing from the same pool, which explains most of the quality differences we saw. Google AI Mode has privileged access to the live Google index, Google News, Google Shopping, and YouTube transcripts. That is a moat no startup can match, and it is why Google wins on news and shopping. It is also why Google sometimes feels strangely old: the index ranks sites by a lot of signals, and recency is not always the dominant one.

Perplexity runs its own crawl plus partnerships with several publishers (Reuters, Axel Springer, and others signed deals throughout 2024 and 2025). Their index is narrower but tuned for the kind of content people actually want cited. ChatGPT Search uses Bing as its underlying index plus OpenAI's own sources. Bing's index is the reason ChatGPT Search often surfaces different results than Perplexity on the same query, and occasionally misses something obvious that Google would catch.

Privacy and ad exposure

This is a space where the three tools differ more than the marketing suggests. Google AI Mode is ad-supported everywhere except the paid tier, and answers are sometimes shaped by the presence of advertiser content. We did not find egregious cases of this in our testing, but the structural incentive exists. Perplexity is subscription-first, ad-light, and has been clearest about not selling query data for training. ChatGPT Search inherits OpenAI's broader privacy policies, which you should read before typing anything sensitive into it.

For professional research, Perplexity is the easiest to defend from a privacy standpoint. For everyday shopping and news, Google is fine, just be aware that commercial queries are commercial.

What we'd actually do

  1. If you pick one, pick Perplexity. It is the best generalist and it is the one we open by default.
  2. If you pay for ChatGPT Plus already, you have Search. Use it for coding and anything where you want a longer, more conversational answer.
  3. Keep Google AI Mode one tab away for news and shopping. It is free, it is fast, and it wins where the others lose.

And do not assume any of them got it right. All three hallucinated at least once in our sample, and all three showed meaningful variance between runs. For anything that matters, click through to the source. These tools are search, not oracles.