The main problem in SEO is that you rarely know what actually works. You rewrote 50 titles and two weeks later positions rose: was it the rewrites, a Google core update, competitors slipping, or seasonality? Unlike ads, where a classic A/B traffic split works, SEO cannot split traffic per visitor (serving Google a different version than users see is cloaking). So SEO A/B testing uses a different method: you split pages into matched groups, apply the change to one group only, and compare how the two groups move over time. This article walks through running such experiments correctly.
What can be SEO-tested
- Title and meta description — rewrite on some pages, keep old on the rest.
- Heading structure (H1, H2) — restructure on half the articles.
- Content length and format — expand articles in one group, leave the control as-is.
- Adding an FAQ block + FAQPage JSON-LD on half the pages.
- Internal links — add contextual links to group A, leave group B untouched.
- Loading speed — optimise LCP on a subset of templates.
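For the FAQ item in the list above, the markup can be generated rather than hand-written. The sketch below builds a schema.org FAQPage object from question/answer pairs; the function names and the sample content are hypothetical, and how the tag gets into your templates depends on your CMS.

```python
import json


def faq_jsonld(qa_pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def faq_script_tag(qa_pairs):
    """Serialise the markup as a <script> tag that can be embedded in the page."""
    payload = json.dumps(faq_jsonld(qa_pairs), ensure_ascii=False, indent=2)
    # Escape "</" so text inside an answer cannot close the script tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical sample content, for illustration only.
SAMPLE_FAQ = [
    ("How long should an SEO experiment run?", "At least 4 weeks."),
    ("Which metric matters most?", "Clicks from search."),
]

if __name__ == "__main__":
    print(faq_script_tag(SAMPLE_FAQ))
```

The questions and answers in the markup should match the FAQ text that is actually visible on the page.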
How to split pages into groups
The key mistake is grabbing "10 random pages" for the test group. If the test and control pages differ by topic, audience or competition, the gap between the groups reflects those differences rather than your change. The right approach is to match pairs of pages with similar characteristics, for example two categories, "smartphones" and "tablets", with similar product counts, current traffic and positions. Apply the change to the first and leave the second alone, then compare the deltas after 4–6 weeks. A usable sample starts at 20–30 pairs; with fewer, noise dominates and the result may be pure chance.
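The pairing step can be sketched in a few lines. This is a simplified illustration, not a full matching procedure: it assumes each page is a dict with hypothetical `url` and `clicks` fields and matches on traffic only, whereas in practice you would also compare topic, positions and competition. Deciding by coin flip which page of a pair gets the change is an addition of this sketch, meant to keep personal preference out of the assignment.

```python
import random


def matched_pairs(pages, seed=42):
    """Pair pages with similar traffic and pick the test page of each pair at random.

    pages: list of dicts with 'url' and 'clicks' keys (hypothetical field names).
    Returns a list of (test_url, control_url) tuples.
    """
    rng = random.Random(seed)  # fixed seed makes the assignment reproducible
    ranked = sorted(pages, key=lambda p: p["clicks"], reverse=True)
    pairs = []
    # Neighbours in the ranking have the closest traffic, so pair them up.
    # With an odd number of pages, the lowest-traffic page is left out.
    for i in range(0, len(ranked) - 1, 2):
        first, second = ranked[i], ranked[i + 1]
        if rng.random() < 0.5:
            first, second = second, first
        pairs.append((first["url"], second["url"]))
    return pairs


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = [{"url": f"/category-{i}", "clicks": 200 + 15 * i} for i in range(8)]
    for test_url, control_url in matched_pairs(sample):
        print(f"test: {test_url:<14} control: {control_url}")
```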
Experiment duration
Minimum 4 weeks, optimum 6–8. With a shorter window the search engine has no time to re-evaluate the page: indexing plus signal recalculation takes 2–3 weeks. With a longer one, isolating the effect gets harder because other factors pile up (core updates, seasonality, competitor moves). Ideal: 6 weeks of experiment plus 2 weeks of "settling" to confirm that the effect is stable and not a one-off spike.
Which metrics to track
The main metric is clicks, taken from GSC (Google Search Console) for Google and from Yandex Webmaster for Yandex. Clicks, not just positions, are the real signal that the page performs better. Also track impressions, average position and CTR. Compute the delta for the test group and for the control group over the same period. If clicks grew 25% in test and 5% in control (seasonal growth, say), the change itself contributed roughly 20 percentage points. If both groups grew equally, the change did nothing and the lift came from external factors.
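The comparison above boils down to subtracting the control group's growth from the test group's growth. A minimal sketch, using made-up click totals chosen to reproduce the 25% and 5% figures; the function names are hypothetical.

```python
def relative_growth(before, after):
    """Relative change in clicks between two periods, e.g. 0.25 for +25%."""
    if before <= 0:
        raise ValueError("the 'before' period needs a positive click count")
    return (after - before) / before


def net_effect(test_before, test_after, control_before, control_after):
    """Growth of the test group minus growth of the control group.

    The control group's growth stands in for everything that happened
    anyway (seasonality, core updates), so the remainder is attributed
    to the change itself.
    """
    return relative_growth(test_before, test_after) - relative_growth(
        control_before, control_after
    )


if __name__ == "__main__":
    # Illustrative totals: test group +25%, control group +5%.
    effect = net_effect(
        test_before=1000, test_after=1250, control_before=1000, control_after=1050
    )
    print(f"net effect: {effect * 100:.1f} pp")  # net effect: 20.0 pp
```

Both periods should have the same length and cover the same calendar dates for both groups; otherwise the subtraction no longer cancels out the external factors.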
Common SEO A/B test mistakes
- Too small a sample: with 5–10 pages noise dominates and the result is effectively random.
- No control group: without a comparison you can't tell the effect of the change from external factors.
- Test and control groups differ in characteristics: you end up comparing the incomparable.
- Too short an experiment window: the search engine has not re-evaluated the pages yet.
- Multiple changes applied in parallel: you can't isolate which one worked.
A practical experiment example
Hypothesis: adding an FAQ block at the bottom of a blog article lifts positions on long-tail queries. Experiment: pick 40 articles with similar traffic (200–500 clicks/month) and randomly split them into 2 groups of 20. In the first group, add an FAQ with 5–7 questions plus FAQPage JSON-LD; the control group stays unchanged. Duration: 6 weeks. Metric: clicks on long-tail queries (more than 5 words). Result after 6 weeks: long-tail clicks rose 34% on average in the test group and 8% in the control, a net effect of +26 percentage points. The hypothesis is confirmed, so roll the FAQ block out to the remaining 200 articles.
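The long-tail metric from this example can be computed from exported query data. The sketch below assumes each row is a dict with hypothetical `query` and `clicks` keys (the actual column names depend on how you export from GSC) and treats "long-tail" as a query of more than 5 words, as in the experiment.

```python
def is_long_tail(query, min_words=6):
    """True when the query has more than 5 words (at least min_words)."""
    return len(query.split()) >= min_words


def long_tail_clicks(rows, min_words=6):
    """Sum clicks over long-tail queries only.

    rows: iterable of dicts with 'query' and 'clicks' keys (assumed layout).
    """
    return sum(
        row["clicks"] for row in rows if is_long_tail(row["query"], min_words)
    )


if __name__ == "__main__":
    # Hypothetical rows for one article.
    sample_rows = [
        {"query": "seo ab test", "clicks": 40},
        {"query": "how to run an seo ab test on a blog", "clicks": 6},
        {"query": "does an faq block help long tail rankings", "clicks": 3},
    ]
    print(long_tail_clicks(sample_rows))  # 9
```

Run it once on the pre-experiment period and once on the experiment period for every article, then compare growth in the test and control groups.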
Frequently asked
Do I need special tools for SEO A/B tests?
Not strictly, but they are recommended. Site Metrics Tool lets you tag pages (test/control), and its dashboard auto-computes per-group deltas. Without a dedicated tool you can use GSC + Excel, but analysing each experiment then takes 3–4 hours.
How many SEO A/B tests per year make sense?
For one team, 4–8 experiments a year is realistic. Each takes 6–8 weeks of run time and a month of prep. With more than that, experiments start to overlap, their effects mix, and you lose control over what caused what.
What if the result sits on the edge of statistical significance?
Extend the experiment by 2–4 weeks or grow the sample. If the result is still "on the edge" after that, the effect is likely weak and not worth scaling. Test a different hypothesis instead.
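The article does not prescribe a particular significance test. One option that works without extra libraries is a permutation test on per-page click growth, sketched below under that assumption: it repeatedly reshuffles the test/control labels and counts how often a random split produces a gap at least as large as the observed one. The function name, the sample numbers and the 10,000-resample default are illustrative choices.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(test_deltas, control_deltas, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    test_deltas / control_deltas: per-page relative click growth,
    e.g. 0.34 for +34%. Returns an estimated p-value.
    """
    rng = random.Random(seed)
    observed = _mean(test_deltas) - _mean(control_deltas)
    pooled = list(test_deltas) + list(control_deltas)
    n_test = len(test_deltas)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign labels at random and measure the gap again.
        rng.shuffle(pooled)
        diff = _mean(pooled[:n_test]) - _mean(pooled[n_test:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimate from ever being exactly 0.
    return (extreme + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Hypothetical per-page growth figures for two small groups.
    test = [0.41, 0.28, 0.35, 0.30, 0.39, 0.22, 0.44, 0.33]
    control = [0.10, 0.05, 0.12, 0.02, 0.09, 0.07, 0.11, 0.06]
    print(f"p-value: {permutation_p_value(test, control):.4f}")
```

A p-value hovering around the conventional 0.05 threshold is the "on the edge" case from the answer above. If the pages were assigned as matched pairs rather than as two independent groups, a paired variant (randomly flipping the sign of each pair's difference) would be the closer fit.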