The early-stopping trap
Many A/B tests look like winners or losers within the first 1,000 visitors: the difference reads as statistically significant for a few hours, then converges to nothing once enough traffic accumulates. That early result is noise, not signal. Evan Miller's "How Not to Run an A/B Test" shows why peeking at results inflates false positives: every interim check is another chance to cross the significance threshold by luck.
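A quick simulation makes the inflation concrete. The sketch below (assuming numpy and scipy; the traffic numbers are illustrative) runs A/A tests where both variants convert at the same 3% rate, peeks with a two-proportion z-test every 100 visitors, and counts how often the test ever looks significant at the nominal 5% level:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def peeking_false_positive_rate(n_sims=1_000, n_visitors=10_000,
                                p=0.03, check_every=100):
    """A/A test: both arms convert at the same rate p. Peek with a
    two-proportion z-test every `check_every` visitors per arm and
    record whether the test EVER reads significant at p < 0.05."""
    z_crit = norm.ppf(0.975)           # two-sided 5% threshold
    false_positives = 0
    for _ in range(n_sims):
        # cumulative conversion counts for each arm
        ca = np.cumsum(rng.random(n_visitors) < p)
        cb = np.cumsum(rng.random(n_visitors) < p)
        for n in range(check_every, n_visitors + 1, check_every):
            pooled = (ca[n - 1] + cb[n - 1]) / (2 * n)
            se = np.sqrt(2 * pooled * (1 - pooled) / n)
            if se > 0 and abs(ca[n - 1] - cb[n - 1]) / n / se > z_crit:
                false_positives += 1
                break                  # a "peeker" would stop here
    return false_positives / n_sims

print(peeking_false_positive_rate())   # far above the nominal 0.05
```

With 100 peeks per test, the realized false-positive rate typically lands in the 25-40% range rather than the promised 5%, which is exactly the trap Miller describes.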
Sample sizes are bigger than you think
To detect a 5% relative lift on a 3% baseline conversion rate at 95% confidence and 80% power, you need roughly 200,000 visitors per variant; even a generous 25% lift needs about 9,000 per variant. Many small businesses simply don't get enough monthly traffic for traditional A/B testing on most metrics. For low-traffic sites, sequential analysis or qualitative testing (5-second tests, user interviews) is more honest than under-powered A/B tests.
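Those figures come straight from the standard two-proportion z-test power formula. A minimal calculator (assuming scipy for the normal quantiles; the function name is mine):

```python
import math
from scipy.stats import norm

def sample_size_per_variant(p_base, rel_lift, alpha=0.05, power=0.80):
    """Visitors per variant for a two-sided two-proportion z-test."""
    p_var = p_base * (1 + rel_lift)
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for 95% confidence
    z_power = norm.ppf(power)           # 0.84 for 80% power
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return math.ceil((z_alpha + z_power) ** 2 * variance
                     / (p_var - p_base) ** 2)

print(sample_size_per_variant(0.03, 0.05))   # ~208,000 per variant
print(sample_size_per_variant(0.03, 0.25))   # ~9,100 per variant
```

At an illustrative 1,000 visitors a month split across two variants, even the 25%-lift scenario takes well over a year to reach, which is why the sequential and qualitative alternatives above are often the honest choice.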
Document everything
Document every test, including the losers. Half the value of A/B testing is the cumulative knowledge of what doesn't work for your audience; that record is irreplaceable, and most teams throw it away after each test. The A/B testing framework covers what to test in what order.
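In practice, "documenting" can be as light as one structured record per experiment. A minimal sketch (the schema and field names here are assumptions, not a standard):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TestRecord:
    """One entry in a cumulative test log (hypothetical schema)."""
    hypothesis: str           # what we believed and why
    metric: str               # primary metric, fixed before launch
    start: date
    end: date
    visitors_per_variant: int
    result: str               # "win", "loss", or "inconclusive"
    lift: float | None        # observed relative lift, if any
    lesson: str               # what this teaches about the audience

log = [
    TestRecord(
        hypothesis="A shorter signup form raises completion",
        metric="signup conversion rate",
        start=date(2024, 3, 1),
        end=date(2024, 4, 15),
        visitors_per_variant=12_000,
        result="inconclusive",
        lift=None,
        lesson="Form length isn't the bottleneck here",
    ),
]
```

The lesson field is the one teams most often skip, and it's the one that compounds across tests.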