A/B testing is the lifeblood of technology companies. They experiment, fail fast, and quickly ship what works while discarding what does not. Teams often treat the test results as if they carried C-suite authority: whatever the experiment says must be followed. Yet experts at many marketing events in the USA have suggested paying less attention to the "sample size calculator" and the "paired t-test result."
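To make the "sample size calculator" concrete, here is a minimal sketch of what such a tool typically computes: the per-arm sample size for a two-sided, two-proportion z-test under the standard normal approximation. The function name and default parameters are my own assumptions, not from any specific product.

```python
from statistics import NormalDist

def sample_size_per_arm(p_base, uplift, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-proportion z-test.

    p_base: baseline conversion rate, e.g. 0.05 for 5%
    uplift: relative uplift to detect, e.g. 0.01 for a 1% lift
    """
    p_var = p_base * (1 + uplift)               # variant conversion rate
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_b = NormalDist().inv_cdf(power)           # desired power
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return (z_a + z_b) ** 2 * variance / (p_base - p_var) ** 2

# Detecting a 1% relative uplift on a 5% base rate needs ~3 million
# visitors per arm -- tiny effects demand enormous traffic.
n = sample_size_per_arm(0.05, 0.01)
```

The formula makes the article's later point tangible: the smaller the uplift you chase, the sample size grows with the inverse square of the effect, which is why small wins are so expensive to verify.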
The more you automate your A/B testing, the easier it becomes to think, "this works like a charm, and even a 1% uplift pays for my salary, right?"
Encouraging signups, converting sales, embedding fillable forms, collecting email addresses: optimizing a website for all of these goals is time-consuming and costly.
As reviewed at the Marketing 2.0 Conference, A/B tests and logical statements are now used to build landing pages and websites. In practice, every journey must be optimized by a human who analyzes the data and decides whether "image 1" is better than "image 2." This manual process is poor at determining which content combinations work well together and limits how finely audiences can be segmented. Furthermore, once marketers reach the industry-average conversion rate, or whatever KPI their boss has set, they tend to experiment less. Overall, the procedure is time-consuming and yields mediocre results.
So, what happens in practice? Why, after a year of hard work shipping 100 feature changes that each showed roughly a 1% uplift, did your final metric move by only 3%? Measured uplifts are noisy and biased upward (you ship the tests that happened to look good), and individual wins rarely add up. Let us take a look:
There are too many A/B tests running.
Have you ever heard the phrase, "we have a fantastic traffic splitting system that can run hundreds of experiments simultaneously"? It works well until one of those many experiments is not completely randomized with respect to the others. In practice, most traffic splitting happens on a subset of the population with some mutual exclusion. Now imagine an ML algorithm serving recommendations while a separate experiment runs on the service selection menu directly above those recommendations: the two treatments interact, and neither readout is clean.
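A common way such traffic splitters work is deterministic, salted hashing of the user ID per experiment. This is a minimal sketch of the pattern (function and experiment names are illustrative, not from any specific system); it shows why independently salted experiments overlap on the same users, which is exactly where non-randomized interactions creep in.

```python
import hashlib

def bucket(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically assign a user to a variant via a salted hash.

    Each experiment uses its own salt, so assignments across experiments
    are statistically independent -- but the same user still sees both
    treatments at once, so one treatment can contaminate the other's metric.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same user lands in some arm of every running experiment:
reco_arm = bucket("user-42", "reco-model-v2")
menu_arm = bucket("user-42", "menu-redesign")
```

Hashing rather than random draws keeps assignment sticky across sessions; the downside the article describes is that "independent" only holds if the treatments do not influence each other's outcomes.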
Holding all other variables constant and testing a single piece of content, such as a call to action (CTA) or an image, against another does not tell you which overall combination performs best.
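The gap between one-at-a-time testing and full combinations is easy to quantify. A minimal sketch, with made-up page elements: even a modest page yields dozens of full-page variants, while element-by-element tests only ever compare a handful.

```python
from itertools import product

# Hypothetical page elements under test (names and counts are illustrative):
elements = {
    "headline":   ["H1", "H2", "H3"],
    "hero_image": ["img1", "img2"],
    "cta":        ["Buy now", "Start free trial", "Learn more"],
    "layout":     ["L1", "L2"],
}

# Full factorial: every combination of every element.
combinations = list(product(*elements.values()))
print(len(combinations))  # 3 * 2 * 3 * 2 = 36 full-page variants

# One-at-a-time testing only covers 3 + 2 + 3 + 2 = 10 variants
# and can never detect interactions (e.g. a CTA that only works
# with a particular hero image).
one_at_a_time = sum(len(v) for v in elements.values())
```

Multiplicative growth is the point: add one more element with three options and the space triples, which is why exhaustive A/B testing of combinations quickly becomes infeasible.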
A/B testing is frequently used to test various business, operational, and technical metrics. Consider a new feature whose primary metric is "Revenue." It is physically impossible, in terms of time and traffic, to run enough A/B tests to determine the best combination for each audience. A/B tests are also poor at capturing constantly shifting visitor sentiment, and long-term effects such as retention are difficult to observe within a typical test window.
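The "physically impossible" claim can be checked with back-of-the-envelope arithmetic. All the numbers below are illustrative assumptions, not measurements:

```python
# How long would exhaustive per-audience testing take?
combinations   = 36        # full-page variants (e.g. 3 x 2 x 3 x 2 elements)
audiences      = 5         # segments you want a separate answer for
n_per_arm      = 100_000   # visitors per variant for a reliable read
daily_visitors = 50_000    # site traffic per day

total_visitors = combinations * audiences * n_per_arm
days = total_visitors / daily_visitors
print(days)  # 360.0 -- roughly a year of traffic for a single page
```

Even with generous traffic and a deliberately small per-arm sample, one page's combinations across a handful of audiences consume about a year, during which visitor sentiment and the market have already moved on.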
A/B Testing Result: One Size Fits All
Just a friendly reminder: A/B testing is a winner-take-all method. It is not personalized, and it tells you which variant won, not why users behaved the way they did.
A/B testing teaches you very little. "We do A/B testing to learn about the user," the claim goes, and the honest response is: "Really? What did you discover? That Version A outperformed Version B in conversion rate." That is not a deep lesson. You learned little about individual users, and little about why the version you tested won.
A/B testing is not a single-cause algorithm. It simply indicates that a variant performed better on average across the population under the current conditions. Global marketing events in 2022, like the Marketing 2.0 Conference, shed light on the importance of A/B testing and its possible alternatives.
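"Better on average" can hide the opposite result inside a segment. A minimal sketch with hypothetical numbers (all figures below are made up for illustration) shows a pooled winner that loses badly on mobile:

```python
# Hypothetical conversion data split by segment: (conversions, visitors).
segments = {
    "mobile":  {"A": (90, 1000), "B": (60, 1000)},
    "desktop": {"A": (30, 1000), "B": (80, 1000)},
}

def pooled_rate(variant):
    """Conversion rate ignoring segments -- what a plain A/B readout shows."""
    conv = sum(seg[variant][0] for seg in segments.values())
    n = sum(seg[variant][1] for seg in segments.values())
    return conv / n

# Pooled: B "wins" (7.0% vs 6.0%), so winner-take-all ships B ...
print(pooled_rate("A"), pooled_rate("B"))
# ... yet A converts 9.0% vs B's 6.0% on mobile. Shipping B to
# everyone leaves the mobile audience with the worse experience.
```

This is the one-size-fits-all failure mode in miniature: the population-average verdict is correct, personalized it is not.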