For a long time, responsive design dominated the web as the format of choice for business and personal sites. Now, however, mobile optimization has begun to gain credence as a potentially preferable strategy. Mobile optimization refers to optimizing a website specifically for mobile devices. Instead of simply compressing and slightly rearranging the content on the screen, you design the entire experience for smaller screens. You’ve probably heard the term “mobile-friendly.” It’s a bit outdated, so even though it sounds like a good thing, it’s not enough. People are using their mobile devices more and more, as I’ll explain in a…
If you want to craft a delightful marketing experience and you’re using popups, you need to make sure you hold them to the same high standards as the content they are covering up. You can learn a lot by looking at bad website popup examples.
Once you understand what not to do, you’ll default to starting your own popup designs from a better baseline.
What does a bad popup design actually look like?
Well, it depends on your judging criteria, and for the examples below, I was considering these seven things, among others:
Clarity: Is it easy to figure out the offer really quickly?
Relevance: Is it related to the content of the current page?
Manipulation: Does it use psychological trickery in the copy?
Design: Is it butt ugly?
Control: Is it clear what all options will do?
Escape: Can you get rid of it easily?
Value: Is the reward worth more than the perceived (or actual) effort?
The following popup examples each make a number of critical errors in their design decisions. Take a look, and share your own worst popup design examples in the comments!
#1 – Mashable Shmashable
What’s so bad about it?
If you peer into the background behind the popup, you’ll see a news story headline that begins with “Nightmare Alert”. I think that’s a pretty accurate description of what’s happening here.
Design: Bad. The first thing I saw looks like a big mistake. The green line with the button hanging off the bottom looks like the designer fell asleep with their head on the mouse.
Clarity: Bad. And what on earth does the headline mean? click.click.click. Upon deeper exploration, it’s the name of the newsletter, but that’s not apparent at all on first load.
Clarity: Worse. Then we get the classic “Clear vs. Clever” headline treatment. Why are you talking about the pronunciation of the word “Gif”? Tell me what this is, and why I should care enough to give you my email.
Design: Bad. Also, that background is gnarly.
#2 – KAM Motorsports Revolution!
What’s so bad about it?
It’s motorsports. It’s not a revolution. Unless they’re talking about wheels going round in circles.
Clarity: Bad. The headline doesn’t say what it is, or what I’ll get by subscribing. I have to read the fine print to figure that out.
Copy: Bad. Just reading the phrase “abuse your email” is a big turn-off. Just like with the word spam: I wasn’t thinking you were going to abuse me, but now it’s on my mind.
Relevance: Bad. Newsletter subscription popups are great, they have a strong sense of utility and can give people exactly what they want. But I don’t like them as entry popups. They’re much better when they use an exit trigger, or a scroll trigger. Using a “Scroll Up” trigger is smart because it means they’ve read some of your content, and they are scrolling back up vs. leaving directly, which is another micro-signal that they are interested.
#3 – Utterly Confused
(Source unknown – I found it on confirmshaming.tumblr.com)
What’s so bad about it?
I have no earthly clue what’s going on here.
Clarity: Bad. I had to re-read it five times before I figured out what was going on.
Control: Bad. After reading it, I didn’t know whether I would be agreeing with what they’re going to give me, or with the statement. It’s like an affirmation or something. But I have no way of knowing what will happen if I click either button. My best guess after spending this much time writing about it is that it’s a poll. But a really meaningless one if it is. Click here to find out how many people agreed with “doing better”…
It ends with “Do Better”. I agree. They need to do a lot better.
#4 – Purple Nurple
What’s so bad about it?
Manipulation: Bad. Our first “Confirm Shaming” example. Otherwise known as “Good Cop / Bad Cop”. Forcing people to click a button that says “Detest” on it is so incongruent with the concept of a mattress company that I think they’re just being cheap. There’s no need to speak to people that way.
I found a second popup example by Purple (below), and have to give them credit. The copy on this one is significantly more persuasive. Get this. If you look at the section I circled (in purple), it says that if you subscribe, they’ll keep you up to date with SHIPPING TIMES!!! Seriously? If you’re going to email me and say “Hey Oli, great news! We can ship you a mattress in 2 weeks!”, I’ll go to Leesa, or Endy, or one of a million other Casper copycats.
#5 – Hello BC
What’s so bad about it?
Context: This is an entry popup, and I have never been to this site before.
Relevance: Bad. The site is Hellobc.com, the title says “Supernatural British Columbia”, and the content on the page is about skydiving. So what list is this for? And nobody wants to be on a “list”, stop saying “list”. It’s like saying email blast. Blast your list. If you read the first sentence it gets even more confusing, as you’ll be receiving updates from Destination BC. That’s 4 different concepts at play here.
Design: Bad. It’s legitimately butt ugly. I mean, come on. This is for Beautiful Supernatural British Columbia ffs. It’s stunning here. Show some scenery to entice me in.
Value: Bad. Seeing that form when I arrive on the page is like a giant eff you. Why do they think it’s okay to ask for that much info, with that much text, before I’ve even seen any content?
Control: Bad. And there’s no error handling. Instead, the submit button remains inactive until you magically click the right number of options to trigger its hungry hungry hippo mouth to open.
Well, that’s all for today, folks. You might be wondering why there were so few popup examples in this post. Honestly, when the team was rallying to find me a bunch of examples, we all struggled to find many truly awful ones. We also struggled to find many really awesome ones.
This is where YOU come in!
Send me your terrible and awesome popup examples!
If you have any wonderfully brutal, or brutally wonderful examples of website popup design, I’d really appreciate a URL in the comments. If you could share the trigger details too that would be rad (e.g. exit, entrance, scroll, delay etc.).
Tomorrow’s Post is about Awesome Popup Examples! YAY.
So get your butt back here same time tomorrow, where I’ll be sharing my brand new Popup Delight Equation that you can use to grade your own popup designs.
Today I want to talk a bit about what it’s like, as a marketer, to be marketing something that’s difficult to market.
You see, there’s a common problem that many marketers face, and it’s also one of the most asked questions I hear when I’m on the road, as a speaker:
“How do I do great marketing for a boring product or service?”
That’s a tough challenge for sure, although the good news is that if you can inject some originality you’ll be a clear winner, as all of your competitors are also boring. However, I think I can one-up that problem:
“How do I do great marketing for something that’s universally hated, like popups?”
We knew we had a big challenge ahead of us when we decided to release the popups product because of the long legacy of manipulative abuse it carries with it.
In fact, as the discussion about product direction began in the office, there were some visceral (negative) reactions from some folks on the engineering team. They feared that we were switching over to the dark side.
It makes sense to me that this sentiment would come from developers. In my experience, really good software developers have one thing in common. They want to make a difference in the world. Developers are makers by design, and part of building something is wanting it to have a positive impact on those who use it.
To quell those types of fears requires a few things:
Education about the positive use cases for the technology,
Evidence in the form of good popup examples, showcasing how to use them in a delightful and responsible manner,
Features such as advanced triggers & targeting to empower marketers to deliver greater relevance to visitors,
And most important of all – it requires us to take a stance. We can’t change the past unless we lead by example.
It’s been my goal since we started down this path, to make it clear that we are drawing a line in the sand between the negative past, and a positive future.
Which is why we initially launched with the name “Overlays” instead of popups.
Overlays vs. Popups – The End of an Era
It made a lot of sense at the time, from a branding perspective. Through podcast interviews and public speaking gigs, I was trying to change the narrative around popups. Whenever I was talking about a bad experience, I would call it a popup. When it was a positive (and additive) experience, I’d call it an overlay. It was a really good way to create a clear separation.
I even started to notice more and more people calling them overlays. Progress.
Unfortunately, making a dent in the global perception of the terminology would still require a lot of continued education. That, combined with the tiny search volume for “overlays” compared to popups, factored heavily into our decision to pivot back to calling a popup a popup.
Positioning is part of a product marketer’s job – our VP of Product Marketing, Ryan Engley, recently completed the positioning document for the new products. Just as the umbrella term “Convertables” we had been using for popups and sticky bars had created confusion, “Overlays” was again making the job harder than it should have been. You can tell, just from reading this paragraph, that it’s a complex problem, and we’re moving in the right direction by re-simplifying.
The biggest challenge developing our positioning was the number of important strategic questions that we needed to answer first. The market problems we solve, for who, how our product fits today with our vision for the future, who we see ourselves competing with, whether we position ourselves as a comprehensive platform that solves a unique problem, or whether we go to market with individual products and tools etc. It’s a beast of an undertaking.
My biggest lightbulb moment was working with April Dunford who pushed me to get away from competing tool-to-tool with other products. She said in order to win that way, you’d have to be market leading in every tool, and that won’t happen. So what’s the unique value that only you offer and why is it important?
Let’s get back to the subject of popups. I think it’s important to look back at the history of this device to better understand how they came about, and why they have always caused such a stir.
Browser Interaction Models & the History of the Popup
The talk I gave for much of last year was called Data-Driven Design. As part of the talk, I get into interaction design trends. I’ve included the “Trendline” slide below.
You can see that the first occurrence of a popup was back in 1998. Also, note that I included Overlays in late 2016 when we first started that discussion.
Like many bad trends, popups began as web developers started trying to hack browser behavior to create different interruptive interaction modes. I know I made a lot of them back in the day, but I was always doing it to try to create a cool experience. For example, I was building a company Intranet and wanted to open up content in a new window, resize it, and stick it to the side of the screen as a sidebar navigation for the main window. That was all good stuff.
Tabbed browsers have done a lot to help clean up the mess of multiple windows, and if you couple that with popup blockers, there’s a clear evolution in how this type of behavior is being dealt with.
Then came the pop-under, often connected to malware schemes where malicious scripts could be running in the background without you even knowing.
And then the always fun “Are you sure you want to do that?” Inception-like looping exit dialogs.
So we have a legacy of abuse that’s killed the perception of popups.
What if Popups Had Been Built Into Browsers?
Imagine for a moment that a popup was simply one of many interaction models available in the browsing experience. They could have had a specification from the W3C, with a set of acceptable criteria for display modes. It would be an entirely different experience. Sure, there would still be abuse, but it’s an interesting thought.
This is why it’s important that we (Unbounce and other like-minded marketers and Martech software providers) take a stance, and build the right functionality into this type of tool so that it can be used responsibly.
Furthermore, we need to keep the dialog going, to educate the current and future generations of marketers that to be original, be delightful, be a business that represents themselves as professionals, means taking responsibility for our actions and doing everything we can to take the high road in our marketing.
I’ll leave you with this thought:
Technology is NOT the problem. We are.
It’s the disrespectful and irresponsible marketers who use manipulative pop-psychology tactics for the sake of a few more leads, who are the problem. We need to stop blaming popups for bad experiences, and instead, call out the malicious marketers who are ruining it for those trying to do good work.
It’s a tough challenge to reverse years of negative perception, but that’s okay. It’s okay because we know the value the product brings to our customers, how much extra success they’re having, and because we’ve built a solution that can be configured in precise ways that make it simple to use in a responsible manner (if you’re a good person).
Get your butt back here tomorrow to see 20+ delightful website popup examples. More importantly, I’ll also be sharing “The Delight Equation”, my latest formula for quantifying how good your popups really are.
Indian Independence Day is right around the corner. For consumers in India, it’s a day of rejoicing and celebration. And for marketers, it opens a box of opportunities: the spirit of independence can be leveraged to shape consumers’ buying decisions.
In India, especially during major festivals and occasions like Independence Day, you can expect cutthroat rivalry among major brands. And yet, there are big winners in such intense situations.
How does this happen?
What are the strategies and tactics that these brands deploy to successfully pull off a nationwide campaign?
We studied various campaigns of India’s largest online brands to find out the answer.
And we saw that there were five different ploys deployed to pique the interest of the average online consumer in India that resulted in the success of these campaigns.
1. Tapping into consumers’ emotions
Independence Day is the time of the year when citizens are filled with joy and hopes of prosperity for the whole nation. Marketers understand these emotions very well and know how to leverage them to their advantage.
A fitting example would be the outstation campaign by Ola, one of the largest cab aggregators in India.
When Independence Day falls close to a weekend, people love to travel. Weekend getaways are popular, and folks love to spend time with friends and relatives at nearby places.
Ola appealed to its customers’ emotions by offering them outstation deals during the Independence week. The company even offered an INR 300 discount for its first-time outstation users. Ola also partnered with Club Mahindra and Yatra to offer deals on hotel stays.
The campaign encourages taking a holiday while positioning Ola as a viable brand for traveling to nearby getaways.
2. Limited Period Offer
The sad part of these festive sales and offers is that they have to end after a short span; these campaigns generally run for 2 to 5 days around the festival.
For example, the Flipkart Freedom Sale, which celebrates India’s spirit of independence, ran for only 4 days, so people had limited time to buy what they wanted.
Most consumers plan their purchases around such special occasions to get the best deals on the products they want. And these marketing events, sales, and giveaways always come with an expiration date.
Setting up such a trigger pushes prospective buyers to make purchases fast, to avoid missing out on the deals.
3. Creating a Sense of Urgency with the help of Micro Events
Some brands build on the limited nature of the sale and come out all guns blazing to create a sense of urgency.
On top of the limited duration of the sale event, there are a few micro-events incorporated into the sale, each running from a few hours down to a few minutes. These deals are exclusive to people who can decide and act fast, as they come with an additional discount.
Amazon does this very well with their lightning deals, which generally last from 2-6 hours throughout the event (which itself is 4 days long). Lightning deals add a discount on top of the already stated discount. The catch is the limited time and the sense of urgency it creates.
If people want to buy a product that has a lightning deal, they have to add it to their cart and check out within 15 minutes, or the deal is gone forever.
4. Exclusive Product Launch
These festive events also leverage the audience’s interest by offering exclusive products during a sale.
This is highly useful for building anticipation among shoppers. In India, Amazon used it to attract consumers in the smartphone market; India is a mobile-first country, where over half the population owns a smartphone.
Amazon saw huge boosts in smartphone sales through exclusive launches of various devices such as the BlackBerry KEYone, LG Q6, and the OnePlus 5’s Soft Gold variant. The result was a massive 10X increase in sales for Amazon through their Big Indian Sale event alone.
5. Omnichannel Promotion and User Experience
Most major brands understand their users and customers. India is predominantly a mobile-first market with a decent penetration when it comes to computers. People love to shop using their mobile devices as well as use their laptops or PCs to make a purchase.
And most users want omnichannel access to the brand of their choice. We saw that a major chunk of brands embraced this philosophy over the Independence week.
For instance, my primary communication happens on my cell phone. Brands saw that I interacted far more on my phone than over email or on their website, so most of the promos I received came via mobile push or in-app messages rather than email.
Also, there were deals that promoted usage of multiple channels to buy products. Grofers offered an INR 100 discount to shoppers who were open to buying stuff using their mobile app.
Appeal to Your Customers’ Emotions; Don’t Stop Experimenting
Customers are spoilt for choice when the whole nation is celebrating. In these times, marketers need not be intimidated or overwhelmed; they have to leverage these emotions and keep building experiences with the help of experimentation.
These are the major strategies that brands have demonstrated to be effective. You need to understand your customers’ emotional cues and create an effective campaign accordingly.
By tapping into those cognitive tendencies, you can build healthy, long-term relationships with your customers.
HTTPS is a must for every website nowadays: Users are looking for the padlock when providing their details; Chrome and Firefox explicitly mark websites that provide forms on pages without HTTPS as being non-secure; it is an SEO ranking factor; and it has a serious impact on privacy in general.
Additionally, there is now more than one option to get an HTTPS certificate for free, so switching to HTTPS is only a matter of will.
There’s a technique for improving one’s user interface design skills that is the most efficient way I know of expanding one’s visual vocabulary but that I’ve rarely heard mentioned by digital designers.
What’s going on here?
I’m talking about copywork. Copywork is a technique that writers and painters have been using for centuries. It is the process of recreating an existing work as closely as possible in order to improve one’s skill.
As the saying goes, “A picture is worth a thousand words.” Human beings are highly visual creatures who are able to process visual information almost instantly; 90 percent of all information that we perceive and that gets transmitted to our brains is visual. Images can be a powerful way to capture users’ attention and differentiate your product. A single image can convey more to the observer than an elaborate block of text.
A few weeks ago, a Fortune 500 company asked that I review their A/B testing strategy.
The results were good, the hypotheses strong, everything seemed to be in order… until I looked at the log of changes in their testing tool.
I noticed several blunders: in some experiments, they had adjusted the traffic allocation for the variations mid-experiment; some variations had been paused for a few days, then resumed; and experiments were stopped as soon as statistical significance was reached.
When it comes to testing, too many companies worry about the “what”, or the design of their variations, and not enough worry about the “how”, the execution of their experiments.
Don’t get me wrong, variation design is important: you need solid hypotheses supported by strong evidence. However, if you believe your work is finished once you have come up with variations for an experiment and pressed the launch button, you’re wrong.
In fact, the way you run your A/B tests is the most difficult and most important piece of the optimization puzzle.
There are three kinds of lies: lies, damned lies, and statistics.
– Mark Twain
In this post, I will share the biggest mistakes you can make within each step of the testing process: the design, launch, and analysis of an experiment, and how to avoid them.
This post is fairly technical. Here’s how you should read it:
If you are just getting started with conversion optimization (CRO), or are not directly involved in designing or analyzing tests, feel free to skip the more technical sections and simply skim for insights.
If you are an expert in CRO or are involved in designing and analyzing tests, you will want to pay attention to the technical details. These sections are highlighted in blue.
Mistake #1: Your test has too many variations
The more variations, the more insights you’ll get, right?
Not exactly. Having too many variations slows down your tests but, more importantly, it can impact the integrity of your data in 2 ways.
First, the more variations you test against each other, the more traffic you will need, and the longer you’ll have to run your test to get results that you can trust. This is simple math.
But the issue with running a longer test is that you are more likely to be exposed to cookie deletion. If you run an A/B test for more than 3–4 weeks, the risk of sample pollution increases: in that time, people will have deleted their cookies and may enter a different variation than the one they were originally in.
Within 2 weeks, you can get a 10% dropout of people deleting cookies and that can really affect your sample quality.
The second risk when testing multiple variations is that the chance of a false positive goes up as the number of variations increases.
For example, if you use the accepted significance level of 0.05 and decide to test 20 different scenarios, on average one of those will appear significant purely by chance (20 × 0.05 = 1). If you test 100 different scenarios, that number goes up to five (100 × 0.05 = 5).
In other words, the more variations, the higher the chance of a false positive, i.e. the higher your chances of declaring a winner that isn’t actually better.
Google’s 41 shades of blue is a good example of this. In 2009, when Google could not decide which shade of blue would generate the most clicks on their search results page, they decided to test 41 shades. At a 95% confidence level, with 40 comparisons the chance of getting at least one false positive was about 87%. With 10 shades (9 comparisons) it would have been roughly 37%, with 3 shades about 10%, and with 2 shades the nominal 5%.
You can calculate the chance of getting at least one false positive using the following formula: 1-(1-a)^m, with m being the number of comparisons you are making and a being the significance level. With a significance level of 0.05, the equation would look like this:
1-(1-0.05)^m or 1-0.95^m.
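To make this concrete, here is a small Python sketch of the same formula (the function name is mine), so you can plug in your own number of comparisons:

```python
# Probability of at least one false positive (the "family-wise" error
# rate) when running m independent comparisons at significance level a.
def false_positive_probability(m, a=0.05):
    return 1 - (1 - a) ** m

# A single A/B comparison keeps the nominal 5% rate;
# more comparisons inflate it quickly.
print(round(false_positive_probability(1), 3))   # 0.05
print(round(false_positive_probability(10), 3))  # 0.401
print(round(false_positive_probability(40), 3))  # 0.871
```

Note that this assumes the comparisons are independent, which is the same simplification the formula itself makes.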
You can fix the multiple comparison problem using the Bonferroni correction, which calculates the confidence level for an individual test when more than one variation or hypothesis is being tested.
Wikipedia illustrates the Bonferroni correction with the following example: “If an experimenter is testing m hypotheses, [and] the desired significance level for the whole family of tests is a, then the Bonferroni correction would test each individual hypothesis at a significance level of a/m.
For example, if [you are] testing m = 8 hypotheses with a desired a = 0.05, then the Bonferroni correction would test each individual hypothesis at a = 0.05/8=0.00625.”
In other words, you’ll need a 0.625% significance level, which is the same as a 99.375% confidence level (100% – 0.625%) for an individual test.
The Bonferroni correction tends to be a bit too conservative and is based on the assumption that all tests are independent of each other. However, it demonstrates how multiple comparisons can skew your data if you don’t adjust the significance level accordingly.
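As a rough illustration (not tied to any particular testing tool), the Bonferroni-adjusted per-test levels can be computed in a couple of lines:

```python
# Bonferroni correction: to keep the family-wise false positive rate at
# a, test each of the m hypotheses at the stricter level a / m.
def bonferroni_level(m, a=0.05):
    return a / m

for m in (1, 2, 4, 8):
    per_test = bonferroni_level(m)
    # With the corrected per-test level, the family-wise rate stays <= a.
    family_wise = 1 - (1 - per_test) ** m
    print(f"m={m}: test each at {per_test:.5f} "
          f"(family-wise rate ~{family_wise:.4f})")
```

For m = 8 this reproduces the 0.05/8 = 0.00625 per-test level from the Wikipedia example above.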
The following tables summarize the multiple comparison problem.
Probability of a false positive with a 0.05 significance level:
Adjusted significance and confidence levels to maintain a 5% false discovery probability:
In this section, I’m talking about the risks of testing a high number of variations in an experiment. But the same problem also applies when you test multiple goals and segments, which we’ll review a bit later.
Each additional variation and goal adds a new combination of comparisons to an experiment. In a scenario with four variations and four goals, that’s 16 potential outcomes that need to be controlled for separately.
Some A/B testing tools, such as VWO and Optimizely, adjust for the multiple comparison problem. These tools will make sure that the false positive rate of your experiment matches the false positive rate you think you are getting.
In other words, the false positive rate you set in your significance threshold will reflect the true chance of getting a false positive: you won’t need to correct and adjust the confidence level using the Bonferroni or any other methods.
One final problem with testing multiple variations can occur when you are analyzing the results of your test. You may be tempted to declare the variation with the highest lift the winner, even though there is no statistically significant difference between the winner and the runner up. This means that, even though one variation may be performing better in the current test, the runner up could “win” in the next round.
You should consider both variations as winners.
Mistake #2: You change experiment settings in the middle of a test
When you launch an experiment, you need to commit to it fully. Do not change the experiment settings, the test goals, the design of the variation or of the Control mid-experiment. And don’t change traffic allocations to variations.
Changing the traffic split between variations during an experiment will impact the integrity of your results because of a problem known as Simpson’s Paradox. This statistical paradox appears when a trend that shows up in several groups of data disappears or reverses when those groups are combined.
Ronny Kohavi from Microsoft shares an example wherein a website gets one million daily visitors, on both Friday and Saturday. On Friday, 1% of the traffic is assigned to the treatment (i.e. the variation), and on Saturday that percentage is raised to 50%.
Even though the treatment has a higher conversion rate than the Control on both Friday (2.30% vs. 2.02%) and Saturday (1.2% vs. 1.00%), when the data is combined over the two days, the treatment seems to underperform (1.20% vs. 1.68%).
This is because we are dealing with weighted averages: Saturday, a day with an overall worse conversion rate, accounts for nearly all of the treatment’s traffic but only a third of the Control’s.
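Ronny Kohavi’s two-day example can be replayed in a few lines of Python; the daily rates below come straight from the scenario above, while the variable names are mine:

```python
# Reproducing the Simpson's Paradox example: the treatment wins on each
# day, yet loses once the two days are pooled, because the traffic
# split (1% vs. 50%) changed between Friday and Saturday.
visitors = 1_000_000  # daily visitors

days = {
    # day: (treatment share, treatment conv. rate, control conv. rate)
    "Friday":   (0.01, 0.0230, 0.0202),
    "Saturday": (0.50, 0.0120, 0.0100),
}

t_conv = t_n = c_conv = c_n = 0
for share, t_rate, c_rate in days.values():
    t_n += visitors * share
    c_n += visitors * (1 - share)
    t_conv += visitors * share * t_rate
    c_conv += visitors * (1 - share) * c_rate

print(f"treatment: {t_conv / t_n:.2%}")  # 1.22%
print(f"control:   {c_conv / c_n:.2%}")  # 1.68%
```

Pooled, the treatment appears to lose, even though it won on both days taken individually.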
We will return to Simpson’s Paradox in just a bit.
Changing the traffic allocation mid-test will also skew your results because it alters the sampling of your returning visitors.
Changes made to the traffic allocation only affect new users. Once visitors are bucketed into a variation, they will continue to see that variation for as long as the experiment is running.
So, let’s say you start a test by allocating 80% of your traffic to the Control and 20% to the variation. Then, after a few days you change it to a 50/50 split. All new users will be allocated accordingly from then on.
However, all the users that entered the experiment prior to the change will be bucketed into the same variation they entered previously. In our current example, this means that the returning visitors will still be assigned to the Control and you will now have a large proportion of returning visitors (who are more likely to convert) in the Control.
Note: This problem of changing traffic allocation mid-test only happens if you make a change at the variation level. You can change the traffic allocation at the experiment level mid-experiment. This is useful if you want to have a ramp up period where you target only 50% of your traffic for the first few days of a test before increasing it to 100%. This won’t impact the integrity of your results.
As I mentioned earlier, the “do not change mid-test rule” extends to your test goals and the designs of your variations. If you’re tracking multiple goals during an experiment, you may be tempted to change what the main goal should be mid-experiment. Don’t do it.
All Optimizers have a favorite variation that we secretly hope will win during any given test. This is not a problem until you start giving weight to the metrics that favor this variation. Decide on a goal metric that you can measure in the short term (the duration of a test) and that can predict your success in the long term. Track it and stick to it.
It is useful to track other key metrics to gain insights and/or debug an experiment, if something looks wrong. However, these are not the metrics you should look at to make a decision, even though they may favor your favorite variation.
Let’s say you have avoided the 2 mistakes I’ve already discussed, and you’re pretty confident about the results you see in your A/B testing tool. It’s time to analyze the results, right?
Not so fast! Did you stop the test as soon as it reached statistical significance?
I hope not…
Statistical significance should not dictate when you stop a test. It only tells you if there is a difference between your Control and your variations. This is why you should not wait for a test to be significant (because it may never happen) or stop a test as soon as it is significant. Instead, you need to wait for the calculated sample size to be reached before stopping a test. Use a test duration calculator to understand better when to stop a test.
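If you don’t have a calculator handy, the standard two-proportion power calculation behind most test duration calculators looks roughly like this (a sketch assuming a two-sided test; the function name and defaults are mine):

```python
from statistics import NormalDist
from math import ceil, sqrt

# Rough sample-size sketch for a two-proportion test: how many visitors
# per variation you need before stopping, given a baseline conversion
# rate and the minimum relative lift you care about detecting.
def sample_size_per_variation(p1, lift, alpha=0.05, power=0.80):
    p2 = p1 * (1 + lift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)          # desired power
    p_bar = (p1 + p2) / 2
    n = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
          + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p1 - p2) ** 2)
    return ceil(n)

# e.g. a 5% baseline rate and a 20% relative lift (5% -> 6%)
# works out to roughly 8,000+ visitors per variation.
print(sample_size_per_variation(0.05, 0.20))
```

Divide the per-variation sample size by your daily traffic per variation to estimate how many days the test must run before you are allowed to stop it.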
Now, assuming you’ve stopped your test at the correct time, we can move on to segmentation. Segmentation and personalization are hot topics in marketing right now, and more and more tools enable segmentation and personalization.
There are 2 main problems with post-test segmentation, however, that will impact the statistical validity of your segments (when done incorrectly).
The sample size of your segments is too small. You stopped the test when you reached the calculated sample size, but at a segment level the sample size is likely too small and the lift between segments has no statistical validity.
The multiple comparison problem. The more segments you compare, the greater the likelihood that you’ll get a false positive among those tests. With a 95% confidence level, you’re likely to get a false positive every 20 post-test segments you look at.
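That "false positive every 20 post-test segments" figure falls straight out of the math. A quick sketch, assuming for simplicity that the segment comparisons are independent:

```python
# Chance of at least one false positive when inspecting k post-test
# segments at a 95% confidence level (alpha = 0.05), assuming the
# comparisons are independent.
alpha = 0.05
for k in (1, 5, 10, 20):
    p_any = 1 - (1 - alpha) ** k
    print(f"{k:>2} segments -> {p_any:.0%} chance of a spurious 'winner'")

# At 20 segments the probability is about 64%: better than a coin flip
# that at least one segment "wins" by pure noise.
```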
There are different ways to prevent these two issues, but the easiest and most accurate strategy is to create targeted tests (rather than breaking down results per segment post-test).
That said, I don’t advocate against post-test segmentation; quite the opposite. In fact, looking at too much aggregate data can be misleading. (Simpson’s Paradox strikes back.)
The Wikipedia definition for Simpson’s Paradox provides a real-life example from a medical study comparing the success rates of two treatments for kidney stones.
The table below shows the success rates and treatment counts for both small and large kidney stones.
The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on large stones, yet treatment B is more effective when considering both sizes at the same time.
In the context of an A/B test, this would look something like this:
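The comparison table from the original post isn’t reproduced here, so the following sketch uses illustrative numbers (shaped like the kidney-stone study’s) to show how a variation can win in every segment yet lose on aggregate:

```python
# Simpson's Paradox in an A/B test. Numbers are illustrative:
# each tuple is (conversions, visitors) for that segment.
data = {
    "Variation A": {"mobile": (81, 87),   "desktop": (192, 263)},
    "Variation B": {"mobile": (234, 270), "desktop": (55, 80)},
}

for name, segments in data.items():
    per_segment = {s: c / n for s, (c, n) in segments.items()}
    overall = (sum(c for c, _ in segments.values())
               / sum(n for _, n in segments.values()))
    print(name, {s: f"{r:.1%}" for s, r in per_segment.items()},
          f"overall {overall:.1%}")

# Variation A wins both segments (93.1% vs 86.7% on mobile, 73.0% vs
# 68.8% on desktop), yet Variation B wins on aggregate (82.6% vs 78.0%),
# because B's traffic skews toward the higher-converting mobile segment.
```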
Simpson’s Paradox surfaces when sampling is not uniform—that is, when the sample sizes of your segments differ. There are a few things you can do to prevent getting lost in and misled by this paradox.
First, you can prevent this problem from happening altogether by using stratified sampling, which is the process of dividing members of the population into homogeneous and mutually exclusive subgroups before sampling. However, most tools don’t offer this option.
If you are already in a situation where you have to decide whether to act on aggregate data or on segment data, Georgi Georgiev recommends you look at the story behind the numbers, rather than at the numbers themselves.
“My recommendation in the specific example [illustrated in the table above] is to refrain from making a decision with the data in the table. Instead, we should consider looking at each traffic source/landing page couple from a qualitative standpoint first. Based on the nature of each traffic source (one-time, seasonal, stable) we might reach a different final decision. For example, we may consider retaining both landing pages, but for different sources.
In order to do that in a data-driven manner, we should treat each source/page couple as a separate test variation and perform some additional testing until we reach the desired statistically significant result for each pair (currently we do not have significant results pair-wise).”
In a nutshell, post-test segmentation can be complicated to get right, but when you do, it will unveil insights that your aggregate data can’t. Remember, you will have to validate the data for each segment in a separate follow-up test.
The execution of an experiment is the most important part of a successful optimization strategy. If your tests are not executed properly, your results will be invalid and you will be relying on misleading data.
It is always tempting to showcase good results. Results are often the most important factor when your boss is evaluating the success of your conversion optimization department or agency.
But results aren’t always trustworthy. Too often, the numbers you see in case studies lack valid statistical inferences: either they rely too heavily on an A/B testing tool’s unreliable stats engine, and/or they haven’t addressed the common pitfalls outlined in this post.
Use case studies as a source of inspiration, but make sure that you are executing your tests properly by doing the following:
If your A/B testing tool doesn’t adjust for the multiple comparison problem, make sure to correct your significance level for tests with more than one variation
Don’t change your experiment settings mid-experiment
Don’t use statistical significance as an indicator of when to stop a test, and make sure to calculate the sample size you need to reach before calling a test complete
Finally, keep segmenting your data post-test. But make sure you are not falling into the multiple comparison trap and are comparing segments that are significant and have a big enough sample size
Support for responsive images was added to WordPress core in version 4.4 to address the use case for viewport-based image selection, where the browser requests the image size that best fits the layout for its particular viewport.
Images that are inserted within the text of a post automatically get the responsive treatment, while images that are handled by the theme or plugins — like featured images and image galleries — can be coded by developers using the new responsive image functions and filters. With a few additions, WordPress websites can accommodate another responsive image use case known as art direction. Art direction gives us the ability to design with images whose crop or composition changes at certain breakpoints.
You believe that a more customized user experience will lead to more orders, demo requests, phone calls etc. So, you have structures in place to deliver appropriate messages to your different audiences, each with distinct needs and expectations.
But I must ask, how are you segmenting your visitors?
You might be grouping them by device, by traffic source, by demographic data.
And these buckets are all viable:
Your desktop visitors may behave differently than your mobile visitors
Visitors coming from a Facebook ad may respond better to social proof triggers than those coming from organic search
Older visitors may browse your products differently than younger visitors
But the ultimate goal of segmentation, like conversion optimization, is to increase conversions. With that in mind, this post is all about that one segment you probably aren’t looking at: converters versus non-converters.
To clarify, your converter segment is not necessarily the same thing as your repeat-customer or Loyalty segment. Your converter segment includes anyone who converts, whether or not they’ve converted before.
Rather than focusing on different general visitor segments, you should turn your attention to the behaviors that differentiate visitors who convert from visitors who don’t.
When you focus on general visitor segments, you’re working from the top of the funnel to the bottom. Why not work from the bottom of the funnel, up? After all, that’s where the money is!
Correlation vs. Causation
First things first: when you’re looking at differences between converters and non-converters on your site, you must be wary of correlation versus causation.
It’s almost impossible to know whether converters are behaving in a distinct way because they’re already motivated to buy (correlation) or because the elements on the page have enabled those distinct behaviors (causation).
For example, does a converter browse more products than a non-converter because they’re already motivated to buy before arriving on-site? Or does an on-site UI that emphasizes browsability encourage converters to browse (and therefore convert)?
It’s similar to the search bar quandary: typically, visitors who search convert at a higher rate. But do they convert because they search (causation) or do the search because they’re already more motivated to buy (correlation)?
It’s a bit of a chicken-or-egg situation.
Fortunately, at WiderFunnel, we’re able to test on many retailers’ websites and take note of certain patterns. In multiple instances with different clients, we have observed clear and drastic differences in key user behavior metrics between visitors who convert and visitors who don’t convert.
These differences paint a picture of how your visitors shop. You can use this information to improve your UX and add features that’ll help your general visitors behave more like converters than non-converters. The hope is that encouraging non-converters to mimic the behavior of converters will lead to them actually becoming converters.
Moral of the story: If you observe impactful differences between converters and non-converters on your site, you should create a hypothesis that targets these differences.
WiderFunnel Optimization Strategist, Nick So, recently ran a test that did just that.
Let’s buy some shoes
One of our biggest clients is a global shoe retailer. Over the past 6 months, Nick noticed some patterns in their analytics:
A high percentage of visitors who convert (around 60%) are returning visitors
Converters visited 186% more pages per session on average and spent more time on page per session than non-converters
Meaning, the majority of converters on this site have already been to the site at least once before and they seem to spend much more time browsing than their non-converting counterparts.
It’s common sense that visitors who convert behave differently than those who don’t. But it wasn’t until we pulled the report and saw how big the difference was in their shopping behavior that we really thought to go down this path.
In previous testing, Nick had also observed that visitors to this site are responsive to features that increase the browsability of multiple products. He’d noticed the same sensitivity with some of our other retailer clients, where features that made it easier to compare products helped conversions.
We decided to run with this data. Our hypothesis was based on the idea that visitors who convert are most likely returning visitors; therefore, pointing them toward products they’ve already viewed will guide them back into the funnel.
The hypothesis: Increasing the browsability of the site by displaying recently viewed products to increase relevance for the visitor will encourage higher engagement and increased return visits, which will increase conversions.
Nick and the team tested a single variation against the Control homepage. The Control featured a “Recommended Products” section just below the hero section, displaying four of the client’s most popular product categories.
In our variation, we replaced this with a “Your Recently Viewed Products” section. We wanted to target those visitors who were returning to the site, presumably to continue in the purchasing process. The products displayed in this section were unique to each returning visitor.
Our variation won, consistently outperforming the Control during this test. This client saw a 6.9% increase in order completions.
Bottom to top
When you’re segmenting your audience, don’t forget about the segment that floats at the bottom of the funnel. Instead of identifying the differences that characterize visitors coming to your site, why not work backwards?
Look at the behavioral differences that distinguish converters from non-converters and test ways to help non-converters mimic the behaviors of converters.
Have you noticed drastic behavioral differences between your visitors who convert and those who don’t convert? Do you tap into this particular segment when you plan tests? Tell us all about it in the comments!