
How to do server-side testing for single page app optimization


Gettin’ technical.

We talk a lot about marketing strategy on this blog. But today, we are getting technical.

In this post, I team up with WiderFunnel front-end developer, Thomas Davis, to cover the basics of server-side testing from a web development perspective.

The alternative to server-side testing is client-side testing, which has arguably been the dominant testing method for many marketing teams due to its ease and speed.

But modern web applications are becoming more dynamic and technically complex, and testing within them is becoming more complex, too.

Server-side testing is a solution to this increased complexity. It also allows you to test much deeper. Rather than being limited to testing images or buttons on your website, you can test algorithms, architectures, and re-brands.

Simply put: If you want to test on an application, you should consider server-side testing.

Let’s dig in!

Note: Server-side testing is a tactic that is linked to single page applications (SPAs). Throughout this post, I will refer to web pages and web content within the context of a SPA. Applications such as Facebook, Airbnb, Slack, BBC, Codecademy, eBay, and Instagram are SPAs.


Defining server-side and client-side rendering

In web development terms, “server-side” refers to “occurring on the server side of a client-server system.”

The client refers to the browser, and client-side rendering occurs when:

  1. A user requests a web page,
  2. The server finds the page and sends it to the user’s browser,
  3. The page is rendered on the user’s browser, and any scripts run during or after the page is displayed.
A basic representation of server-client communication.

The server is where the web page and other content live. With server-side rendering, the requested web page is sent to the user’s browser in final form:

  1. A user requests a web page,
  2. The server interprets the script in the page, and creates or changes the page content to suit the situation,
  3. The page is sent to the user in final form and then cannot be changed using server-side scripting.

To talk about server-side rendering, we also have to talk a little bit about JavaScript. JavaScript is a scripting language that adds functionality to web pages, such as a drop-down menu or an image carousel.

Traditionally, JavaScript has been executed on the client side, within the user’s browser. However, with the emergence of Node.js*, JavaScript can also be run on the server side.

*Node.js is an open-source, cross-platform JavaScript runtime environment, used to execute JavaScript code server-side. It uses the Chrome V8 JavaScript engine.

In layman’s (ish) terms:

When you visit a SPA web application, the content you are seeing is either being rendered in your browser (client-side), or on the server (server-side).

If the content is rendered client-side, JavaScript builds the application HTML content within the browser, and requests any missing data from the server to fill in the blanks.

Basically, the page is incomplete upon arrival, and is completed within the browser.

If the content is being rendered server-side, your browser receives the application HTML, pre-built by the server. It doesn’t have to fill in any blanks.
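To make the contrast concrete, here is a minimal sketch (all names and markup are hypothetical) of the same page under both approaches:

```javascript
// Server-side rendering: the server assembles the final HTML from
// data before responding, so the browser has nothing to fill in.
function renderServerSide(product) {
  return '<div id="app"><h1>' + product.name + '</h1>' +
         '<p>$' + product.price + '</p></div>';
}

// Client-side rendering: the server sends an empty shell, and
// JavaScript in the browser fetches the data and builds the same
// markup itself after the page arrives.
function clientShell() {
  return '<div id="app"></div><script src="app.js"></script>';
}
```

With the first function, the browser receives complete markup on the initial request; with the second, nothing meaningful renders until app.js has downloaded, executed, and fetched its data.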

Why do SPAs use server-side rendering?

There are benefits to both client-side rendering and server-side rendering, but render performance and page load time are two huge pros for the server side.

(A 1 second delay in page load time can result in a 7% reduction in conversions, according to Kissmetrics.)

Server-side rendering also enables search engine crawlers to find web content, improving SEO. And because social crawlers (like those used by Facebook) do not evaluate JavaScript, server-side rendering is beneficial for social sharing, too.

With client-side rendering, the user’s browser must download all of the application JavaScript, and wait for a response from the server with all of the application data. Then, it has to build the application, and finally, show the complete HTML content to the user.

All of which is to say: with a complex application, client-side rendering can lead to sloooow initial load times. And, because client-side rendering relies on each individual user’s browser, the developer only has so much control over load time.

Which explains why some developers are choosing to render their SPAs on the server side.

But, server-side rendering can disrupt your testing efforts, if you are using a framework like Angular or React.js. (And the majority of SPAs use these frameworks).

The disruption occurs because the version of your application that exists on the server becomes out of sync with the changes being made by your test scripts on the browser.

NOTE: If your web application uses Angular, React, or a similar framework, you may have already run into client-side testing obstacles. For more on how to overcome these obstacles, and successfully test on AngularJS apps, read this blog post.


Testing on the server side vs. the client side

Client-side testing involves making changes (the variation) within the browser by injecting JavaScript after the original page has already loaded.

The original page loads, the content is hidden, the necessary elements are changed in the background, and the ‘new’ version is shown to the user post-change. (Because the page is hidden while these changes are being made, the user is none the wiser.)

As I mentioned earlier, the advantages of client-side testing are ease and speed. With a client-side testing tool like VWO, a marketer can set up and execute a simple test using a WYSIWYG editor without involving a developer.

But for complex applications, client-side testing may not be the best option: Layering more JavaScript on top of an already-bulky application means even slower load time, and an even more cumbersome user experience.

A Quick Hack

There is a workaround if you are determined to do client-side testing on a SPA. Web developers can take advantage of features like Optimizely’s conditional activation mode to make sure that testing scripts are only executed when the application reaches a desired state.

However, this can be difficult, as developers will have to take many variables into account, like location changes performed by the $routeProvider, or triggering interaction-based goals.

To avoid flicker, you may need to hide content until the front-end application has initialized in the browser, voiding the performance benefits of using server-side rendering in the first place.

Activation Mode waits until the framework has loaded before executing your test.



When you do server-side testing, there are no modifications being made at the browser level. Rather, the parameters of the experiment variation (‘User 1 sees Variation A’) are determined at the server route level, and hooked straight into the JavaScript application through a service provider.

Here is an example where we are testing a pricing change:

“Ok, so, if I want to do server-side testing, do I have to involve my web development team?”

Yep.

But, this means that testing gets folded into your development team’s workflow. And, it means that it will be easier to integrate winning variations into your code base in the end.

If yours is a SPA, server-side testing may be the better choice, despite the work involved. Not only does server-side testing embed testing into your development workflow, it also broadens the scope of what you can actually test.

Rather than being limited to testing page elements, you can begin testing core components of your application’s usability like search algorithms and pricing changes.

A server-side test example!

For web developers who want to do server-side testing on a SPA, Tom has put together a basic example using the Optimizely SDK. This example is an illustration, and is not functional.

In it, we are running a simple experiment that changes the color of a button. The example is built using Angular Universal and Express.js. A global service provider is used to fetch the user’s variation from the Optimizely SDK.

Here, we have simply hard-coded the user ID. However, Optimizely requires that each user have a unique ID. Therefore, you may want to use the user ID that already exists in your database, or store a cookie through Express’s cookie middleware.

Are you currently doing server-side testing?

Or, are you client-side testing on a SPA? What challenges (if any) have you faced? How have you handled them? Do you have any specific questions? Let us know in the comments!

The post How to do server-side testing for single page app optimization appeared first on WiderFunnel Conversion Optimization.


11 ways to stop FOOC’ing your A/B tests

Work long enough in Conversion Optimization and you will hear this phrase:

“We tried [insert popular a/b testing tool], but there was a latency issue so we stopped testing.”

In 95% of cases, by “latency issue” they’re referring to the noticeable flicker or flash of the original version of a website before test changes are seen. It even has its own acronym: FOOC (Flash of Original Content)*. Here’s a beautiful example I created on the WiderFunnel home page:

An example of FOOC I created. This is not how you want to be A/B Testing.

Why does FOOC matter?

According to a team of MIT neuroscientists, the human brain can identify images in as little as 13 milliseconds.

FOOC can take longer — from 100 ms up to a whole second. Your website visitors will notice.

Is that always a bad thing? No, as David Hauser of Grasshopper discovered:

“Our A/B testing tool had a bug that delayed the $25 activation fee from being crossed out until a few seconds after the page loaded. This error ended up creating a much larger uplift than having it already crossed out on load, when the bug was fixed. The result now is that the activation fee shows, and then is crossed out after a few seconds.”

Sometimes, FOOC is a good thing as seen on the Grasshopper pricing page. Source: Unbounce

That insight came from a lucky side-effect of the FOOC error, but most times it’s not a good thing.

Whether good or bad, you need to get a handle on your FOOC. It hinders your ability to run controlled experiments. If your variation content is appearing with a flicker every time your page loads, how do you know what effect that’s having on your results?

You don’t, unless you isolate the flicker.

Why does FOOC happen in the first place?

All client-side A/B testing tools are inherently susceptible to FOOC.

“Client-side” means that the changes in your test are being applied as your web page loads, through a tag on your website. This is how the most popular tools on the market do it.

A diagram showing how Optimizely’s snippet works. Source: Optimizely

The client-side nature of these tools is what makes them so easy to get started with: a solo marketer has the ability to launch experiments without the need for a development team. It’s what makes the WYSIWYG Visual Editor a reality.

But that selling point comes at a price. For page changes to occur, a couple of things must happen: the tool’s snippet must load, and the page elements being modified must load. If either takes too long, your A/B test is in the gutter.

Luckily for us all, there are ways around the challenges of client-side tools.

Follow the eleven tips below, and even if you’re a noob jQuery dabbler, you’ll be able to launch FOOC-free experiments.

1. Speed up your website

Besides being one of the proven ways to increase conversion rates, speeding up your website is a first step in helping prevent flickering or long waits during A/B tests. My favorite tool for this has always been WebpageTest.org. Simple, free, effective. Have your front-end development team look into some of the issues and track performance over time.

Continue to check your site over time, as small changes can have a big impact on speed.

2. Put your snippet where it needs to go

I’ve seen snippets in footers and I’ve seen them served via Google Tag Manager. Don’t do either. Optimizely’s snippet, for example, needs to go as high up as possible in the <head>.

Whenever possible, move your snippet up to the top of your <head>, assuming you have trimmed jQuery in the snippet.

The drawback is that, yes, Optimizely will be adding a few milliseconds of load time to your pages when loaded for the first time. We haven’t found it to be an issue unless the remaining suggestions aren’t followed.

3. Reduce the size of your snippet

Archive any paused experiments, as well as draft experiments whose preview links you don’t need, and load only a trimmed version of jQuery (this is especially important when loading your snippet at the top of your <head> tag). This will reduce the size of the snippet being loaded on your website, mostly affecting first-time visitors.


Archive those experiments taking up space in your snippet.

4. Roll up hotfixes

If you’re using your testing tool as a way to make fixes to your website, roll those changes up into project code rather than running them in a separate experiment. If you’re one of many who don’t have access to project-level code, then implement that code along with your current experiment.

Put “hotfix” code into your project code rather than in an individual experiment.

5. Order your variation code to match your website code

If you’re changing something at the top of your web page, position that change at the top of your variation code. jQuery waits until it finds the element on the page to make the change. If that element comes earlier rather than later, jQuery will move on to the next line sooner.

This way the content at the top of your website gets changed as quickly as possible.


If using jQuery in your variation code, order it so that you’re making changes in the same order that elements load on your website.

6. Consolidate your variation code

If you want to up the size of your headline and change the color, do so in one swift line. If you decide later that you want to reduce the size of the headline, update your existing code rather than adding another line of code to make the reduction in size.

Group changes into one and remove unused changes.

In conjunction with consolidating code, when making changes via the Visual Editor, keep the scope of your changes to the most specific HTML element possible. Rather than selecting “#mainbody” to modify the attributes of a sub-element, select that sub-element to begin with.

7. Temporarily hide the <body>

No matter how fast your website is, if your original content is loading before your variation code has time to run, you will experience FOOC. To get around this, you’ll need to quickly hide, then show the <body> of your page.

In your experiment-level JavaScript, force Optimizely to run the following:

Hide the body of the page as quickly as possible by forcing it at the experiment level.

This hides the <body> as fast as possible, assuming you’ve placed the snippet at the top of the <head>. Then, in your variation code, put a fail-safe (say 3 seconds) to show the body again if something goes wrong.

Insert your variation code after that.

Finally, make the body visible again. Note the 500 millisecond timer on this one. Keep it as low as possible, just enough to avoid a flicker. After all, FOOC is still better than a really slow-loading website.

Be sure to customize your timers to make sense for your website and the test you’re running.

This gets rid of any flashing of original content (assuming your snippet is not loading asynchronously or too late on the page). The potential drawback is a perceived slowness of the website on first load. That’s why you set a timer to make sure the body is shown before a set threshold.
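Putting the steps of this tip together, here is a sketch of the hide/show pattern. It is factored into small functions, with the document and timer passed in, purely to make each step explicit; in practice the same lines go straight into your experiment-level and variation-level JavaScript, and the timings are examples to tune for your site.

```javascript
// Step 1 (experiment-level JS): hide the page before anything paints.
function hideBody(doc) {
  doc.documentElement.style.visibility = 'hidden';
}

// Step 2 (variation code): apply your changes, then reveal the page
// after a short delay. Keep the delay as low as possible.
function showBody(doc, delayMs, setTimer) {
  setTimer(function () {
    doc.documentElement.style.visibility = 'visible';
  }, delayMs);
}

// Fail-safe: always reveal after ~3 seconds, even if the variation
// code throws, so visitors never get stuck on a blank page.
function addFailSafe(doc, setTimer) {
  showBody(doc, 3000, setTimer);
}
```

In the browser you would call these with the real document and window.setTimeout.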

8. Learn front-end development fundamentals

For those of us who never made it past the “Hello World” lesson in JavaScript 101, it’s a good idea to round out your front-end development knowledge. You don’t need to become a coder, you just need to be able to understand how it works.

It takes no more than a weekend to learn the basics of HTML, CSS, JavaScript and jQuery — the building blocks of DOM manipulation. Head to a free (and fun!) resource like Codecademy to get started.

Brush up on your front-end development.


Starting here, most of us will need a front-end developer’s help (I’ll admit, I got help from our dev team for this part). If that’s not an option, don’t worry: with the tips above, you should be able to launch FOOC-free tests. Like this article so far? Let me know! 


Now on to steps 9 through 11:

9. Use CSS as much as possible

By default, Optimizely and VWO visual editors produce your edits via jQuery, even for simple things like changing colors. CSS is a faster way to go, whenever feasible.

Instead of letting the editor generate a stack of jQuery style changes, add a class to the element with one line in the Edit Code window, and define the actual styles in the Experiment CSS (or an external stylesheet).
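For instance (class and selector names hypothetical), a single line in the Edit Code window such as $('.cta').addClass('v1-cta'); tags the element, while the styling lives in the Experiment CSS:

```css
/* Experiment CSS: applied as the page renders, so no jQuery has to
   run before the change becomes visible. */
.v1-cta {
  background: red;
  font-size: 18px;
}
```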
10. Cache your selectors

The DOM is slow. To avoid making redundant calls to it, store selectors you’ll be re-using as jQuery objects. In the example below, 3 changes are being made to the same selector.

Cache selectors you’ll be re-using to avoid going back into the DOM.

11. Code your variations in raw JavaScript

A/B testing visual editors spit jQuery into the code window. jQuery was created to overcome cross-browser issues and save development time. It’s a library of JavaScript shortcuts.

To change the background color of an element in jQuery it goes something like this:

$('.cta').css('background', 'red');

Now the same thing in raw JavaScript:

document.querySelectorAll(".cta")[0].style.backgroundColor = "red";

While the development time savings is significant, it comes at a cost. As with any JavaScript library, jQuery code runs slower than raw JavaScript.

How much slower? It depends on what you’re doing. Without much digging, I found a non-scientific test in which raw JavaScript ran up to 60x faster than jQuery. That’s not conclusive evidence, but it points to potential speed gains.

Coding some or all of your variations in raw JavaScript also means that your dev team will have to put in extra time to produce your A/B tests. You’ll want to strike a balance between improving code efficiency and productivity. For more on the topic of JavaScript versus jQuery, I urge you to check out this very informative thread on StackExchange.

If you’re using one of the newer schmancy JavaScript frameworks, there are options for writing variation code. Here are some resources to help:

Watch the accompanying Opticon presentation here.

  • Optimizely has published quite a bit on how to deal with sites using Angular or other single-page-app-type situations.

Are those all of the ways to reduce the chances of FOOC? Certainly not. Feel free to add suggestions or questions in the comments below. We can make this an AMA of sorts, regarding FOOC.

FAQs about FOOC

  • Can I use asynchronous loading to avoid FOOC? You can try, but it probably won’t work. Asynchronous loading addresses a separate issue: helping with overall site speed, not FOOC. Given the speed of modern CDNs, snippets loading synchronously should be the least of your concerns. But, if you’re like our neighbors here in Vancouver, PlentyOfFish, with a bajillion users hitting their site at the same time, you may want to be considerate of what and how things load on your pages.
  • Can I use a server-side / proxy testing tool to avoid FOOC? You could, but say good-bye to most of the benefits of a client-side tool.
  • I noticed a major slow down when I added XYZ A/B testing tool on my website. Should I switch to a more popular tool like Optimizely or VWO? Perhaps. There are some tools out there that don’t use distributed CDNs and that include jQuery by default in their snippet. Yes, some will slow down your website.

PS: If you’re an Optimizely power-user, consider checking out a project by WiderFunnel Labs, Liftmap, a great way to increase your A/B testing velocity by managing your CRO workflow.

* As opposed to Flash of Unstyled Content, which refers to a separate problem, usually unrelated to A/B testing.



Goodbye, t-test: new stats models for A/B testing boost accuracy, effectiveness

The t-test has served as a workhorse for conversion optimization teams for many years — though there’s always been confusion about what the results really mean.

This statistical method that has driven A/B testing analyses has served us well, but it’s clear that the t-test is now outdated. With websites receiving and harvesting data constantly, the t-test is simply too slow and isn’t able to give teams the updated results they need to make fast business decisions.

Plus, typical t-test results reports have often been misleading, as you can see in the example below.
(Note: Chris Goward shows how to interpret t-test results here.)

This is what results used to look like with the classic t-test. A few minutes into an A/B test and we’ve already got a “winner”… right. Optimizely took the first step earlier this year in revolutionizing how A/B test results are calculated.

Fortunately, there are now solutions. Optimizely and VWO have both developed new models for analyzing A/B testing and conversion results that can help businesses keep up with the flood of data they get from their websites. Here’s what you need to know.

The background

Back in 2007, Google launched its free Website Optimizer tool, which made A/B testing for websites much more accessible. It used the t-test, which looks at differences between means relative to spread or variability, to determine which variation was the winner or loser.

The launch inspired other optimization platforms. Optimizely and VWO emerged as the top two in a digital arms race. Features such as WYSIWYG editors made it easy to create website variations, while marketing software integration and mobile capabilities made the tools even more useful.
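For reference, the statistic behind that comparison, in its unequal-variance (Welch) form, is:

```latex
t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}
```

where each x̄ is a variation’s observed conversion rate, s² its sample variance, and n its number of visitors. The larger |t| grows relative to what chance alone would produce at the planned sample size, the stronger the evidence that the difference is real.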

Why the change?

There are several reasons the t-test became outdated.

  • It simply isn’t made for the job. “The original test was meant to be run in the field with an end-point to collecting data,” says Optimizely’s in-house statistician, Leo Pekelis. “You set up your hypothesis, you gather data, you see if your data has evidence, then you report your results. So it’s a very compartmentalized and linear fashion of how to deal with test results.” The error rates calculated are for that procedure, not the way people collect A/B data now.
  • The t-test, as used traditionally in A/B testing, is plagued by false discoveries, especially when results are checked continuously (which is, of course, what conversion optimization teams do). The problem is that an ongoing t-test may appear conclusive before the results should actually be read.
  • The number of tests run per month can reach into the hundreds on high-powered conversion optimization teams. The t-test has the potential to produce too many false positives, even with the right sample size of visitors. Meanwhile, major business decisions are waiting on the results.

What are the new A/B testing options?

New solutions from Optimizely and VWO share a goal: put the t-test out to pasture, in favor of more useful and immediate statistical models that suit the demands of digital A/B testing.

Optimizely

Optimizely’s Stats Engine uses sequential testing. It’s better suited for A/B testing because it evaluates results as data comes in, instead of “fixed-horizon testing” with a fixed sample size. Doing so brings a Bayesian view that creates a likelihood ratio, a model for how the variation is expected to perform.

Optimizely’s Stats Engine.

With Optimizely, you can look at your results at any time, and have more confidence in their reliability. You’re far less likely to see a variation falsely declared a winner. Optimizely says this is possible without sacrificing speed in most cases, especially when the actual lift in a variation is higher than what you would have set as your minimum detectable effect.
“When it comes to speed vs. accuracy in statistics, there’s no such thing as a free lunch,” says Pekelis, “We built Stats Engine to be as quick as possible while maintaining the integrity of our results.”
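A simplified sketch of the idea (Optimizely’s published approach is a mixture sequential probability ratio test; this is the basic form, not their exact implementation): after every new observation, update a likelihood ratio comparing “the variation has lift θ₁” against “no lift,” and stop as soon as it clears a threshold tied to your significance level α:

```latex
\Lambda_n = \frac{L(x_1, \ldots, x_n \mid \theta = \theta_1)}
                 {L(x_1, \ldots, x_n \mid \theta = 0)},
\qquad \text{declare significance when } \Lambda_n \geq \frac{1}{\alpha}
```

Because the threshold accounts for continuous monitoring, peeking at the dashboard at any time doesn’t inflate the false positive rate.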

Visual Website Optimizer


VWO’s solution, SmartStats, is a little newer. It has been in beta since August 2015 and just launched this month. This option goes fully Bayesian: It gives a range, rather than a specific conversion rate, and the range becomes tighter as time goes on. Its main goal is to identify the best performer as fast as possible, rather than to pin down the exact lift of the winning variation. “What the client wants is an exact number,” says Chris Stucchio, VWO’s in-house statistician. “But I’m a statistician, so I’m just going to be honest and say you cannot have an exact number — there isn’t one. All you can get is a range.” If you wait for an exact number, you sacrifice conversions.
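To make that “range” concrete, here is the standard Bayesian model for a conversion rate (a sketch of the general approach, not necessarily VWO’s exact implementation). Starting from a Beta(α, β) prior and observing c conversions among n visitors, the posterior for the true rate p is:

```latex
p \mid \text{data} \sim \mathrm{Beta}(\alpha + c,\; \beta + n - c)
```

The reported range is a credible interval from this distribution, for example its 2.5th to 97.5th percentiles, and it tightens as n grows.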

In addition, SmartStats shows how much you could stand to lose if the variation performs at its minimum threshold conversion rate. It also has built-in trigger alerts to reduce methodological errors: The solution notifies you, for example, if you change your audience targeting mid-experiment or fail to run your experiment for a whole number of weeks.

Campaign warnings and campaign pause

What’s the difference between SmartStats and Stats Engine?

I was still curious: what would results look like if you ran the exact same experiment in both VWO and Optimizely at the same time? Which would give you a report faster and which would be more precise about the result?

According to Stucchio, VWO would likely give you the first result that says B is not worse than A and is probably better. “Some time after this, Stats Engine will say, ‘Yeah, you should probably deploy B.’ In the meantime, if you didn’t stop the test in VWO and you looked in our tool, you would discover that our credible intervals, instead of saying 0-10 percent, might say 4-7 percent.”

On the other hand, Pekelis points to the risk of average error control. “A fully Bayesian method will tend to make statements like ‘B is not worse than A’ sooner than Stats Engine, but this is problematic in one very important case: when B is actually worse than A,” he says.


“The rate of making a wrong call is controlled for Bayesian methods on average, meaning that for some experiments it will be higher and some it will be lower. The problem with this is customers who have more of the higher error experiments will see more mistakes than anticipated. Making a call as early as possible exacerbates this, and that’s the cost: some customers will be exposed to potentially a lot more errors. Are there cases where average error control is a good idea? Sure. But our philosophy is to explore solutions starting from a position of rigorous accuracy and clearly communicate any tradeoffs.”

Pekelis suggests that the best way to speed up a test in Stats Engine is to lower the significance threshold, and accept a higher error rate. “We felt a known cost was better than an unknown one,” he says.

So what it comes down to is that with either tool, you’re going to be able to get more reliable results faster than before, but you’ll need to decide how much risk you’re willing to take in order to expedite your results. As always, you need to be scientific when it comes to your methodology. These tools will not make up for poor experiment design.

For those of you who’ve been calculating required sample sizes beforehand, your tests will now have the chance to end faster. If your test’s actual lift beats the minimum detectable effect you would have plugged into a classic t-test sample size calculator, you’ll be able to move on to your next test a lot sooner.

What also deserves mention is how much easier this evolution makes communicating results to stakeholders or clients. Before, results could look definitive prematurely, with stakeholders asking why action wasn’t being taken; now, results only look definitive when they actually are. Imagine all of the painful conversations you can now avoid!

Here’s a chart comparing the slight differences in how the two companies approach the solution.

Stats model comparison table

In the end, both are great improvements over the old t-test that help optimization champions do better work.


Keep reading. Download our white paper, Developing a successful and scalable conversion optimization strategy.

About WiderFunnel
WiderFunnel creates profitable ‘A-ha!’ moments for clients. Our team of optimization experts works together with a singular focus: conversion optimization of our clients’ customer touchpoints through insightful A/B testing. We don’t just consult and give advice — we test every recommendation to prove its value and gain tested insights. Contact us to learn more.

