Tag Archives: given

Thumbnail

Building A Central Logging Service In-House




Building A Central Logging Service In-House

Akhil Labudubariki



We all know how important debugging is for improving application performance and features. BrowserStack runs one million sessions a day on a highly distributed application stack! Each involves several moving parts, as a client’s single session can span multiple components across several geographic regions.

Without the right framework and tools, the debugging process can be a nightmare. In our case, we needed a way to collect events happening during different stages of each process in order to get an in-depth understanding of everything taking place during a session. With our infrastructure, solving this problem became complicated as each component might have multiple events from their lifecycle of processing a request.

That’s why we developed our own in-house Central Logging Service tool (CLS) to record all important events logged during a session. These events help our developers identify conditions where something goes wrong in a session and helps keep track of certain key product metrics.

Debugging data ranges from simple things like API response latency to monitoring a user’s network health. In this article, we share our story of building our CLS tool which collects 70G of relevant chronological data per day from 100+ components reliably, at scale and with two M3.large EC2 instances.

The Decision To Build In-House

First, let’s consider why we built our CLS tool in-house rather than used an existing solution. Each of our sessions sends 15 events on average, from multiple components to the service – translating into approximately 15 million total events per day.

Our service needed the ability to store all this data. We sought a complete solution to support event storing, sending and querying across events. As we considered third-party solutions such as Amplitude and Keen, our evaluation metrics included cost, performance in handling high parallel requests and ease of adoption. Unfortunately, we could not find a fit that met all our requirements within budget – although benefits would have included saving time and minimizing alerts. While it would take additional effort, we decided to develop an in-house solution ourselves.


Building in-house


One of the biggest issues with building In-house is the amount of resources that we need to spend to maintain it. (Image credit: Source: Digiday)

Technical Details

In terms of architecting for our component, we outlined the following basic requirements:

  • Client Performance
    Does not impact the performance of the client/component sending the events.
  • Scale
    Able to handle a high number of requests in parallel.
  • Service performance
    Quick to process all events being sent to it.
  • Insight into data
    Each event logged needs to have some meta information to be able to uniquely identify the component or user, account or message and give more information to help the developer debug faster.
  • Queryable interface
    Developers can query all events for a particular session, helping to debug a particular session, build component health reports, or generate meaningful performance statistics of our systems.
  • Faster and easier adoption
    Easy integration with an existing or new component without burdening teams and taking up their resources.
  • Low maintenance
    We are a small engineering team, so we sought a solution to minimize alerts!

Building Our CLS Solution

Decision 1: Choosing An Interface To Expose

In developing CLS, we obviously didn’t want to lose any of our data, but we didn’t want component performance to take a hit either. Not to mention the additional factor of preventing existing components from becoming more complicated, since it would delay overall adoption and release. In determining our interface, we considered the following choices:

  1. Storing events in local Redis in each component, as a background processor pushes it to CLS. However, this requires a change in all components, along with an introduction of Redis for components which didn’t already contain it.
  2. A Publisher – Subscriber model, where Redis is closer to the CLS. As everyone publishes events, again we have the factor of components running across the globe. During the time of high-traffic, this would delay components. Further, this write could intermittently jump up to five seconds (due to the internet alone).
  3. Sending events over UDP, which offers a lesser impact on application performance. In this case data would be sent and forgotten, however, the disadvantage here would be data loss.

Interestingly, our data loss over UDP was less than 0.1 percent, which was an acceptable amount for us to consider building such a service. We were able to convince all teams that this amount of loss was worth the performance, and went ahead to leverage a UDP interface that listened to all events being sent.

While one result was a smaller impact on an application’s performance, we did face an issue as UDP traffic was not allowed from all networks, mostly from our users’ – causing us in some cases to receive no data at all. As a workaround, we supported logging events using HTTP requests. All events coming from the user’s side would be sent via HTTP, whereas all events being recorded from our components would be via UDP.

Decision 2: Tech Stack (Language, Framework & Storage)

We are a Ruby shop. However, we were uncertain if Ruby would be a better choice for our particular problem. Our service would have to handle a lot of incoming requests, as well as process a lot of writes. With the Global Interpreter lock, achieving multithreading or concurrency would be difficult in Ruby (please don’t take offense – we love Ruby!). So we needed a solution that would help us achieve this kind of concurrency.

We were also keen to evaluate a new language in our tech stack, and this project seemed perfect for experimenting with new things. That’s when we decided to give Golang a shot since it offered inbuilt support for concurrency and lightweight threads and go-routines. Each logged data point resembles a key-value pair where ‘key’ is the event and ‘value’ serves as its associated value.

But having a simple key and value is not enough to retrieve a session related data – there is more metadata to it. To address this, we decided any event needing to be logged would have a session ID along with its key and value. We also added extra fields like timestamp, user ID and the component logging the data, so that it became more easy to fetch and analyze data.

Now that we decided on our payload structure, we had to choose our datastore. We considered Elastic Search, but we also wanted to support update requests for keys. This would trigger the entire document to be re-indexed, which might affect the performance of our writes. MongoDB made more sense as a datastore since it would be easier to query all events based on any of the data fields that would be added. This was easy!

Decision 3: DB Size Is Huge And Query And Archiving Sucks!

In order to cut maintenance, our service would have to handle as many events as possible. Given the rate that BrowserStack releases features and products, we were certain the number of our events would increase at higher rates over time, meaning our service would have to continue to perform well. As space increases, reads and writes take more time – which could be a huge hit on the service’s performance.

The first solution we explored was moving logs from a certain period away from the database (in our case, we decided on 15 days). To do this, we created a different database for each day, allowing us to find logs older than a particular period without having to scan all written documents. Now we continually remove databases older than 15 days from Mongo, while of course keeping backups just in case.

The only leftover piece was a developer interface to query session-related data. Honestly, this was the easiest problem to solve. We provide an HTTP interface, where people can query for session related events in the corresponding database in the MongoDB, for any data having a particular session ID.

Architecture

Let’s talk about the internal components of the service, considering the following points:

  1. As previously discussed, we needed two interfaces – one listening over UDP and another listening over HTTP. So we built two servers, again one for each interface, to listen for events. As soon as an event arrives, we parse it to check whether it has the required fields – these are session ID, key, and value. If it does not, the data is dropped. Otherwise, the data is passed over a Go channel to another goroutine, whose sole responsibility is to write to the MongoDB.
  2. A possible concern here is writing to the MongoDB. If writes to the MongoDB are slower than the rate data is received, this creates a bottleneck. This, in turn, starves other incoming events and means dropped data. The server, therefore, should be fast in processing incoming logs and be ready to process ones upcoming. To address the issue, we split the server into two parts: the first receives all events and queues them up for the second, which processes and writes them into the MongoDB.
  3. For queuing we chose Redis. By dividing the entire component into these two pieces we reduced the server’s workload, giving it room to handle more logs.
  4. We wrote a small service using Sinatra server to handle all the work of querying MongoDB with given parameters. It returns an HTML/JSON response to developers when they need information on a particular session.

All these processes happily run on a single m3.large instance.


CLS v1


CLS v1: A representation of the system’s first architecture. All the components are running on one single machine.

Feature Requests

As our CLS tool saw more use over time, it needed more features. Below, we discuss these and how they were added.

Missing Metadata

Gradually as the number of components in BrowserStack increases, we’ve demanded more from CLS. For example, we needed the ability to log events from components lacking a session ID. Otherwise obtaining one would burden our infrastructure, in the form of affecting application performance and incurring traffic on our main servers.

We addressed this by enabling event logging using other keys, such as terminal and user IDs. Now whenever a session is created or updated, CLS is informed with the session ID, as well as the respective user and terminal IDs. It stores a map that can be retrieved by the process of writing to MongoDB. Whenever an event that contains either the user or terminal ID is retrieved, the session ID is added.

Handle Spamming (Code Issues In Other Components)

CLS also faced the usual difficulties with handling spam events. We often found deploys in components that generated a huge volume of requests sent to CLS. Other logs would suffer in the process, as the server became too busy to process these and important logs were dropped.

For the most part, most of the data being logged were via HTTP requests. To control them we enable rate limiting on nginx (using the limit_req_zone module), which blocks requests from any IP we found hitting requests more than a certain number in a small amount of time. Of course, we do leverage health reports on all blocked IPs and inform the responsible teams.

Scale v2

As our sessions per day increased, data being logged to CLS was also increasing. This affected the queries our developers were running daily, and soon the bottleneck we had was with the machine itself. Our setup consisted of two core machines running all of the above components, along with a bunch of scripts to query Mongo and keep track of key metrics for each product. Over time, data on the machine had increased heavily and scripts began to take a lot of CPU time. Even after trying to optimizing Mongo queries, we always came back to the same issues.

To solve this, we added another machine for running health report scripts and the interface to query these sessions. The process involved booting a new machine and setting up a slave of the Mongo running on the main machine. This has helped reduce the CPU spikes we saw every day caused by these scripts.


CLS v2


CLS v2: A representation of the current system’s architecture. Logs are written to the master machine and they are synced on the slave machine. Developer’s queries run on the slave machine.

Conclusion

Building a service for a task as simple as data logging can get complicated, as the amount of data increases. This article discusses the solutions we explored, along with challenges faced while solving this problem. We experimented with Golang to see how well it would fit with our ecosystem, and so far we have been satisfied. Our choice to create an internal service rather than paying for an external one has been wonderfully cost-efficient. We also didn’t have to scale our setup to another machine until much later – when the volume of our sessions increased. Of course, our choices in developing CLS were completely based on our requirements and priorities.

Today CLS handles up to 15 million events every day, constituting up to 70 GB of data. This data is being used to help us solve any issues our customers face during any session. We also use this data for other purposes. Given the insights each session’s data provides on different products and internal components, we’ve begun leveraging this data to keep track of each product. This is achieved by extracting the key metrics for all the important components.

All in all, we’ve seen great success in building our own CLS tool. If it makes sense for you, I recommend you consider doing the same!

Smashing Editorial
(rb, ra, il)


Jump to original:  

Building A Central Logging Service In-House

Thumbnail

How To Deliver A Successful UX Project In The Healthcare Sector




How To Deliver A Successful UX Project In The Healthcare Sector

Sven Jungmann & Karolin Neubauer



A mid-career UX researcher was hired to understand the everyday needs, perceptions, and concerns of patients in a hospital in Berlin, Germany. She used rigorous observation and interviewing methods just like she teaches them to design thinking students at a nearby university. She returned with a handful of actionable insights that our product team found useful, somewhat at least.

However, we were surprised that her recommendations gravitated towards convenience issues such as “Patients want to know the food menu” or “Users struggle to remember who their doctors are.” Entirely missing were reports of physical and psychological complaints. We would at least have expected sleeping problems: Given that 80% of working Germans don’t sleep well and nearly 10% even appear to have severe sleeping disorders (link in German), why did no one mention it?

“We only see what we know.”

— Johann Wolfgang von Goethe (1749-1832)

If you are a UX researcher about to embark on a project with hospitalized patients and you want to avoid missing out on deep concerns and problems of users, then maybe this article can help you strengthen your awareness for particular challenges of clinical UX.

It is difficult to get in touch with real patients and get permission from clinical staff to access the right people. We are fortunate to have access to a network of more than 100 hospitals in Germany thanks to our sister Helios Kliniken GmbH, which is Europe’s largest private hospital provider. Our experience with clinical UX research taught us the importance of stressing that we cannot assume that patients bring up relevant concerns by themselves.

We describe three reasons we think are important to improve the quality and quantity of your findings. Generally speaking, this article emphasizes the need for UX practitioners to be mindful of their participants’ emotional and physical state. But we also discuss how we think UX researchers should prepare for and conduct a research project in the healthcare sector.

The Three B’s That Complicate UX Research In Hospitals

We’ve been thinking much about user research in hospital settings recently. Our new company, smart Helios, a digital health development firm, is a spin-off of Helios Kliniken, Europe’s largest private hospital chain. To inform our lean software development, we thoroughly embrace iterative empathetic observation and end-user interviews at each step of a development cycle (i.e., ideation, prototyping, and testing).

We learned that qualitative research in a hospital setting brings its challenges. We think it is worth considering them, especially those we discuss here, called the three B’s: Biases, Barriers, and Background.

Here’s an overview:

  1. Biases
    Psychological mechanisms can affect our patients’ thinking and hence diminish the results of our findings.
  2. Barriers to trust
    Patients rarely share intimate needs with a non-medical interviewer easily. This can create blind spots in our research.
  3. Background
    Internal and external factors such as wealth, socio-economic status, sanitation, education, and access to healthcare can influence our health as well as the care patients receive. The quality of care also depends on the hospital’s infrastructure and staff experience with a particular disease or procedure. These differences make it challenging to generalize findings.

We also discuss remedies that can help overcome these challenges and distill more valuable insights. They include:

  • Conducting a thorough and well-prepared interview.
  • Including healthcare providers in the process.
  • Drawing on quantitative data to guide some of the qualitative user research.

Biases: We Are Fantastic At Lying To Ourselves, Whether Or Not We’re Patients

Psychologists call them cognitive biases: they negatively affect how accurately we perceive and remember events or feelings, and we possess impressive amounts of them.

For example, people remember information better if it is more recent or salient. If you ask a patient about the initial appointment with her oncologist (this is rarely good news), don’t be surprised if she remembers only a quarter of it. Hospitalized patients are typically overwhelmed by the unfamiliar situation they’re in and often under stress and that influences their memory.

Hence it makes a difference when you ask them in their patient journey. Even in one day, patients face different problems that affect what’s top of mind at the moment you observe or interview them:

  • In the morning, when the painkillers have worn off during sleep, regaining control over physical pain is all that matters.
  • During the day, worries about upcoming procedures might dominate their thoughts or they could be focused on their hunger while they’re not allowed to eat or drink ahead of a diagnostic test.
  • In the evening, they might be afraid that they won’t be able to update their relatives appropriately.
  • At night, some struggle to sleep because of the hospital noise or their worries.

Takeaways:

  • Try interviewing users at different times of the day and different moments of their journey and take note of how findings vary.
  • Always get an orientation about where your users stand within their patient journey. Explore what happened in the previous hours or days and what diagnostics or treatments lie ahead of them.
  • Beware that our psyches possess a plethora of mechanisms to limit rationality and prevent past events from entering our conscious mind. You cannot control them all, but it improves your research if you notice them.

Barriers To Trust: Who Is Asking Matters, Too

It’s not just about how and when; it also matters who is asking. Even if a patient agrees to speak with us, what she shares will highly depend on how much she trusts us. One of us observed as a clinician how often his patients need to build up trust, sometimes over days until they ‘confess’ certain concerns. That’s especially true when problems have a psychological component (e.g., sleeping disorders) or are stigmatized (e.g., certain infectious diseases).

The more knowledgeable you are about health problems, the better you can steer interviews towards relevant issues. If you do this empathetically, your interviewees might find it easier to speak about them. Don’t get frustrated, however, if they don’t. Some people need a lot of trust, and there’s rarely a shortcut to earning it. In these cases, there’s something else you can do: include subjects who are not your target users but still have crucial insights in your interviews.

Take the nurse, for example. She might know from her previous night shift how many patients had trouble sleeping and who might agree to talk about it. The doctors will know which crucial questions they get asked frequently. And the housekeeping staff can share stories about the patients’ hygiene concerns. Listen closely to them: many of the staffs’ pain points likely hint to patients’ pain points, too. The more observers you allow to shed light on one subject, the better your chances to understand your patients’ experiences and the broader your perception of the system will be.

Takeaways:

  • Try to interview people involved in your users’ care.
  • Ask the clinical staff for guidance on which patients to ask about specific problems.
  • Even if patients are your primary users, make sure to ask health care providers, such as doctors, nurses, or therapists about common patient needs.
  • Schedule repeated interviews with patients if possible to build the trust necessary for sharing critical concerns.

The ‘living’ patient journey map.


A large, printed and ‘living’ patient journey helps to constantly challenge and refine assumptions. (Large preview)

Background: Mind The Worlds In And Around The Patients

Different patients have different needs. This is obvious, and part of the reason why UX researchers develop personas, conduct semi-structured interviews, focus group discussions, and observe subjects to explore the multiple realities of our participants. But there’s still a problem we can’t dismiss. Even within our hospitals inside Germany, people’s realities can differ greatly depending on their income, education, insurance, place of residence, etc.

What you discover in one hospital or region might not apply elsewhere. This can be a problem if you want to deploy your products at scale. If you have the opportunity, you should go the extra mile to conduct UX research in different hospitals to understand the needs and what drives adoption.

If some regions show poor sales, conduct field research in these regions to revisit your personas. In fact, don’t just challenge the personas, also explore the environment.

Here’s an example of why that matters: We developed a tool that relies on data from a hospital information system. This worked well in one clinic were staff was spread over different parts of the building, turning digital communication into an effective medium. In another hospital, however, the relevant people sat in the same room, making face-to-face communication significantly better than typing information into electronic records.

Takeaways:

  • Go beyond personas or archetypes and seek to understand the different realities between hospitals, wards, and regions.
  • Develop and constantly improve a rollout-playbook that lists local challenges and how you solved them to inform future expansions.

Sorry, Ignorance Is Not Bliss

Some UX researchers seem to believe that it is beneficial to enter an interview unprepared to avoid bias. Some shy away from acquiring relevant medical knowledge, thinking that (since they are not trained medical professionals) they won’t grasp the concepts anyway. Some feel that understanding the medical context of their interviewees’ situation will not add value to their research since they solely focus on the subjective experience of the disease.

But in evidence-based healthcare, conducting research on patients without previous peer-reviewed literature and guideline research is not only unprofessional but often even considered unethical. We should not fall prey to the illusion that ignorance frees us from bias.

The good news is that in healthcare, we are privileged to have a large body of well-conducted studies and systematic reviews available online, many of them free to access. PubMed is an excellent and open source, tutorials on how to use are abundantly available online (we think this is a good primer). Or if you have the budget, paid sites like UpToDate provide comprehensive disease reviews written both for professionals and for laypeople.

We know that UX researchers who are rightfully focused on ‘getting out of the building’ might not enjoy spending many hours on literature research but we are convinced that this will help you form better hypotheses and questions.

Moreover, if you start with clearly predefined research questions and seek answers in the scientific medical literature, you might save time and discover questions that you wouldn’t have thought of. For example, it is advised that individuals undergoing hip surgery should practice using crutches before the operation because it is already difficult enough even without the postoperative pain and swelling. This knowledge, obtained from literature research, could help move from more open questions, such as:

  • “How do patients prepare for a hip replacement surgery?” or
  • “What perceived needs to patients have ahead of hip surgery?”

To more concrete questions such as:

  • “What are the most important preparatory measures that many patients are currently unaware of?”

Takeaway:

  • Do a systematic literature review to inform your research.

Let The Data Guide You And You Guide The Data

To take this further, we’re also developing methods to use quantitative analytics and Deep Learning to guide our qualitative research. Our machine learning engineer just deployed AI to crawl the web for colon cancer blogs to identify hot topics that remained unmentioned in qualitative reviews. We defined “hot” as having many views, many comments, and many likes.

Or you can uncover semantic structures (see picture). These findings can then guide the UX researchers. Similarly, qualitative research can yield hypotheses that we can try to validate with passively collected data. For example, if you think that sleeping problems are common, you could (user consent provided) use your app to measure phone use at night as a proxy for sleeplessness.


Informing our researchers about hot topics in colon cancer using machine learning (topic modeling, credits Yuki Katoh and Ellen Hoeven)


Deploying topic modeling to find semantic structures in texts on colon cancer to inform our qualitative researchers. (Large preview)

Conclusion

Many of our suggestions are not new to well-trained UX researchers. We are aware of that. But in our experience, it is worth stressing the importance of mindfulness towards the three Bs: Biases, Barriers to trust, and Background. Here’s a summary of some of the recommendations to overcome the 3 Bs:

  • Prepare interviews with literature research on the topic (e.g., on Pubmed.gov).
  • Ask doctors which patients are suitable for interview.
  • Include those who care for your users, including nurses, therapists, and relatives.
  • Cooperate with the data scientists or web analysts in your team, if you have them.
  • Understand that it takes users time to build trust to tell you about some needs.
  • Explore how realities differ not only by personas, but also by regions and hospitals.
  • Stay aware that, no matter how much you try, the influence of the 3Bs can only be reduced, and not entirely removed.

We wish you well and thank you for making the world a healthier place.


The authors would like to thank their former colleague, Tim Leinert, for his thoughtful input to this piece.

Smashing Editorial
(cc, ra, yk, il)


See original article here: 

How To Deliver A Successful UX Project In The Healthcare Sector

Thumbnail

Maximizing The Design Sprint

Following a summer of Wonder Woman, Spiderman, and other superhero blockbusters, it’s natural to fantasize about having a superpower of your own. Luckily for designers, innovators, and customer-centric thinkers, a design sprint allows you to see into the future to learn in just five days what customers think about your finished product.
As a UX consultant and in-house design strategist, I have facilitated dozens upon dozens of design workshops (ranging from rapid prototyping sessions to, of course, sprints).

View post: 

Maximizing The Design Sprint

Hacking Your User Onboarding Process: 10 Strategies to Try Now

There is some confusion about what the overall objective of the onboarding process should be. Many view onboarding as an opportunity to get new users up and running. However, I prefer to think of it as laying the foundation for a long-term relationship with your brand or application. Particularly for software companies, it’s crucial to focus on your retention rates if you want to be in business for the long-term. Effective user onboarding is an essential first step in achieving this. The best way to ensure that new sign-ups become active users is by delivering value. Your onboarding process should…

The post Hacking Your User Onboarding Process: 10 Strategies to Try Now appeared first on The Daily Egg.

Excerpt from:  

Hacking Your User Onboarding Process: 10 Strategies to Try Now

The Fine Art of Landing Page Design: Using F & Z Patterns to Increase Conversions

In a saturated online world with an abundance of information, marketers are constantly battling for attention. You’ve likely read that online users have an attention span less than that of a goldfish. Therefore, the more organized and straightforward your strategy is for converting a lead, the better. Over the last couple decades, eye-tracking studies have been performed to ascertain where consumer’s eyes move when they land on a web page. Jakob Nielsen even authored a book Eyetracking Web Usability which analyzes “1.5 million instances where users look at Web sites to understand how the human eyes interact with design.” Landing…

The post The Fine Art of Landing Page Design: Using F & Z Patterns to Increase Conversions appeared first on The Daily Egg.

Read the article:  

The Fine Art of Landing Page Design: Using F & Z Patterns to Increase Conversions

Running an A/A Test Before A/B Testing – Wise or Waste?

To A/A test or not is a question that invites conflicting opinions. Enterprises when faced with the decision of implementing an A/B testing tool do not have enough context on whether they should A/A test. Knowing the benefits and loopholes of A/A testing can help organizations make better decisions.

In this blog post we explore why some organizations practice A/A testing and the things they need to keep in mind while A/A testing. We also discuss other methods that can help enterprises decide whether or not to invest in a certain A/B testing tool.

Why Some Organizations Practice A/A Testing

A/A testing is done when organizations are taking up new implementation of an A/B testing tool. Running an A/A test at that time can help them with:

  • Checking the accuracy of an A/B Testing tool
  • Setting a baseline conversion rate for future A/B tests
  • Deciding a minimum sample size

Checking the Accuracy of an A/B Testing Tool

Organizations who are about to purchase an A/B testing tool or want to switch to a new testing software may run an A/A test to ensure that the new software works fine, and that it has been set up properly.

Tomasz Mazur, an eCommerce Conversion Rate Optimization expert, explains further: “A/A testing is a good way to run a sanity check before you run an A/B test. This should be done whenever you start using a new tool or go for new implementation. A/A testing in these cases will help check if there is any discrepancy in data, let’s say, between the number of visitors you see in your testing tool and the web analytics tool. Further, this helps ensure that your hypothesis are verified.”

In an A/A test, a web page is A/B tested against an identical variation. When there is absolutely no difference between the control and the variation, it is expected that the result will be inconclusive. However, in cases where an A/A test provides a winner between two identical variations, there is a problem. The reasons could be the following:

  • The tool has not been set up properly.
  • The test hasn’t been conducted correctly.
  • The testing tool is inefficient.

Here’s what Corte Swearingen, Director, A/B Testing and Optimization at American Eagle, has to say about A/A testing: “I typically will run an A/A test when a client seems uncertain about their testing platform, or needs/wants additional proof that the platform is operating correctly. There really is no better way to do this than to take the exact same page and test it against itself with no changes whatsoever. We’re essentially tricking the platform and seeing if it catches us! The bottom line is that while I don’t run A/A tests very often, I will occasionally use it as a proof of concept for a client, and to help give them confidence that the split testing platform they are using is working as it should.”

Determining the Baseline Conversion Rate

Before running any A/B test, you need to know the conversion rate that you will be benchmarking the performance results against. This benchmark is your baseline conversion rate.

An A/A test can help you set the baseline conversion rate for your website. Let’s explain this with the help of an example. Suppose you are running an A/A test where the control gives 303 conversions out of 10,000 visitors and the identical variation B gives 307 out of 10,000 conversions. The conversion rate for A is 3.03%, and that for B is 3.07%, when there is no difference between the two variations. Therefore, the conversion rate range that can be set as a benchmark for future A/B tests can be set at 3.03–3.07%. If you run an A/B test later and get an uplift within this range, this might mean that this result is not significant.

Deciding a Minimum Sample Size

A/A testing can also help you get an idea about the minimum sample size from your website traffic. A small sample size would not include sufficient traffic from multiple segments. You might miss out on a few segments which can potentially impact your test results. With a larger sample size, you have a greater chance of taking into account all segments that impact the test.

Corte says, “A/A testing can be used to make a client understand the importance of getting enough people through a test before assuming that a variation is outperforming the original.” He explains this with an A/A testing case study that was done for Sales Training Program landing pages for one of his clients, Dale Carnegie. The A/A test that was run on two identical landing pages got test results indicating that a variation was producing an 11.1% improvement over the control. The reason behind this was that the sample size being tested was too small.

a/a test initial results

After having run the A/A test for a period of 19 days and with over 22,000 visitors, the conversion rates between the two identical versions were the same.

a/a test results with more data

Michal Parizek, Senior eCommerce & Optimization Specialist at Avast, shares similar thoughts. He says, “At Avast, we did a comprehensive A/A test last year. And it gave us some valuable insights and was worth doing it!” According to him, “It is always good to check the statistics before final evaluation.”

At Avast, they ran an A/A test on  two main segments—customers using the free version of the product and customers using the paid version. They did so to get a comparison.

The A/A test had been live for 12 days, and they managed to get quite a lot of data. Altogether, the test involved more than 10 million users and more than 6,500 transactions.

In the “free” segment, they saw a 3% difference in the conversion rate and 4% difference in Average Order Value (AOV). In the “paid” segment, they saw a 2% difference in conversion and 1% difference in AOV.

“However, all uplifts were NOT statistically significant,” says Michal. He adds, “Particularly in the ‘free’ segment, the 7% difference in sales per user (combining the differences in the conversion rate and AOV) might look trustworthy enough to a lot of people. And that would be misleading. Given these results from the A/A test, we have decided to implement internal A/B testing guidelines/lift thresholds. For example, if the difference in the conversion rate or AOV is lower than 5%, be very suspicious that the potential lift is not driven by the difference in the design but by chance.”

Michal sums up his opinion by saying, “A/A testing helps discover how A/B testing could be misleading if they are not taken seriously. And it is also a great way to spot any bugs in the tracking and setup.”

Problems with A/A Testing

In a nutshell, the two main problems inherent in A/A testing are:

  • Everpresent element of randomness in any experimental setup
  • Requirement of a large sample size

We will consider these one by one:

Element of Randomness

As pointed out earlier in the post, checking the accuracy of a testing tool is the main reason for running an A/A test. However, what if you find out a difference between conversions of control and an identical variation? Do you always point it out as a bug in the A/B testing tool?

The problem (for the lack of a better word) with A/A testing is that there is always an element of randomness involved. In some cases, the experiment acquires statistical significance purely by chance, which means that the change in the conversion rate between A and its identical version is probabilistic and does not denote absolute certainty.  

Tomaz Mazur explains randomness with a real-world example. “Suppose you set up two absolutely identical stores in the same vicinity. It is likely, purely by chance or randomness, that there is a difference in results reported by the two. And it doesn’t always mean that the A/B testing platform is inefficient.”

Requirement of a Large Sample Size

Following the example/case study provided by Corte above, one problem with A/A testing is that it can be time-consuming. When testing identical versions, you need a large sample size to find out if A is preferred to its identical version. This in turn will take too much time.

As explained in one of the ConversionXL’s posts, “The amount of sample and data you need to prove that there is no significant bias is huge by comparison with an A/B test. How many people would you need in a blind taste testing of Coca-Cola (against Coca-Cola) to conclude that people liked both equally? 500 people, 5000 people?” Experts at ConversionXL explain that entire purpose of an optimization program is to reduce wastage of time, resources, and money.  They believe that even though running an A/A test is not wrong, there are better ways to use your time when testing.  In the post they mention, “The volume of tests you start is important but even more so is how many you *finish* every month and from how many of those you *learn* something useful from. Running A/A tests can eat into the “real” testing time.”

VWO’s Bayesian Approach and A/A Testing

VWO uses a Bayesian-based statistical engine for A/B testing. This allows VWO to deliver smart decisions–it tells you which variation will minimize potential loss.

Chris Stucchio, Director of Data Science at VWO, shares his viewpoint on how A/A testing is different in VWO than typical frequentist A/B testing tools.

Most A/B testing tools are seeking truth. When running an A/A test in a frequentist tool, an erroneous “winner” should only be reported 5% of the time. In contrast, VWO’s SmartStats is attempting to make a smart business decision. We report a smart decision when we are confident that a particular variation is not worse than all the other variations, that is, we are saying “you’ll leave very little money on the table if you choose this variation now.” In an A/A test, this condition is always satisfied—you’ve got nothing to lose by stopping the test now.

The correct way to evaluate a Bayesian test is to check whether the credible interval for lift contains 0% (the true value).

He also says that the possible and simplest reason for A/A tests to provide a winner

is random chance. “With a frequentist tool, 5% of A/A tests will return a winner due to bad luck. Similarly, 5% of A/A tests in a Bayesian tool will report erroneous lifts. Another possible reason is the configuration error; perhaps the javascript or HTML is incorrectly configured.”

Other Methods and Alternatives to A/A Testing

A few experts believe that A/A testing is inefficient as it consumes a lot of time that could otherwise be used in running actual A/B tests. However, there are others who say that it is essential to run a health check on your A/B testing tool. That said, A/A testing alone is not sufficient to establish whether one testing tool should be prefered over another. When making a critical business decision such as buying a new tool/software application for A/B testing, there are a number of other things that should be considered.

Corte points out that though there is no replacement or alternative to A/A testing, there are other things that must be taken into account when a new tool is being implemented. These are listed as follows:

  1.  Will the testing platform integrate with my web analytics program so that I can further slice and dice the test data for additional insight?
  2.  Will the tool let me isolate specific audience segments that are important to my business and just test those audience segments?
  3.  Will the tool allow me to immediately allocate 100% of my traffic to a winning variation? This feature can be an important one for more complicated radical redesign tests where standardizing on the variation may take some time. If your testing tool allows immediate 100% allocation to the winning variation, you can reap the benefits of the improvement while the page is built permanently in your CMS.
  4. Does the testing platform provide ways to collect both quantitative and qualitative information about site visitors that can be used for formulating additional test ideas? These would be tools like heatmap, scrollmap, visitor recordings, exit surveys, page-level surveys, and visual form funnels. If the testing platform does not have these integrated, do they allow integration with third-party tools for these services.
  5. Does the tool allow for personalization? If test results are segmented and it is discovered that one type of content works best for one segment and another type of content works better for a second segment, does the tool allow you to permanently serve these different experiences for different audience segments”?

That said, there is still a set of experts or people who would opt for alternatives such as triangulating data over an A/A test. Using this procedure means you have two sets of performance data to cross-check with each other. Use one analytics platform as the base to compare all other outcomes against, to check if there is something wrong or something that needs fixing.

And then there is the argument—why just A/A test when you can get more meaningful insights by running an A/A/B test. Doing this, you can still compare two identical versions while also testing some changes in the B variant.

Conclusion

When businesses face the decision of implementing a new testing software application, they need to run a thorough check on the tool. A/A testing is one method that some organizations use for checking the efficiency of the tool. Along with personalization and segmentation capabilities and some other pointers mentioned in this post, this technique can help check if the software application is good for implementation.

Did you find the post insightful? Drop us a line in the comments section with your feedback.

Free-trial CTA

The post Running an A/A Test Before A/B Testing – Wise or Waste? appeared first on VWO Blog.

Read original article:  

Running an A/A Test Before A/B Testing – Wise or Waste?

Building A First-Class App That Leverages Your Website: A Case Study

Mark Zuckerberg once said, “The biggest mistake that we made, as a company, is betting too much on HTML5 as opposed to native… because it just wasn’t there. And it’s not that HTML5 is bad. I’m actually, long term, really excited about it.” And who wouldn’t be excited by the prospect of a single code base that works across multiple platforms?
Further Reading on SmashingMag: A Beginner’s Guide To Progressive Web Apps The Building Blocks Of Progressive Web Apps Creating A Complete Web App In Foundation For Apps Unfortunately, Facebook felt that HTML5 didn’t offer the experience it was looking to build, and that’s what it’s really about: the experience.

More here: 

Building A First-Class App That Leverages Your Website: A Case Study

Thumbnail

Technical SEO 2015: Wiring Websites for Organic Search


Ask ten people what SEO is, and you’re likely to get ten different answers. Given the industry’s unsavoury past, this is hardly surprising. Keyword stuffing, gateway pages, and comment spam earned the first search engine optimisers a deservedly poor reputation within the web community.

Modern Technical SEO: Wiring Websites for Organic Search

Snake oil salesmen continue to peddle these harmful techniques to unsuspecting website owners today, perpetuating the myth that optimising your website for Google or Bing is an inherently nefarious practise. Needless to say, this is not true.

The post Technical SEO 2015: Wiring Websites for Organic Search appeared first on Smashing Magazine.

This article is from:  

Technical SEO 2015: Wiring Websites for Organic Search

Thumbnail

Continuous Input In Mobile Devices: Pain Or Gain?

Working with text has long been the domain of desktops and notebooks. Yet the screen size, resolution and software of mobile devices have improved in recent years, which has made typing a fairly large amount of text quite achievable. A number of apps and techniques are intended to make this task easier, thus increasing productivity and increasing the amount of text that can be comfortably created or edited on a mobile device.

See original: 

Continuous Input In Mobile Devices: Pain Or Gain?

Are Your Internal Systems Damaging Your Business?

The internal systems of many organizations have shocking user interfaces. This costs companies in productivity, training and even the customer experience.
Fortunately, we can fix this. “How come I can download an app on my phone and instantly know how to use it, yet need training to use our content management system? Shouldn’t our system be intuitive?”
This was just one of the comments I heard in a recent stakeholder interview.

Continue reading: 

Are Your Internal Systems Damaging Your Business?