Tag Archives: chrome

Automating Your Feature Testing With Selenium WebDriver




Automating Your Feature Testing With Selenium WebDriver

Nils Schütte



This article is for web developers who wish to spend less time testing the front end of their web applications but still want to be confident that every feature works fine. It will save you time by automating repetitive online tasks with Selenium WebDriver. You will find a step-by-step example for automating and testing the login function of WordPress, but you can also adapt the example for any other login form.

What Is Selenium And How Can It Help You?

Selenium is a framework for the automated testing of web applications. Using Selenium, you can basically automate every task in your browser as if a real person were to execute the task. The interface used to send commands to the different browsers is called Selenium WebDriver. Implementations of this interface are available for every major browser, including Mozilla Firefox, Google Chrome and Internet Explorer.

Automating Your Feature Testing With Selenium WebDriver

Which type of web developer are you? Are you the disciplined type who tests all key features of your web application after each deployment. If so, you are probably annoyed by how much time this repetitive testing consumes. Or are you the type who just doesn’t bother with testing key features and always thinks, “I should test more, but I’d rather develop new stuff.” If so, you probably only find bugs by chance or when your client or boss complains about them.

I have been working for a well-known online retailer in Germany for quite a while, and I always belonged to the second category: It was so exciting to think of new features for the online shop, and I didn’t like at all going over all of the previous features again after each new software deployment. So, the strategy was more or less to hope that all key features would work.

One day, we had a serious drop in our conversion rate and started digging in our web analytics tools to find the source of this drop. It took quite a while before we found out that our checkout did not work properly since the previous software deployment.

This was the day when I started to do some research about automating our testing process of web applications, and I stumbled upon Selenium and its WebDriver. Selenium is basically a framework that allows you to automate web browsers. WebDriver is the name of the key interface that allows you to send commands to all major browsers (mobile and desktop) and work with them as a real user would.

Preparing The First Test With Selenium WebDriver

First, I was a little skeptical of whether Selenium would suit my needs because the framework is most commonly used in Java, and I am certainly not a Java expert. Later, I learned that being a Java expert is not necessary to take advantage of the power of the Selenium framework.

As a simple first test, I tested the login of one of my WordPress projects. Why WordPress? Just because using the WordPress login form is an example that everybody can follow more easily than if I were to refer to some custom web application.

What do you need to start using Selenium WebDriver? Because I decided to use the most common implementation of Selenium in Java, I needed to set up my little Java environment.

If you want to follow my example, you can use the Java environment of your choice. If you haven’t set one up yet, I suggest installing Eclipse and making sure you are able to run a simple “Hello world” script in Java.

Because I wanted to test the login in Chrome, I made sure that the Chrome browser was already installed on my machine. That’s all I did in preparation.

Downloading The ChromeDriver

All major browsers provide their own implementation of the WebDriver interface. Because I wanted to test the WordPress login in Chrome, I needed to get the WebDriver implementation of Chrome: ChromeDriver.

I extracted the ZIP archive and stored the executable file chromedriver.exe in a location that I could remember for later.

Setting Up Our Selenium Project In Eclipse

The steps I took in Eclipse are probably pretty basic to someone who works a lot with Java and Eclipse. But for those like me, who are not so familiar with this, I will go over the individual steps:

  1. Open Eclipse.
  2. Click the “New” icon.
    Creating a new project in Eclipse
    Creating a new project in Eclipse
  3. Choose the wizard to create a new “Java Project,” and click “Next.”
    Chosing the java-project wizard
    Choose the java-project wizard.
  4. Give your project a name, and click “Finish.”
    Eclipse project wizard
    The eclipse project wizard
  5. Now you should see your new Java project on the left side of the screen.
    Java project successfully created
    We successfully created a project to run the Selenium WebDriver.

Adding The Selenium Library To Our Project

Now we have our Java project, but Selenium is still missing. So, next, we need to bring the Selenium framework into our Java project. Here are the steps I took:

  1. Download the latest version of the Java Selenium library.

    Downloading the Selenium library
    Download the Selenium library.
  2. Extract the archive, and store the folder in a place you can remember easily.
  3. Go back to Eclipse, and go to “Project” → “Properties.”
    Eclipse Properties
    Go to properties to integrate the Selenium WebDriver in you project.
  4. In the dialog, go to “Java Build Path” and then to register “Libraries.”
  5. Click on “Add External JARs.”
    Adding the Selenium lib to your Java build path.
    Add the Selenium lib to your Java build path.
  6. Navigate to the just downloaded folder with the Selenium library. Highlight all .jar files and click “Open.”
    Selecting the correct files of the Selenium lib.
    Select all files of the lib to add to your project.
  7. Repeat this for all .jar files in the subfolder libs as well.
  8. Eventually, you should see all .jar files in the libraries of your project:
    Selenium WebDriver framework successfully integrated into your project
    The Selenium WebDriver framework has now been successfully integrated into your project!

That’s it! Everything we’ve done until now is a one-time task. You could use this project now for all of your different tests, and you wouldn’t need to do the whole setup process for every test case again. Kind of neat, isn’t it?

Creating Our Testing Class And Letting It Open the Chrome Browser

Now we have our Selenium project, but what next? To see whether it works at all, I wanted to try something really simple, like just opening my Chrome browser.

To do this, I needed to create a new Java class from which I could execute my first test case. Into this executable class, I copied a few Java code lines, and believe it or not, it worked! Magically, the Chrome browser opened and, after a few seconds, closed all by itself.

Try it yourself:

  1. Click on the “New” button again (while you are in your new project’s folder).
    New class in eclipse
    Create a new class to run the Selenium WebDriver.
  2. Choose the “Class” wizard, and click “Next.”
    New class wizard in eclipse
    Choose the Java class wizard to create a new class.
  3. Name your class (for example, “RunTest”), and click “Finish.”
    Eclipse Java Class wizard
    The eclipse Java Class wizard.
  4. Replace all code in your new class with the following code. The only thing you need to change is the path to chromedriver.exe on your computer:
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.chrome.ChromeDriver;
    
    /**
     * @author Nils Schuette via frontendtest.org
     */
    public class RunTest 
        static WebDriver webDriver;
        /**
         * @param args
         * @throws InterruptedException
         */
        public static void main(final String[] args) throws InterruptedException 
            // Telling the system where to find the chrome driver
            System.setProperty(
                    "webdriver.chrome.driver",
                    "C:/PATH/TO/chromedriver.exe");
    
            // Open the Chrome browser
            webDriver = new ChromeDriver();
    
            // Waiting a bit before closing
            Thread.sleep(7000);
    
            // Closing the browser and WebDriver
            webDriver.close();
            webDriver.quit();
        
    }
    
  5. Save your file, and click on the play button to run your code.
    Run Eclipse project
    Running your first Selenium WebDriver project.
  6. If you have done everything correctly, the code should open a new instance of the Chrome browser and close it shortly thereafter.
    Chrome Browser blank window
    The Chrome Browser opens itself magically. (Large preview)

Testing The WordPress Admin Login

Now I was optimistic that I could automate my first little feature test. I wanted the browser to navigate to one of my WordPress projects, login to the admin area and verify that the login was successful. So, what commands did I need to look up?

  1. Navigate to the login form,
  2. Locate the input fields,
  3. Type the username and password into the input fields,
  4. Hit the login button,
  5. Compare the current page’s headline to see if the login was successful.

Again, after I had done all the necessary updates to my code and clicked on the run button in Eclipse, my browser started to magically work itself through the WordPress login. I successfully ran my first automated website test!

If you want to try this yourself, replace all of the code of your Java class with the following. I will go through the code in detail afterwards. Before executing the code, you must replace four values with your own:

  1. The location of your chromedriver.exe file (as above),

  2. The URL of the WordPress admin account that you want to test,

  3. The WordPress username,

  4. The WordPress password.

Then, save and let it run again. It will open Chrome, navigate to the login of your WordPress website, login and check whether the h1 headline of the current page is “Dashboard.”

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

/**
 * @author Nils Schuette via frontendtest.org
 */
public class RunTest 
    static WebDriver webDriver;
    /**
     * @param args
     * @throws InterruptedException
     */
    public static void main(final String[] args) throws InterruptedException 
        // Telling the system where to find the chrome driver
        System.setProperty(
                "webdriver.chrome.driver",
                "C:/PATH/TO/chromedriver.exe");

        // Open the Chrome browser
        webDriver = new ChromeDriver();

        // Maximize the browser window
        webDriver.manage().window().maximize();

        if (testWordpresslogin()) 
            System.out.println("Test WordPress Login: Passed");
         else 
            System.out.println("Test WordPress Login: Failed");

        

        // Close the browser and WebDriver
        webDriver.close();
        webDriver.quit();
    }

    private static boolean testWordpresslogin() 
        try 
            // Open google.com
            webDriver.navigate().to("https://www.YOUR-SITE.org/wp-admin/");

            // Type in the username
            webDriver.findElement(By.id("user_login")).sendKeys("YOUR_USERNAME");

            // Type in the password
            webDriver.findElement(By.id("user_pass")).sendKeys("YOUR_PASSWORD");

            // Click the Submit button
            webDriver.findElement(By.id("wp-submit")).click();

            // Wait a little bit (7000 milliseconds)
            Thread.sleep(7000);

            // Check whether the h1 equals “Dashboard”
            if (webDriver.findElement(By.tagName("h1")).getText()
                    .equals("Dashboard")) 
                return true;
             else 
                return false;
            

        // If anything goes wrong, return false.
        } catch (final Exception e) 
            System.out.println(e.getClass().toString());
            return false;
        
    }
}

If you have done everything correctly, your output in the Eclipse console should look something like this:

Eclipse console test result.
The Eclipse console states that our first test has passed. (Large preview)

Understanding The Code

Because you are probably a web developer and have at least a basic understanding of other programming languages, I am sure you already grasp the basic idea of the code: We have created a separate method, testWordpressLogin, for the specific test case that is called from our main method.

Depending on whether the method returns true or false, you will get an output in your console telling you whether this specific test passed or failed.

This is not necessary, but this way you can easily add many more test cases to this class and still have readable code.

Now, step by step, here is what happens in our little program:

  1. First, we tell our program where it can find the specific WebDriver for Chrome.
    System.setProperty("webdriver.chrome.driver","C:/PATH/TO/chromedriver.exe");
  2. We open the Chrome browser and maximize the browser window.
    webDriver = new ChromeDriver();
    webDriver.manage().window().maximize();
  3. This is where we jump into our submethod and check whether it returns true or false.
    if (testWordpresslogin()) …
  4. The following part in our submethod might not be intuitive to understand:
    The try…catch… blocks. If everything goes as expected, only the code in try… will be executed, but if anything goes wrong while executing try…, then the execution continuous in catch{}. Whenever you try to locate an element with findElement and the browser is not able to locate this element, it will throw an exception and execute the code in catch…. In my example, the test will be marked as “failed” whenever something goes wrong and the catch{} is executed.
  5. In the submethod, we start by navigating to our WordPress admin area and locating the fields for the username and the password by looking for their IDs. Also, we type the given values in these fields.
    webDriver.navigate().to("https://www.YOUR-SITE.org/wp-admin/");
    webDriver.findElement(By.id("user_login")).sendKeys("YOUR_USERNAME");
    webDriver.findElement(By.id("user_pass")).sendKeys("YOUR_PASSWORD");

    Wordpress login form
    Selenium fills out our login form
  6. After filling in the login form, we locate the submit button by its ID and click it.
    webDriver.findElement(By.id("wp-submit")).click();
  7. In order to follow the test visually, I include a 7-second pause here (7000 milliseconds = 7 seconds).
    Thread.sleep(7000);
  8. If the login is successful, the h1 headline of the current page should now be “Dashboard,” referring to the WordPress admin area. Because the h1 headline should exist only once on every page, I have used the tag name here to locate the element. In most other cases, the tag name is not a good locator because an HTML tag name is rarely unique on a web page. After locating the h1, we extract the text of the element with getText() and check whether it is equal to the string “Dashboard.” If the login is not successful, we would not find “Dashboard” as the current h1. Therefore, I’ve decided to use the h1 to check whether the login is successful.
    if (webDriver.findElement(By.tagName("h1")).getText().equals("Dashboard")) 
        
            return true;
         else 
            return false;
        
    

    Wordpress Dashboard
    Letting the WebDriver check, whether we have arrived on the Dashboard: Test passed! (Large preview)
  9. If anything has gone wrong in the previous part of the submethod, the program would have jumped directly to the following part. The catch block will print the type of exception that happened to the console and afterwards return false to the main method.
    catch (final Exception e) 
                System.out.println(e.getClass().toString());
                return false;
            

Adapting The Test Case

This is where it gets interesting if you want to adapt and add test cases of your own. You can see that we always call methods of the webDriver object to do something with the Chrome browser.

First, we maximize the window:

webDriver.manage().window().maximize();

Then, in a separate method, we navigate to our WordPress admin area:

webDriver.navigate().to("https://www.YOUR-SITE.org/wp-admin/");

There are other methods of the webDriver object we can use. Besides the two above, you will probably use this one a lot:

webDriver.findElement(By. …)

The findElement method helps us find different elements in the DOM. There are different options to find elements:

  • By.id
  • By.cssSelector
  • By.className
  • By.linkText
  • By.name
  • By.xpath

If possible, I recommend using By.id because the ID of an element should always be unique (unlike, for example, the className), and it is usually not affected if the structure of your DOM changes (unlike, say, the xPath).

Note: You can read more about the different options for locating elements with WebDriver over here.

As soon as you get ahold of an element using the findElement method, you can call the different available methods of the element. The most common ones are sendKeys, click and getText.

We’re using sendKeys to fill in the login form:

webDriver.findElement(By.id("user_login")).sendKeys("YOUR_USERNAME");

We have used click to submit the login form by clicking on the submit button:

webDriver.findElement(By.id("wp-submit")).click();

And getText has been used to check what text is in the h1 after the submit button is clicked:

webDriver.findElement(By.tagName("h1")).getText()

Note: Be sure to check out all the available methods that you can use with an element.

Conclusion

Ever since I discovered the power of Selenium WebDriver, my life as a web developer has changed. I simply love it. The deeper I dive into the framework, the more possibilities I discover — running one test simultaneously in Chrome, Internet Explorer and Firefox or even on my smartphone, or taking screenshots automatically of different pages and comparing them. Today, I use Selenium WebDriver not only for testing purposes, but also to automate repetitive tasks on the web. Whenever I see an opportunity to automate my work on the web, I simply copy my initial WebDriver project and adapt it to the next task.

If you think that Selenium WebDriver is for you, I recommend looking at Selenium’s documentation to find out about all of the possibilities of Selenium (such as running tasks simultaneously on several (mobile) devices with Selenium Grid).

I look forward to hearing whether you find WebDriver as useful as I do!

Smashing Editorial
(rb, ra, al, il)


Original link: 

Automating Your Feature Testing With Selenium WebDriver

The Rise Of Intelligent Conversational UI




The Rise Of Intelligent Conversational UI

Burke Holland



For a long time, we’ve thought of interfaces strictly in a visual sense: buttons, dropdown lists, sliders, carousels (please no more carousels). But now we are staring into a future composed not just of visual interfaces, but of conversational ones as well. Microsoft alone reports that three thousand new bots are built every week on their bot framework. Every. Week.

The importance of Conversational UI cannot be understated, even if some of us wish it wasn’t happening.

The most important advancement in Conversational UI has been Natural Language Processing (NLP). This is the field of computing that deals not with deciphering the exact words that a user said, but with parsing out of it their actual intent. If the bot is the interface, NLP is the brain. In this article, we’re going to take a look at why NLP is so important, and how you (yes, you!) can build your own.

Speech Recognition vs. NLP

Most people will be familiar with Amazon Echo, Cortana, Siri or Google Home, all of which have an interface that is primarily conversational. They are also all using NLP.




Large preview

Aside from these intelligent assistants, most Conversational UIs have nothing to do with voice at all. They are text driven. These are the bots we chat with in Slack, Facebook Messenger or over SMS. They deliver high quality gifs in our chats, watch our build processes and even manage our pull requests.




Large preview

Conversational UIs built on text are nice because there is no speech recognition component. The text is already parsed.

When it comes to a verbal interaction, the fundamental problem is not recognizing the speech. We’ve mostly got that one down.

OK, so maybe it’s not perfect. I still get voicemails every day like a game of Mad Libs that I never asked to play. iOS just sticks a blank line in whenever they don’t know what exactly was said.




Large preview

Google, on the other hand, just tries to guess. Like this one from my father. I have absolutely no idea what this message is actually trying to say other than “Be Safe” which honestly sounds like my mom, and not my dad. I have a hard time believing he ever said that. I don’t trust the computer.




Large preview

I’m picking on voice mail transcriptions here, which might be the hardest speech recognition to do given how degraded the audio quality is.

Nevertheless, speech recognition is largely a solved problem. It’s even built right into Chrome and it works remarkably well.




Large preview

After we solved the problem of speech recognition, we started to use it everywhere. That was unfortunate because speech recognition on it’s own doesn’t do us a whole lot of good. Interfaces that rely soley on speech recognition require the user to state things a precise way and they can only state the limited number of exact words or phrases that the interface knows about. This is not natural. This is not how a conversation works.

Without NLP, Conversational UI can be true nightmare.

Conversational UI Without NLP

We’re probably all familiar with automated phone menus. These are known as Interactive Voice Response systems — or IVRs for short. They are designed to take the place of the traditional operator and automatically transfer callers to the right place without having to talk to a human. On the surface, this seems like a good idea. In practice, it’s mostly just you waiting while a recorded voice reads out a list of menu items that “may have changed.”




Large preview

A 2011 study from New York University found that 83% of people feel IVR systems “provide either no benefit at all, or only a cost savings benefit to the company.” They also noted that IVR systems “score lower than any other service option.” People would literally rather do anything else than use an automated phone menu.

NLP has changed the IVR market rather significantly in the past few years. NLP can pick a user’s intent out of anything they say, so it’s better to just let them say it and then determine if you support the action.

Check out how AT&T does it.

AT&T has a truly intelligent Conversational UI. It uses NLP to let me just state my intent. Also, notice that I don’t have to know what to say. I can fumble all around and it still picks out my intent.

AT&T also uses information that it already has (my phone number) and then leverages text messaging to send me a link to a traditional visual UI, which is probably a much better UX for making a payment. NLP drives the whole experience here. Without it, the rest of the interaction would not be nearly as smooth.

NLP is powerful, but more importantly, it is also accessible to developers everywhere. You don’t have to know a thing about Machine Learning (ML) or Artificial Intelligence (AI) to use it. All you need to how to do is make an AJAX call. Even I can do that!

Building An NLP Interface

So much of Machine Learning still remains inaccessible to developers. Even the best YouTube videos on the subject quickly become hard to follow with subjects like Neural Networks and Gradient Descents. We have, however, made significant progress in the field of Language Processing, to the point that it’s accessible to developers of nearly any skill level.

Natural Language Processing differs based on the service, but the overall idea is that the user has an intent, and that intent contains entities. That means exactly nothing to you at the moment, so let’s work up a hypothetical Home Automation bot and see how this works.

The Home Automation Example

In the field of Natural Language Processing, the canonical “Hello World” is usually a Home Automation demo. This is because it helps to clearly demonstrate the fundamental concepts of NLP without overloading your brain.

A Home Automation Bot is a service that can control hypothetical lights in a hypothetical house. For instance, we might want to say “Turn on the kitchen lights”. That is our intent. If we said “Hello”, we are clearly expressing a different intent. Inside of that intent, there are two pieces of information that we need to complete the action:

  1. The ‘Location’ of the light (kitchen)
  2. The desired state of the lights ‘Power’ (on/off)

These (Location, Power) are known as entities.

When we are finished designing our NLP interface, we are going to be able to call an HTTP endpoint and pass it our intent: “Turn on the kitchen lights.” That endpoint will return to us the intent (Control Lights) and two objects representing our entities: Location and Power. We can then pass those into a function which actually controls our lights…

function controlLights(location, power) 
  console.log(`Turning $power the $location lights`);
  
  // TODO: Call an imaginary endpoint which controls lights   
}

There are a lot of NLP services out there that are available today for developers. For this example, I’m going to show the LUIS project from Microsoft because it is free to use.

LUIS is a completely visual tool, so we won’t actually be writing any code at all. We’ve already talked about Intents and Entities, so you already know most of the terminology that you need to know to build this interface.

The first step is to create a “Control Lights” intent in LUIS.




Large preview

Before I do anything with that intent, I need to define my Location and Power entities. Entities can be different types — kind of like types in a programming language. You can have dates, lists and even entities that are related to other entities. In this case, Power is a list of values (on, off) and Location is a simple entity, which can be any value.

It will be up to LUIS to be smart enough to figure out exactly what the Location is.




Large preview




Large preview

Now we can begin to train this model to understand all of the different ways that we might ask it to control the lights in a different location. Let’s think of all the different ways that we could do that:

  • Turn off the kitchen lights;
  • Turn off the lights in the office;
  • The lights in the living room, turn them on;
  • Lights, kitchen, off;
  • Turn off the lights (no location).

As I feed these into the Control Lights intent as utterances, LUIS tries to determine where in the intent the entities are. You can see that because Power is a discreet list of values, it gets that right every time.




Large preview

But it has no idea what a Location even is. LUIS wants us to go through this list and tell it where the Location is. That’s done by clicking on a word or group of words and assigning to the right entity. As we are doing this, we are really creating a machine learning model that LUIS is going to use to statistically estimate what qualifies as a Location.




Large preview

When I’m done telling LUIS where in these utterances all the locations are, my dashboard looks like this…




Large preview

Now we train the model by clicking on the “Train” button at the top. Do you feel like a data scientist yet?

Now I can test it using the test panel. You can see that LUIS is already pretty smart. The Power is easy to pick out, but it can actually pick out Locations it has never seen before. It’s doing what your brain does — using the information that it has to make an educated guess. Machine Learning is equal parts impressive and scary.




Large preview

If we try hard enough, we can fool the AI. The more utterances we give it and label, the smarter it will get. I added 35 utterances to mine before I was done and it is close to bullet proof.

So now we get to the important part, which is how we actually use this NLP in an app. LUIS has a “Publish” menu option which allows us to publish our model to the internet where it’s exposed via a single HTTP endpoint. It will look something like this…

https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/c4396135-ee3f-40a9-8b83-4704cddabf7a?subscription-key=19d29a12d3fc4d9084146b466638e62a&verbose=true&timezoneOffset=0&q=

The very last part of that query string is a q= variable. This is where we would pass our intent.

https://westus.api.cognitive.microsoft.com/luis/v2.0/apps/c4396135-ee3f-40a9-8b83-4704cddabf7a?subscription-key=19d29a12d3fc4d9084146b466638e62a&verbose=true&timezoneOffset=0&q=turn on the kitchen lights

The response that we get back looks is just a JSON object.


  "query": "turn on the kitchen lights",
  "topScoringIntent": 
    "intent": "Control Lights",
    "score": 0.999999046
  ,
  "intents": [
    
      "intent": "Control Lights",
      "score": 0.999999046
    ,
    
      "intent": "None",
      "score": 0.0532306843
    
  ],
  "entities": [
    
      "entity": "kitchen",
      "type": "Location",
      "startIndex": 12,
      "endIndex": 18,
      "score": 0.9516622
    ,
    
      "entity": "on",
      "type": "Power",
      "startIndex": 5,
      "endIndex": 6,
      "resolution": 
        "values": [
          "on"
        ]
      
    }
  ]
}

Now this is something that we can work with as developers! This is how you add NLP to any project — with a single REST endpoint. Now you’re free to create a bot with some real brains!

Brian Holt used the browser speech API and a LUIS model to create a voice powered calculator that is running right inside of CodePen. Chrome is required for the speech API.

See the Pen Voice Calculator by Brian Holt (@btholt) on CodePen.

Bot Design Is Still Hard

Having a smart bot is only half the battle. We still need to account for any of the actions that our system might expose, and that can lead to a lot of different logical paths which makes for messy code.

Conversations also happen in stages, so the bot needs to be able to intelligently direct users down the right path without frustrating them or being unable to recover when something goes wrong. It needs to be able to recover when the conversation dies midstream and then starts again. That’s a whole other article and I’ve included some resources below to help.

When it comes to language understanding, the AI platforms are mature and ready to use today. While that won’t help you perfectly design your bot, it will be a key component to building a bot that people don’t hate.

Great UI Is Just Great UI

A final note: As we saw from the AT&T example, a truly smart interface combines great speech recognition, Natural Language Processing, different types of conversational UI (speech and text) and even a visual UI. In short, great UI is just that — great UI — and it is not a zero sum game. Great UIs will leverage all of the technology available to provide the best possible user experience.

Special thanks to Mat Velloso for his input on this article.

Further Resources:

Smashing Editorial
(rb, ra, yk, il)


Visit site: 

The Rise Of Intelligent Conversational UI

Understanding Logical Properties And Values

In the past, CSS has tied itself to physical dimensions and directions, physically mapping the placement of elements to the left, right and top and bottom. We float an element left or right, we use the positioning offset properties top, left, bottom and right. We set margins, padding, and borders as margin-top and padding-left. These physical properties and values make sense if you are working in a horizontal, top to bottom, left to right writing mode and direction.

More – 

Understanding Logical Properties And Values

Getting Started With The Web MIDI API

As the web continues to evolve and new browser technologies continue to emerge, the line between native and web development becomes more and more blurred. New APIs are unlocking the ability to code entirely new categories of software in the browser.

Until recently, the ability to interact with digital musical instruments has been limited to native, desktop applications. The Web MIDI API is here to change that.

In this article, we’ll cover the basics of MIDI and the Web MIDI API to see how simple it can be to create a web app that responds to musical input using JavaScript.

What Is MIDI?

MIDI has been around for a long time but has only recently made its debut in the browser. MIDI (Musical Instrument Digital Interface) is a technical standard that was first published in 1983 and created the means for digital instruments, synthesizers, computers, and various audio devices to communicate with each other. MIDI messages relay musical and time-based information back and forth between devices.

A typical MIDI setup might consist of a digital piano keyboard which can send messages relaying information such as pitch, velocity (how loudly or softly a note is played), vibrato, volume, panning, modulation, and more to a sound synthesizer which converts that into audible sound. The keyboard could also send its signals to desktop music scoring software or a digital audio workstation (DAW) which could then convert the signals into written notation, save them as a sound file, and more.

MIDI is a fairly versatile protocol, too. In addition to playing and recording music, it has become a standard protocol in stage and theater applications, as well, where it is often used to relay cue information or control lighting equipment.


A performer plays a digital piano onstage


MIDI lets digital instruments, synthesizers, and software talk to each other and is frequently used in live shows and theatrical productions. Photo by Puk Khantho on Unsplash.

MIDI In The Browser

The WebMIDI API brings all the utility of MIDI to the browser with some pretty simple JavaScript. We only need to learn about a few new methods and objects.

Introduction

First, there’s the navigator.requestMIDIAccess() method. It does exactly what it sounds like—it will request access to any MIDI devices (inputs or outputs) connected to your computer. You can confirm the browser supports the API by checking for the existence of this method.

if (navigator.requestMIDIAccess) 
    console.log('This browser supports WebMIDI!');
 else 
    console.log('WebMIDI is not supported in this browser.');

Second, there’s the MIDIAccess object which contains references to all available inputs (such as piano keyboards) and outputs (such as synthesizers). The requestMIDIAccess() method returns a promise, so we need to establish success and failure callbacks. And if the browser is successful in connecting to your MIDI devices, it will return a MIDIAccess object as an argument to the success callback.

navigator.requestMIDIAccess()
    .then(onMIDISuccess, onMIDIFailure);

function onMIDISuccess(midiAccess) 
    console.log(midiAccess);

    var inputs = midi.inputs;
    var outputs = midi.outputs;


function onMIDIFailure() 
    console.log('Could not access your MIDI devices.');

Third, MIDI messages are conveyed back and forth between inputs and outputs with a MIDIMessageEvent object. These messages contain information about the MIDI event such as pitch, velocity (how softly or loudly a note is played), timing, and more. We can start collecting these messages by adding simple callback functions (listeners) to our inputs and outputs.

Going Deeper

Let’s dig in. To send MIDI messages from our MIDI devices to the browser, we’ll start by adding an onmidimessage listener to each input. This callback will be triggered whenever a message is sent by the input device, such as the press of a key on the piano.

We can loop through our inputs and assign the listener like this:

function onMIDISuccess(midiAccess) 
    for (var input of midiAccess.inputs.values())
        input.onmidimessage = getMIDIMessage;
    
}

function getMIDIMessage(midiMessage) 
    console.log(midiMessage);

The MIDIMessageEvent object we get back contains a lot of information, but what we’re most interested in is the data array. This array typically contains three values (e.g. [144, 72, 64]). The first value tells us what type of command was sent, the second is the note value, and the third is velocity. The command type could be either “note on,” “note off,” controller (such as pitch bend or piano pedal), or some other kind of system exclusive (“sysex”) event unique to that device/manufacturer.

For the purposes of this article, we’ll just focus on properly identifying “note on” and “note off” messages. Here are the basics:

  • A command value of 144 signifies a “note on” event, and 128 typically signifies a “note off” event.
  • Note values are on a range from 0–127, lowest to highest. For example, the lowest note on an 88-key piano has a value of 21, and the highest note is 108. A “middle C” is 60.
  • Velocity values are also given on a range from 0–127 (softest to loudest). The softest possible “note on” velocity is 1.
  • A velocity of 0 is sometimes used in conjunction with a command value of 144 (which typically represents “note on”) to indicate a “note off” message, so it’s helpful to check if the given velocity is 0 as an alternate way of interpreting a “note off” message.

Given this knowledge, we can expand our getMIDIMessage handler example above by intelligently parsing our MIDI messages coming from our inputs and passing them along to additional handler functions.

function getMIDIMessage(message) 
    var command = message.data[0];
    var note = message.data[1];
    var velocity = (message.data.length > 2) ? message.data[2] : 0; // a velocity value might not be included with a noteOff command

    switch (command) 
        case 144: // noteOn
            if (velocity > 0) 
                noteOn(note, velocity);
             else 
                noteOff(note);
            
            break;
        case 128: // noteOff
            noteOff(note);
            break;
        // we could easily expand this switch statement to cover other types of commands such as controllers or sysex
    }
}

Browser Compatibility And Polyfill

As of the writing of this article, the Web MIDI API is only available natively in Chrome, Opera, and Android WebView.


Browser support for Web MIDI API from caniuse.com


The Web MIDI API is only available natively in Chrome, Opera, and Android WebView.

For all other browsers that don’t support it natively, Chris Wilson’s WebMIDIAPIShim library is a polyfill for the Web MIDI API, of which Chris is a co-author. Simply including the shim script on your page will enable everything we’ve covered so far.

<script src="WebMIDIAPI.min.js"></script>    
<script>
if (navigator.requestMIDIAccess)  //... returns true
</script>

This shim also requires Jazz-Soft.net’s Jazz-Plugin to work, unfortunately, which means it’s an OK option for developers who want the flexibility to work in multiple browsers, but an extra barrier to mainstream adoption. Hopefully, within time, other browsers will adopt the Web MIDI API natively.

Making Our Job Easier With WebMIDI.js

We’ve only really scratched the surface of what’s possible with the WebMIDI API. Adding support for additional functionality besides basic “note on” and “note off” messages starts to get much more complex.

If you’re looking for a great JavaScript library to radically simplify your code, check out WebMidi.js by Jean-Philippe Côté on Github. This library does a great job of abstracting all the parsing of MIDIAccess and MIDIMessageEvent objects and lets you listen for specific events and add or remove listeners in a much simpler way.

WebMidi.enable(function () 

    // Viewing available inputs and outputs
    console.log(WebMidi.inputs);
    console.log(WebMidi.outputs);

    // Retrieve an input by name, id or index
    var input = WebMidi.getInputByName("My Awesome Keyboard");
    // OR...
    // input = WebMidi.getInputById("1809568182");
    // input = WebMidi.inputs[0];

    // Listen for a 'note on' message on all channels
    input.addListener('noteon', 'all',
        function (e) 
            console.log("Received 'noteon' message (" + e.note.name + e.note.octave + ").");
        
    );

    // Listen to pitch bend message on channel 3
    input.addListener('pitchbend', 3,
        function (e) 
            console.log("Received 'pitchbend' message.", e);
        
    );

    // Listen to control change message on all channels
    input.addListener('controlchange', "all",
        function (e) 
            console.log("Received 'controlchange' message.", e);
        
    );

    // Remove all listeners for 'noteoff' on all channels
    input.removeListener('noteoff');

    // Remove all listeners on the input
    input.removeListener();

});

Real-World Scenario: Building A Breakout Room Controlled By A Piano Keyboard

A few months ago, my wife and I decided to build a “breakout room” experience in our house to entertain our friends and family. We wanted the game to include some kind of special effect to help elevate the experience. Unfortunately, neither of us have mad engineering skills, so building complex locks or special effects with magnets, lasers, or electrical wiring was outside the realm of our expertise. I do, however, know my way around the browser pretty well. And we have a digital piano.

Thus, an idea was born. We decided that the centerpiece of the game would be a series of passcode locks on a computer that players would have to “unlock” by playing certain note sequences on our piano, a la Willy Wonka.

This is a musical lock
This is a musical lock

Sound cool? Here’s how I did it.

Setup

We’ll begin by requesting WebMIDI access, identifying our keyboard, attaching the appropriate event listeners, and creating a few variables and functions to help us step through the various stages of the game.

// Variable which tell us what step of the game we're on. 
// We'll use this later when we parse noteOn/Off messages
var currentStep = 0;

// Request MIDI access
if (navigator.requestMIDIAccess) 
    console.log('This browser supports WebMIDI!');

    navigator.requestMIDIAccess().then(onMIDISuccess, onMIDIFailure);

 else 
    console.log('WebMIDI is not supported in this browser.');


// Function to run when requestMIDIAccess is successful
function onMIDISuccess(midiAccess) 
    var inputs = midiAccess.inputs;
    var outputs = midiAccess.outputs;

    // Attach MIDI event "listeners" to each input
    for (var input of midiAccess.inputs.values()) 
        input.onmidimessage = getMIDIMessage;
    
}

// Function to run when requestMIDIAccess fails
function onMIDIFailure() 
    console.log('Error: Could not access MIDI devices.');


// Function to parse the MIDI messages we receive
// For this app, we're only concerned with the actual note value,
// but we can parse for other information, as well
function getMIDIMessage(message) 
    var command = message.data[0];
    var note = message.data[1];
    var velocity = (message.data.length > 2) ? message.data[2] : 0; // a velocity value might not be included with a noteOff command

    switch (command) 
        case 144: // note on
            if (velocity > 0) 
                noteOn(note);
             else 
                noteOff(note);
            
            break;
        case 128: // note off
            noteOffCallback(note);
            break;
        // we could easily expand this switch statement to cover other types of commands such as controllers or sysex
    }
}

// Function to handle noteOn messages (ie. key is pressed)
// Think of this like an 'onkeydown' event
function noteOn(note) 
    //...


// Function to handle noteOff messages (ie. key is released)
// Think of this like an 'onkeyup' event
function noteOff(note) 
    //...


// This function will trigger certain animations and advance gameplay 
// when certain criterion are identified by the noteOn/noteOff listeners
// For instance, a lock is unlocked, the timer expires, etc.
function runSequence(sequence) 
    //...

Step 1: Press Any Key To Begin

To kick off the game, let’s have the players press any key to begin. This is an easy first step which will clue them into how the game works and also start a countdown timer.

function noteOn(note) 
    switch(currentStep) 
        // If the game hasn't started yet.
        // The first noteOn message we get will run the first sequence
        case 0: 
            // Run our start up sequence
            runSequence('gamestart');

            // Increment the currentStep so this is only triggered once
            currentStep++;
            
            break;
    
}

function runSequence(sequence) 
    switch(sequence) 
        case 'gamestart':            
            // Now we'll start a countdown timer...
            startTimer();
            
            // code to trigger animations, give a clue for the first lock
            break;
    
}

Step 2: Play The Correct Note Sequence

For the first lock, the players must play a particular sequence of notes in the right order. I’ve actually seen this done in a real breakout room, only it was with an acoustic upright piano rigged to a lock box. Let’s re-create the effect with MIDI.

For every “note on” message received, we’ll append the numeric note value to an array and then check to see if that array matches a predefined array of note values.

We’ll assume some clues in the breakout room have told the players which notes to play. For this example, it will be the beginning of the tune to “Amazing Grace” in the key of F major. That note sequence would look like this.


A visual representation of the first nine notes of “Amazing Grace” on a piano


This is the correct sequence of notes that we’ll be listening for as the solution to the first lock.

The MIDI note values in array form would be: [60, 65, 69, 65, 69, 67, 65, 62, 60].

var correctNoteSequence = [60, 65, 69, 65, 69, 67, 65, 62, 60]; // Amazing Grace in F
var activeNoteSequence = [];

function noteOn(note) 
    switch(currentStep) 
        // ... (case 0)

        // The first lock - playing a correct sequence
        case 1:
            activeNoteSequence.push(note);

            // when the array is the same length as the correct sequence, compare the two
            if (activeNoteSequence.length == correctNoteSequence.length) 
                var match = true;
                for (var index = 0; index < activeNoteSequence.length; index++) 
                    if (activeNoteSequence[index] != correctNoteSequence[index]) 
                        match = false;
                        break;
                    
                }

                if (match) 
                    // Run the next sequence and increment the current step
                    runSequence('lock1');
                    currentStep++;
                 else 
                    // Clear the array and start over
                    activeNoteSequence = [];
                
            }
            break;
    }
}

function runSequence(sequence) 
    switch(sequence) 
        // ...

        case 'lock1':
            // code to trigger animations and give clue for the next lock
            break;
    
}

Step 3: Play The Correct Chord

The next lock requires the players to play a combination of notes at the same time. This is where our “note off” listener comes in. For every “note on” message received, we’ll add that note value to an array; for every “note off” message received, we’ll remove that note value from the array. Therefore, this array will reflect which notes are currently being pressed at any time. Then, it’s a matter of checking that array every time a note value is added to see if it matches a master array with the correct values.

For this clue, we’ll make the correct answer a C7 chord in root position starting on middle C. That looks like this.


A visual representation of a C7 chord on a piano


These are the four notes that we’ll be listening for as the solution to the second lock.

The correct MIDI note values for this chord are: [60, 64, 67, 70].

var correctChord = [60, 64, 67, 70]; // C7 chord starting on middle C
var activeChord = [];

function noteOn(note) 
    switch(currentStep) 
        // ... (case 0, 1)

        case 2:
            // add the note to the active chord array
            activeChord.push(note);

            // If the array is the same length as the correct chord, compare
            if (activeChord.length == correctChord.length) 
                var match = true;
                for (var index = 0; index < activeChord.length; index++) 
                    if (correctChord.indexOf(activeChord[index]) < 0) 
                        match = false;
                        break;
                    
                }

                if (match) 
                    runSequence('lock2');
                    currentStep++;
                
            }
            break;
    }

function noteOff(note) 
    switch(currentStep) 
        case 2:
            // Remove the note value from the active chord array
            activeChord.splice(activeChord.indexOf(note), 1);
            break;
    
}

function runSequence(sequence) 
    switch(sequence) 
        // ...

        case 'lock2':
            // code to trigger animations, stop clock, end game
            stopTimer();

            break;
    
}

Now all that’s left to do is to add some additional UI elements and animations and we have ourselves a working game!

Here’s a video of the entire gameplay sequence from start to finish. This is running in Google Chrome. Also shown is a virtual MIDI keyboard to help visualize which notes are currently being played. For a normal breakout room scenario, this can run in full-screen mode and with no other inputs in the room (such as a mouse or computer keyboard) to prevent users from closing the window.

WebMIDI Breakout Game Demo (watch in Youtube)

If you don’t have a physical MIDI device laying around and you still want to try it out, there are a number of virtual MIDI keyboard apps out there that will let you use your computer keyboard as a musical input, such as VMPK. Also, if you’d like to further dissect everything that’s going on here, check out the complete prototype on CodePen.

See the Pen WebMIDI Breakout Room Demo by Peter Anglea (@peteranglea) on CodePen.

Conclusion

MIDI.org says that “Web MIDI has the potential to be one of the most disruptive music [technologies] in a long time, maybe as disruptive as MIDI was originally back in 1983.” That’s a tall order and some seriously high praise.

I hope this article and sample app has gotten you excited about the potential that this API has to spur the development of new and exciting kinds of browser-based music applications. In the coming years, hopefully, we’ll start to see more online music notation software, digital audio workstations, audio visualizers, instrument tutorials, and more.

If you want to read more about Web MIDI and its capabilities, I recommend the following:

And for further inspiration, here are some other examples of the Web MIDI API in action:

Smashing Editorial
(rb, ra, hj, il)

Read More:

Getting Started With The Web MIDI API

Monthly Web Development Update 3/2018: Service Workers, Building A CDN, And Cheating At Design

Service Worker is probably one of the most misrepresented technologies we currently have. When I hear people talking about it, the topic almost always revolves around serving an app when a user is offline. However, Service Worker can do so much more than that, and every week I come across new articles that show how powerful the technology really is.

This month, for example, we can learn how to use Service Worker for cross-tab messaging and to load off requests into the background with the Background Sync API. I think the toolset we now have in our browsers already allows us to build great experiences regardless of the network state. Now it’s up to us to make the experiences so great that users truly love them. And that’s probably the hardest part.

News

Sketch 49
Sketch 49 has arrived, and with it comes the new Prototyping in Sketch feature which lets you see the entire flow in action. (Image credit)

General

  • Ed Ellson examined Chrome’s Background Sync API and the retry strategy it uses to perform a request. By allowing synchronization in the background after a first attempt has failed, the API helps us improve the browsing experience for users who go offline or are on unstable connections.

UI/UX

Cheating At Design
Use color and weight to create visual hierarchy instead of size is only one of the seven practical tips for cheating at design that Adam Wathan and Steve Schoger share. (Image credit)

Security

  • With GraphQL you can query exactly what you want whenever you want. This is amazing for working with an API but also has complex security implications. Instead of asking for legitimate, useful data, a malicious actor could submit an expensive, nested query to overload your server, database, network, or all of these. To prevent this from happening, Max Stoiber shows us how we can secure the GraphQL API in our projects.

Privacy

  • WebKit is introducing the Storage Access API. The new API targets one of the major issues with Safari’s Intelligent Tracking Protection (ITP): Identifying users who are logged in to a first-party service but view content of it embedded on a third party (YouTube videos on a blog, for example). The Storage Access API allows third-party embeds to request access to their first-party cookies when the user interacts with them. A good solution to protect user privacy by default and allow exceptions on request.

Web Performance

  • Janos Pasztor built his own Content Delivery Network because he thinks it can be a better solution than using existing third parties. The code for the CDN of his personal website is now available on Github. A nice web performance article that looks at common solutions from a different angle.
  • A year after Facebook’s announcement to broadly use Cache-Control: Immutable, Paul Calvano examined how widespread its usage is on the web — apart from the few big players. Interesting research and it’s still sad to see that this useful performance tool is used so little. At Colloq, we use it quite a lot, which saves us a lot of traffic and load on our servers and enables us to serve a lot of pages nearly instantly to recurring users.
Global stats of a self-built CDN
Global stats for the custom CDN that Janos Pasztor built. (Image credit)

HTML & SVG

JavaScript

CSS

Accessibility

Accessibility Checklist
The Accessibility Checklist helps build accessibility into your process no matter your role or stage in a project. (Image credit)

Work & Life

  • This week I read an article by Alex Duloz, and his words still stick with me: “When we develop a new application, when we post content on the Internet, whatever we do that people will have access to, we should consider just for a minute if our contribution adds up to the level of dumbness kids/teenagers are exposed to. If it does, we should refrain from going live.” The truth is, most of us, including me, don’t consider this before posting on the Internet. We create funny things, share funny pictures and try to get fame with silly posts. But in reality, we shape society with this. Let’s try to provide more useful resources and make the consumption of this more enjoyable so young people can profit from our knowledge and not only view things we think are funny. “We should always consider how teenagers will use what we release.”
  • The MIT OpenCourseWare released a lot of free audio and video lectures. This is amazing news and makes great content available to broader masses.
  • Jake Knapp says great work requires idealism and cynicism and has strong arguments to back up this theory. An article worth reading.
  • There’s an important article on how unhappiness has grown in America’s population since around the year 2000. It reveals that while income inequality might play a role, the more important aspect is that young people who use a lot of digital media are unhappier than those who use it only up to an hour a day. Interestingly, people who don’t use digital media at all, are unhappy, too, so the outcome of this could be that we should try to use digital media only moderately — at least in our private lives. I bet it’ll make a big difference.
  • Following the theory of Michael Bradley, projects don’t necessarily need a roadmap for success. Instead, he suggests to create a moral compass that points out why the project exists and what its purpose is.

Going Beyond…

We hope you enjoyed this Web Development Update. The next one is scheduled for April 13th. Stay tuned.

Excerpt from – 

Monthly Web Development Update 3/2018: Service Workers, Building A CDN, And Cheating At Design

Monthly Web Development Update 2/2018: The Grown-Up Web, Branding Details, And Browser Fast Forward

Every profession is a wide field where many people find their very own, custom niches. So are design and web development today. I started building my first website with framesets and HTML4.0, images and a super limited set of CSS, and — oh so fancy — GIFs and inline JavaScript (remember the onclick=”” attribute?) about one and a half decades ago. It took me four days to learn the initial, necessary skills for that.

Original article:

Monthly Web Development Update 2/2018: The Grown-Up Web, Branding Details, And Browser Fast Forward

Using Media Queries For Responsive Design In 2018

Back in July 2010, I wrote an article for Smashing Magazine entitled “How To Use CSS3 Media Queries To Create A Mobile Version of Your Website.” Almost eight years on, that article still receives a lot of traffic. I thought it would be a nice idea to revisit that subject, now that we have layout methods such as Flexbox and Grid Layout. This article will take a look at the use of media queries for responsive design today, and also have a look at what is coming in the future.

Credit:  

Using Media Queries For Responsive Design In 2018

Automated Browser Testing With The WebDriver API

Manually clicking through different browsers as they run your development code, either locally or remotely, is a quick way to validate that code. It allows you to visually inspect that things are as you intended them to be from a layout and functionality point of view. However, it’s not a solution for testing the full breadth of your site’s code base on the assortment of browsers and device types available to your customers.

Excerpt from – 

Automated Browser Testing With The WebDriver API

Maximize Your Instagram Sales With Help From These Third-Party Platforms

instagram money

The recent explosion in Instagram hype is capturing the attention of marketers everywhere – especially those of us who are keeping tabs on trends in visual storytelling and social media-based audience building. As of this past fall, when the most recent official Instagram community stats were released, the social channel had over 800 million monthly active users, which makes it over twice as popular as Twitter. And the audience engagement here is also particularly strong – especially in certain niches, as data from Rival IQ demonstrates. Source Recently, the Facebook-owned platform announced the ability for multiple accounts to broadcast live…

The post Maximize Your Instagram Sales With Help From These Third-Party Platforms appeared first on The Daily Egg.

Link to article: 

Maximize Your Instagram Sales With Help From These Third-Party Platforms

Monthly Web Development Update 12/2017: Pragmatic Releasing, Custom Elements, And Making Decisions

Today I read an eye-opening article about the current young generation and their financial future. It’s hard to grasp words like “Millenials”, and there’s much talk about specific issues they face, but, for many of us, it’s not easy to understand their struggle — no matter if you’re older or younger than me (I qualify under the Millenial generation). But Michael Hobbes’ entertaining and super informative article revealed a lot to me.

Continued: 

Monthly Web Development Update 12/2017: Pragmatic Releasing, Custom Elements, And Making Decisions