How to Automate Browser Actions Using Selenium

Selenium is highly extensible, with a large ecosystem of third-party plugins and frameworks that can enhance its functionality, making it a go-to tool for browser automation. In this article, we’ll see how to automate browser actions using Selenium.
by Josephine Loo · April 2023


    With the increasing amount of time people spend online and the growing complexity of web applications, browser automation has become an important tool for businesses and individuals, developers included, who want to save time and increase productivity.

    Browser automation can be used for a wide range of purposes, from web scraping and data mining to testing and debugging web applications. It allows you to perform repetitive tasks quickly and accurately without the need for human intervention. Moreover, it helps reduce errors and increase the efficiency of web-based processes.

    There are many browser automation tools available and Selenium is one of the most popular ones used by developers. In this article, we will explore how to use Selenium to automate browsers.

    What Can You Do with Browser Automation

    Browser automation can significantly reduce the time and effort required for repetitive tasks. You can also use it to:

    Test Applications

    Automated tests can quickly and efficiently validate that your application works correctly across multiple browsers and platforms. With browser automation, you can run the same tests repeatedly during development and catch bugs before they break your application.

    Monitor Websites

    Browser automation can be used to monitor websites for changes or errors to ensure that your site is always up and running. By using a scheduling tool such as cron-job.org to run your automation script at regular intervals, you can monitor the website over time and track changes.
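
    As a rough sketch of the monitoring idea, a scheduled script can fingerprint the page source on each run and compare it with the value from the previous run. The helper names `fingerprint` and `checkForChange` are illustrative, and `driver` is assumed to be a Selenium WebDriver instance created as described later in this article.

```javascript
const crypto = require('crypto');

// Reduce a page's HTML to a short, comparable SHA-256 fingerprint.
function fingerprint(html) {
  return crypto.createHash('sha256').update(html).digest('hex');
}

// Load `url` and report whether its source differs from the previous run.
// `driver` is assumed to be a Selenium WebDriver instance.
async function checkForChange(driver, url, previousFingerprint) {
  await driver.get(url);
  const current = fingerprint(await driver.getPageSource());
  return { changed: current !== previousFingerprint, fingerprint: current };
}
```

    Storing the returned fingerprint between runs (for example, in a file) lets each scheduled run compare against the last one. Pages with dynamic content such as timestamps will differ on every load, so fingerprinting the text of one specific element may be more useful in practice.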

    Scrape Websites

    Automated web scraping gathers data from websites for purposes such as market research or lead generation. It can save a significant amount of time and effort compared to collecting the data manually, especially when the amount of data is vast.

    Enter Data

    Browser automation can be particularly useful for tasks that involve entering the same data repeatedly, such as filling out a web form or entering data into a spreadsheet. It also helps reduce errors by ensuring that data is entered consistently and correctly.

    What is Selenium

    Selenium is a widely used browser automation tool that has been around since 2004. It is mature and supports a wide range of web browsers and programming languages, including less mainstream ones like Haskell, Perl, Dart, and Pharo Smalltalk.

    Selenium consists of Selenium WebDriver, Selenium IDE, and Selenium Grid. They complement one another to run automated tests on different browsers and environments, with Selenium WebDriver at the core of the ecosystem. Selenium is also highly extensible, with a large ecosystem of third-party plugins and frameworks that can enhance its functionality.

    As Selenium is an open-source project, its growth depends on the support of the community. Besides the efforts of individual contributors such as programmers, designers, and QA engineers, the Selenium project is also sponsored by companies like BrowserStack, LambdaTest, Sauce Labs, and more.

    How to Use Selenium for Browser Automation

    The basic steps for using Selenium for browser automation are similar across programming languages. We’ll use Node.js/JavaScript as an example:

    Step 1. Install the Selenium Library

    After creating a new project, run the command below in the project directory to install the Selenium library:

    npm install selenium-webdriver
    

    Besides JavaScript, the library is available for other programming languages including Java, Python, C#, Ruby, and Kotlin. Installation instructions for each language are available on the Selenium website.

    [Image: Selenium library for different programming languages]

    ⚠️ Note: Although the library's name contains "webdriver", the browser driver itself has to be installed separately.

    Step 2. Install Browser Drivers

    To automate a browser with Selenium, we need to install the appropriate driver for it. Selenium currently supports all major browsers, including Chrome/Chromium, Firefox, Microsoft Edge, Safari, and Opera.

    There are a few methods to install the drivers. The easiest method is to download them from the official Selenium website and configure Selenium to use the specified drivers using one of the options below:

    Option 1: Save the Driver Location in the PATH Environment Variable

    You can place the drivers in a directory that is already listed in PATH or add the drivers’ location to PATH. Run the commands below in the Terminal or Command Prompt to save them to the PATH environment variable:

    [Image: saving the web driver's location in the PATH environment variable]
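
    To confirm that the driver can actually be found through PATH, a small Node.js check like the one below may help. It is only a sketch: the helper name `findOnPath` is hypothetical, and `chromedriver` is assumed to be the driver's file name.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the full path of the first PATH entry that
// contains a file called `name`, or null if none does.
function findOnPath(name, pathVar = process.env.PATH || '') {
  // On Windows, executables usually carry an extension such as .exe
  const extensions = process.platform === 'win32' ? ['.exe', '.cmd', ''] : [''];
  for (const dir of pathVar.split(path.delimiter).filter(Boolean)) {
    for (const ext of extensions) {
      const candidate = path.join(dir, name + ext);
      if (fs.existsSync(candidate)) return candidate;
    }
  }
  return null;
}

console.log(findOnPath('chromedriver') || 'chromedriver was not found on PATH');
```

    If the script reports that the driver was not found, the driver's directory is probably not part of PATH in the current terminal session yet.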

    In your code, import the library and create a new instance of the driver:

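    const { Builder } = require('selenium-webdriver');
    
    // `await` needs an async context: run this inside an async function
    // (CommonJS) or in an ES module, where top-level await is allowed.
    const driver = await new Builder().forBrowser('chrome').build();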
    

    Option 2: Specify the Driver Location in Your Code

    You can also hardcode the driver’s location if you want to avoid the hassle of configuring environment variables on your system. However, this makes the code less flexible, as you will need to change it to use another browser.

    In your code, import the library. Then, specify the location of the driver and create a new instance of the driver:

    const {Builder} = require('selenium-webdriver');
    const chrome = require('selenium-webdriver/chrome');
    
    const service = new chrome.ServiceBuilder('/path/to/chromedriver');
    const driver = new Builder().forBrowser('chrome').setChromeService(service).build();
    

    For other programming languages, refer to the Selenium website.

    [Image: hardcoding the driver location]

    Step 3. Write the Code

    The basic flow when using Selenium to interact with the browser and perform various actions is as follows:

    1-Navigate to a web page

    await driver.get('https://www.browserbear.com/');
    

    2-Wait for the page to load completely

    Before interacting with an HTML element, we want to ensure that the page or the target element has been completely loaded. There are a few ways to wait for it, including:

    a. Explicit Wait - waits for a specific condition to be met

    // Requires: const { By, until } = require('selenium-webdriver');
    let element = await driver.wait(until.elementLocated(By.css('p')), 10000);
    
    // Proceed with the code after the condition above is met
    

    b. Implicit Wait - tells the driver to keep polling the DOM for up to a set duration whenever it tries to find an element

    // Apply timeout for 10 seconds
    await driver.manage().setTimeouts( { implicit: 10000 } );
    

    c. Fluent Wait - defines the maximum time to wait for a condition and the frequency to check it again

    // Wait up to 30 seconds for the element to be present on the page,
    // checking for its presence every 5 seconds
    let element = await driver.wait(until.elementLocated(By.id('username')), 30000, 'Timed out after 30 seconds', 5000);
    

    3-Find an HTML element

    Then, get the target HTML element using one of the locator strategies below:

    • Class name
    • CSS selector
    • ID
    • Name
    • Link text
    • Partial link text
    • Tag name
    • XPath

      let textBox = await driver.findElement(By.name('text-box'));
      let button = await driver.findElement(By.css('button'));
      // Double quotes outside so the single quotes inside the XPath stay intact
      let input = await driver.findElement(By.xpath("//input[@value='f']"));

    Read more about XPath here: What is XPath in Selenium.
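
    To collect several elements at once, for example when scraping a list, `findElements` (plural) returns every match instead of only the first one. The sketch below assumes `driver` is an existing WebDriver instance; the function name `scrapeTexts` is illustrative.

```javascript
// Collect the visible text of every element matching a CSS selector.
// `driver` is assumed to be a Selenium WebDriver instance. The { css: ... }
// object is selenium-webdriver's shorthand for By.css(...).
async function scrapeTexts(driver, cssSelector) {
  const elements = await driver.findElements({ css: cssSelector });
  const texts = [];
  for (const element of elements) {
    texts.push((await element.getText()).trim());
  }
  // Drop elements that have no visible text
  return texts.filter((text) => text.length > 0);
}
```

    Unlike `findElement`, which throws when nothing matches, `findElements` resolves to an empty array, so the function simply returns an empty list for a selector with no matches.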

    4-Perform Actions

    Interact with the HTML element with these actions:

    • click
    • send keys (text fields and content editable elements only)
    • clear (text fields and content editable elements only)
    • submit (form elements only)
    • select

      await button.click();
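
    The text-related actions from the list above can be combined in a similar way. This is a minimal sketch, assuming `driver` is an existing WebDriver instance and that the page has a form field named `q`; both the function name `submitSearch` and the field name are placeholders.

```javascript
// Type a search term into a text field and submit the form around it.
// `driver` is assumed to be a Selenium WebDriver instance. The { name: ... }
// object is selenium-webdriver's shorthand for By.name(...).
async function submitSearch(driver, term, fieldName = 'q') {
  const field = await driver.findElement({ name: fieldName });
  await field.clear();        // remove any pre-filled text
  await field.sendKeys(term); // type the new value
  await field.submit();       // submit the form that contains the field
}
```

    Clearing the field first keeps the result predictable when the page pre-fills a value.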

    5-End the session

    After performing all actions, end the driver process. This will close the browser automatically.

    await driver.quit();
    

    Step 4. Run Your Code

    After completing the code, you can run it by executing node index.js and see the browser automation in action.
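
    Put together, the steps might look like the sketch below. It assumes Chrome and its driver are set up as in Step 2, and it takes the page URL as a command-line argument, which is a choice made for this example rather than something Selenium requires.

```javascript
// index.js: a sketch combining the steps above (CommonJS).
async function main(url) {
  // Required here so the dependency is only loaded when a browser is needed.
  const { Builder, By, until } = require('selenium-webdriver');
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    // 1. Navigate to the page
    await driver.get(url);
    // 2-3. Wait up to 10 seconds for the first <p> element and get it
    const paragraph = await driver.wait(until.elementLocated(By.css('p')), 10000);
    // 4. Do something with the element (here: read its text)
    console.log(await paragraph.getText());
  } finally {
    // 5. Always end the session, even if a step above fails
    await driver.quit();
  }
}

// Start only when an http(s) URL is passed on the command line.
const targetUrl = process.argv[2] || '';
if (/^https?:\/\//.test(targetUrl)) {
  main(targetUrl).catch((err) => {
    console.error(err);
    process.exitCode = 1;
  });
} else {
  console.log('Usage: node index.js <url>');
}
```

    With this version, the script is started with `node index.js https://www.browserbear.com/`. Wrapping the work in `try`/`finally` ensures the browser is closed even when a wait times out.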

    🐻 Bear Tips: You can also run the browser automation in headless mode.
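
    For Chrome, headless mode can be enabled through the browser options. The sketch below assumes a recent Chrome version that understands the `--headless=new` argument (older versions use `--headless`); the window size and the function name `buildHeadlessChrome` are arbitrary choices for illustration.

```javascript
// Chrome arguments for a headless run. '--headless=new' assumes a recent
// Chrome version; the window size is an arbitrary example value.
const HEADLESS_ARGS = ['--headless=new', '--window-size=1280,800'];

// Build a Chrome driver that runs without opening a visible window.
async function buildHeadlessChrome() {
  // Required here so the dependency is only loaded when a browser is needed.
  const { Builder } = require('selenium-webdriver');
  const chrome = require('selenium-webdriver/chrome');
  const options = new chrome.Options().addArguments(...HEADLESS_ARGS);
  return new Builder().forBrowser('chrome').setChromeOptions(options).build();
}
```

    The returned driver is used exactly like a regular one, for example `const driver = await buildHeadlessChrome();` inside an async function, followed by `await driver.quit();` when done.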

    Selenium Alternatives

    While Selenium is a popular choice for browser automation, it is not the only option. Browserbear is another tool for automating browser actions. It offers a more user-friendly and intuitive interface than Selenium, making it easier to create and run automated browser tasks. It is also cloud-based, so you can scale your automation as your needs grow.

    The image below shows an example of a Browserbear automation task that extracts job data from a job board.

    [Image: Browserbear dashboard example]

    You can duplicate it and other ready-made tasks from the Task Library to start automating browser actions immediately. To learn how to use Browserbear, read:

    Conclusion

    Browser automation is essential for anyone who wants to streamline their online activities and reduce the time and effort spent on repetitive tasks on the web. Whichever tool you choose, it can help businesses and individuals save time and increase productivity.

    About the author: Josephine Loo
    Josephine is an automation enthusiast. She loves automating stuff and helping people to increase productivity with automation.
