Mastering XPath: Using the “following-sibling” and “preceding-sibling” Axes

The “following-sibling” and “preceding-sibling” axes in XPath allow you to traverse the HTML/XML document sideways to find the siblings of the current element. Let’s learn with some practical examples that can be used in Selenium.
by Josephine Loo · September 2023


    XPath, also known as XML Path Language, is an expression language used to navigate and select elements in an XML or HTML document. It provides a way to locate elements on a web page based on their tag name, attributes, position, text content, and more in the document’s hierarchy.

    Unlike CSS Selectors, which mainly support one-directional flow for locating an HTML element (from parent to child), XPath supports bidirectional flow: traversal can go both down to descendants and up to ancestors.

    Besides that, it can also traverse sideways to find siblings of an element. Let's learn how you can do it with the following-sibling:: and preceding-sibling:: axes!

    XPath Basics: Understanding XPath Axes

    XPath uses a path notation similar to URLs to navigate the hierarchical structure of an HTML/XML document and find an element or node. Each step in an XPath expression is separated by a slash (“/” or “//”), which forms the basis for the many axes available.

    In XPath, an axis is used to define the relationships between elements in an HTML/XML document. It allows you to navigate the document structure in a specific direction or pattern and select elements based on their position or relationship to other elements.

    The slashes ( “/” or “//” ) traverse down the document to select the child or descendant elements of the current element.

    | Axis | Description | Example |
    | --- | --- | --- |
    | / | Selects all child elements of the current element. | /html/body/h1 - Selects the <h1> element that is a direct child of the <body> element. |
    | // | Selects all descendant elements of the current element, regardless of their depth. | /html/body//h3 - Selects all <h3> elements that are descendants of the <body> element, at any depth. |
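The difference between the two slashes can be tried out without a browser. The sketch below is only an illustration: it uses Python's built-in xml.etree.ElementTree, which supports a limited subset of XPath and only paths relative to an element, so `./body/h1` (evaluated from the `<html>` root) stands in for `/html/body/h1`. The sample document is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the article
DOC = """
<html>
  <body>
    <h1>Title</h1>
    <div>
      <h3>Nested heading</h3>
    </div>
    <h3>Top-level heading</h3>
  </body>
</html>
"""

root = ET.fromstring(DOC)  # root is the <html> element

# "/" only steps to direct children
direct_h1 = root.findall("./body/h1")
direct_h3 = root.findall("./body/h3")

# "//" reaches descendants at any depth
all_h3 = root.findall("./body//h3")

print([e.text for e in direct_h1])  # ['Title']
print([e.text for e in direct_h3])  # ['Top-level heading']
print([e.text for e in all_h3])     # ['Nested heading', 'Top-level heading']
```

The nested `<h3>` is missed by the single slash because it is a grandchild of `<body>`, not a child.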

    Besides traversing down, you can also traverse the document sideways, using axes like following-sibling:: and preceding-sibling::.

    What is the “following-sibling” Axis

    In XPath, following-sibling:: is an axis used to select all sibling elements that appear after the current element. By way of comparison, it is like finding the "younger siblings" of the current element in a family tree.

    This is the syntax for using the following-sibling:: axis:

    /parent/current-element/following-sibling::target
    

    Let’s see some practical examples following the HTML snippet below:

    <div>
        <p>Element 1</p>
        <p>Element 2</p>
        <a href="https://www.browserbear.com">Browserbear</a>
        <p class="last_paragraph">Element 3</p>
    </div>
    

    screenshot of the demo page

    To select all elements that come after the first <p> element, use the following-sibling:: axis with a wildcard (*), as shown in the XPath expression below:

    //p[1]/following-sibling::*
    

    the following siblings highlighted in a red box

    If you specifically want to select only the <p> elements, use the following-sibling:: axis with tag name p:

    //p[1]/following-sibling::p
    

    the following <p> elements highlighted in a red box

    The same applies to selecting the <a> element:

    //p[1]/following-sibling::a
    

    ![the following element highlighted in a red box](/images/ghost/2023-09-13-mastering-xpath-using-the-following-sibling-and-preceding-sibling-axes/3.png)

    You can also use a predicate to select an element more precisely. The XPath expression below selects only the <p> elements that follow the first one and have the class name last_paragraph:

    //p[1]/following-sibling::p[@class="last_paragraph"]
    

    Using the XPath expression above, only the third <p> element will be selected.
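To make the selection rule concrete, here is a rough plain-Python emulation of what following-sibling:: picks from the snippet above. It is a sketch, not how an XPath engine works internally: Python's built-in ElementTree has no sibling axes, so the hypothetical helper `following_siblings()` simply scans the parent's children that come after the current element.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<div>
    <p>Element 1</p>
    <p>Element 2</p>
    <a href="https://www.browserbear.com">Browserbear</a>
    <p class="last_paragraph">Element 3</p>
</div>
"""

def following_siblings(parent, current, tag="*"):
    """Emulate current/following-sibling::tag (hypothetical helper)."""
    children = list(parent)
    later = children[children.index(current) + 1:]
    return [el for el in later if tag == "*" or el.tag == tag]

div = ET.fromstring(SNIPPET)
first_p = div.find("p")  # the first <p>, like //p[1] for this snippet

all_after = following_siblings(div, first_p)     # following-sibling::*
p_after = following_siblings(div, first_p, "p")  # following-sibling::p
a_after = following_siblings(div, first_p, "a")  # following-sibling::a
# following-sibling::p[@class="last_paragraph"]
last_p = [el for el in p_after if el.get("class") == "last_paragraph"]

print([el.text for el in all_after])  # ['Element 2', 'Browserbear', 'Element 3']
print([el.text for el in p_after])    # ['Element 2', 'Element 3']
print([el.text for el in a_after])    # ['Browserbear']
print([el.text for el in last_p])     # ['Element 3']
```

Note that only elements sharing the same parent are considered; nothing nested inside the siblings is returned.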

    What is the “preceding-sibling” Axis

    On the other hand, the preceding-sibling:: axis is used to select all sibling elements that appear before the current element. Think of it as finding the "older siblings" of the current element in a family tree.

    This is the syntax for using the preceding-sibling:: axis:

    /parent/current-element/preceding-sibling::target
    

    Let’s use the same HTML snippet for some examples but with the last (third) <p> element as the current element.

    To select all elements that come before the last <p> element, you can locate it by its class name and select the preceding siblings using the wildcard:

    //p[@class="last_paragraph"]/preceding-sibling::*
    

    the preceding elements highlighted in a blue box

    Similarly, to select only the <p> elements, specify the tag name p after the preceding-sibling:: axis:

    //p[@class="last_paragraph"]/preceding-sibling::p
    

    the preceding <p> elements highlighted in a blue box

    The same goes for the <a> element:

    //p[@class="last_paragraph"]/preceding-sibling::a
    

    ![the preceding element highlighted in a blue box](/images/ghost/2023-09-13-mastering-xpath-using-the-following-sibling-and-preceding-sibling-axes/7.png)

    Besides using the predicate with a class name like the previous example, we can also use other attributes or functions. For example, we can use a function like contains() to find elements that contain specific text.

    The XPath expression below finds the preceding siblings that are <p> elements and contain the text "1":

    //p[@class="last_paragraph"]/preceding-sibling::p[contains(text(), '1')]
    

    This will return only the first <p> element with text “Element 1”.
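The same kind of plain-Python emulation works in the other direction. Again, this is only a sketch under the assumption that scanning the parent's earlier children is a fair stand-in for the axis; `preceding_siblings()` is a hypothetical helper, and the `"1" in el.text` check approximates `contains(text(), '1')` for these simple elements.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<div>
    <p>Element 1</p>
    <p>Element 2</p>
    <a href="https://www.browserbear.com">Browserbear</a>
    <p class="last_paragraph">Element 3</p>
</div>
"""

def preceding_siblings(parent, current, tag="*"):
    """Emulate current/preceding-sibling::tag (hypothetical helper)."""
    children = list(parent)
    earlier = children[:children.index(current)]
    return [el for el in earlier if tag == "*" or el.tag == tag]

div = ET.fromstring(SNIPPET)
current = div.find("p[@class='last_paragraph']")  # the last <p>

all_before = preceding_siblings(div, current)     # preceding-sibling::*
p_before = preceding_siblings(div, current, "p")  # preceding-sibling::p
a_before = preceding_siblings(div, current, "a")  # preceding-sibling::a
# preceding-sibling::p[contains(text(), '1')]
with_one = [el for el in p_before if "1" in (el.text or "")]

print([el.text for el in all_before])  # ['Element 1', 'Element 2', 'Browserbear']
print([el.text for el in p_before])    # ['Element 1', 'Element 2']
print([el.text for el in a_before])    # ['Browserbear']
print([el.text for el in with_one])    # ['Element 1']
```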

    "Element 1" highlighted in a blue box

    Example: Using the “following-sibling” and “preceding-sibling” Axes in Selenium (Python)

    XPath is one of the most versatile ways to locate an HTML element on a web page when using Selenium. As with any other XPath expression, start by navigating to the target URL, then find the elements using the find_elements() method.

    Let’s find the steps preceding and following “Step 3. Navigate to the Target Web Page” from this page:

    a screenshot of the web page showing the list of steps

    Get the XPath that refers to "Step 3. Navigate to the Target Web Page" from the browser inspector:

    a screenshot of the web inspector

    Note: To make the element easier to locate, we're using the <h3> element instead of the <li> element in the content menu.

    This is the XPath:

    //*[@id="step-3-navigate-to-the-target-web-page"]
    

    To find the steps before Step 3, use the preceding-sibling:: axis to find preceding <h3> elements that contain the word “Step”:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()

    # Navigate to the target web page
    driver.get("https://www.browserbear.com/blog/how-to-find-elements-by-xpath-in-selenium")

    # Find the <h3> siblings before Step 3 whose text contains "Step"
    steps = driver.find_elements(By.XPATH, "//*[@id='step-3-navigate-to-the-target-web-page']/preceding-sibling::h3[contains(text(), 'Step')]")

    # Print the text of each matching heading
    for step in steps:
        print(step.text)

    # Expected output:
    # Step 1. Install the Selenium Package
    # Step 2. Install the Browser Drivers

    # Close the browser when done
    driver.quit()
    

    To find the steps after Step 3, switch to the following-sibling:: axis:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()

    # Navigate to the target web page
    driver.get("https://www.browserbear.com/blog/how-to-find-elements-by-xpath-in-selenium")

    # Find the <h3> siblings after Step 3 whose text contains "Step"
    steps = driver.find_elements(By.XPATH, "//*[@id='step-3-navigate-to-the-target-web-page']/following-sibling::h3[contains(text(), 'Step')]")

    # Print the text of each matching heading
    for step in steps:
        print(step.text)

    # Expected output:
    # Step 4. Find the Element by XPath
    # Step 5. Interact with the Element
    # Step 6. Close the Web Driver

    # Close the browser when done
    driver.quit()
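Since the two scripts differ only in the axis name, the XPath string can be built by a small helper. The sketch below is one possible way to do that; `sibling_xpath()` and its parameters are hypothetical names introduced for illustration, and the helper assumes the ID and text contain no single quotes (it does no escaping).

```python
def sibling_xpath(anchor_id, axis, tag="*", text=None):
    """Build an XPath that starts at the element with the given id and
    selects its siblings along the given axis (hypothetical helper)."""
    if axis not in ("following-sibling", "preceding-sibling"):
        raise ValueError(f"unsupported axis: {axis}")
    xpath = f"//*[@id='{anchor_id}']/{axis}::{tag}"
    if text is not None:
        # Keep only siblings whose text contains the given substring
        xpath += f"[contains(text(), '{text}')]"
    return xpath

ANCHOR = "step-3-navigate-to-the-target-web-page"

before = sibling_xpath(ANCHOR, "preceding-sibling", "h3", "Step")
after = sibling_xpath(ANCHOR, "following-sibling", "h3", "Step")

print(before)
print(after)
```

The resulting strings match the expressions used in the scripts above, so they can be passed straight to `driver.find_elements(By.XPATH, before)` or `driver.find_elements(By.XPATH, after)`.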
    

    XPath vs. CSS Selectors: Which is Better

    Both XPath and CSS Selectors are commonly used to locate an HTML element on a web page. To decide which one to use, here are some factors that you should consider:

    • Selector availability - CSS Selectors can locate an element directly using its ID, class, name, and other attributes. However, such attributes might not always be available. In this case, you can use XPath to locate an HTML element based on its position in the document hierarchy.
    • Readability - Compared to using an absolute XPath that starts from the root element of the document and navigates down the hierarchy of the elements until the target element is found, CSS selectors are shorter. This will keep your code cleaner and easier to read.
    • Project requirements - Not all HTML elements can be located using CSS selectors. For example, a CSS selector can't match an HTML element based on its text content, but XPath can, e.g., //h1[text()='Welcome'].
    • Directional flow - CSS Selectors mostly support one-directional flow for locating an HTML element, traversing from parent to child (their sibling combinators, + and ~, only reach siblings that come later). On the other hand, XPath supports bidirectional flow and sideways traversal in both directions.

    While CSS Selectors are simpler and easier to use, XPath is more powerful and provides more advanced features with built-in node tests and functions like text(), position(), and contains().

    In real use cases, it's important to note that you're not limited to using just one method. Based on your project requirements, consider utilizing both approaches so that you can maximize their benefits and achieve optimal results!

    About the author: Josephine Loo
    Josephine is an automation enthusiast. She loves automating stuff and helping people to increase productivity with automation.
