WebDriver BiDi – Future of Browser Automation

Sri Harsha

Posted On: January 24, 2025

Test automation has become essential to ensuring product quality, particularly as web technologies continue to advance. With the evolution of browsers, the tools and methods used for automation have also grown more sophisticated.

Among the key protocols developed for browser automation are WebDriver Classic and the Chrome DevTools Protocol (CDP). CDP was initially introduced for Chrome’s DevTools capabilities but was later adapted for automation tasks, offering deep integration and fine-grained control over browser functions. These two protocols now serve as the backbone of modern browser automation strategies.

WebDriver Classic

WebDriver Classic is a W3C standard supported by all major browser vendors. Client bindings send commands over HTTP to a driver server, which then talks to the browser through a browser-specific protocol. WebDriver Classic closely mirrors user interactions, so its results reflect real-world usage. Because it is a W3C specification, it behaves consistently across browsers and platforms. While WebDriver Classic has been an excellent choice for cross-browser testing, it comes with certain caveats.

  1. Unidirectional Protocol: WebDriver Classic uses a “request/response” model. The client sends a command and waits for the driver to report that it is complete before sending the next one, and the browser cannot push events to the test on its own.
  2. Low-level controls: Low-level features depend on other W3C standards extending the protocol. While many have done this, not all have, and these extensions are not always implemented at the same time in every browser.
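The request/response model from the first point can be sketched without a browser. The example below is only an illustration: `fakeDriverServer`, `classicCommands`, and `runSequentially` are hypothetical names, and the "server" is an in-memory stand-in. The endpoint paths are the ones WebDriver Classic defines for navigating, reading the title, and deleting a session.

```javascript
// Hypothetical in-memory stand-in for a driver server: no browser or network is involved.
function fakeDriverServer(request) {
  return new Promise((resolve) => {
    // Reply on a later tick to imitate an HTTP round trip.
    setImmediate(() => resolve({ handled: `${request.method} ${request.path}`, value: null }));
  });
}

// The HTTP requests a client binding would send for three WebDriver Classic commands.
function classicCommands(sessionId, url) {
  return [
    { method: 'POST', path: `/session/${sessionId}/url`, body: { url } }, // Navigate To
    { method: 'GET', path: `/session/${sessionId}/title` }, // Get Title
    { method: 'DELETE', path: `/session/${sessionId}` }, // Delete Session
  ];
}

// Request/response: each command is sent only after the previous response arrives.
async function runSequentially(commands, send) {
  const handled = [];
  for (const command of commands) {
    const response = await send(command);
    handled.push(response.handled);
  }
  return handled;
}

runSequentially(classicCommands('abc123', 'https://example.com'), fakeDriverServer)
  .then((handled) => console.log(handled.join('\n')));
```

Nothing in this loop gives the browser a way to speak first, which is why console messages or network activity have to be fetched by further commands.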

Image Source — https://developer.chrome.com/blog/test-automation-evolution

Chrome DevTools Protocol (CDP)

The Chrome DevTools Protocol (CDP), on the other hand, was introduced by Google for Chrome and Chromium as part of the browser’s integrated DevTools. It is optimized for working in the same process as the browser. Because it was designed for DevTools, it supports network monitoring to identify performance bottlenecks and performance profiling to analyze load times and memory usage.

Later, CDP was adopted by tools like Puppeteer and Playwright for browser automation, network interception, and performance testing.

Unlike WebDriver Classic, CDP uses WebSockets for communication, and the protocol is very “chatty” on the wire. In some circumstances this is more efficient than WebDriver Classic’s request/response model, but because it is more verbose it is not always faster. The main advantage of WebSockets is that the browser can send events to the test. This lets you capture console messages and intercept network requests more easily.
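To make the difference between replies and events concrete, here is a minimal sketch of how a client might route CDP messages arriving on the WebSocket. `createCdpRouter` is a hypothetical helper, not part of any library, and no real connection is opened. The message shapes follow CDP's convention: a reply carries the `id` of the command it answers, while an event carries only `method` and `params`.

```javascript
// Hypothetical router for CDP messages; it only handles JSON strings, no socket is opened.
function createCdpRouter() {
  const pending = new Map(); // command id -> callback waiting for the reply
  const listeners = new Map(); // event name -> listener
  let nextId = 1;

  return {
    // Serialize a command and remember who is waiting for its reply.
    command(method, params, onResult) {
      const id = nextId++;
      pending.set(id, onResult);
      return JSON.stringify({ id, method, params });
    },
    on(eventName, listener) {
      listeners.set(eventName, listener);
    },
    // Route one incoming message to the matching callback or listener.
    receive(raw) {
      const message = JSON.parse(raw);
      if ('id' in message) {
        const onResult = pending.get(message.id);
        pending.delete(message.id);
        if (onResult) onResult(message.result);
        return 'response';
      }
      const listener = listeners.get(message.method);
      if (listener) listener(message.params);
      return 'event';
    },
  };
}

const router = createCdpRouter();
console.log(router.command('Runtime.enable', {}, () => console.log('Runtime domain enabled')));
router.on('Runtime.consoleAPICalled', (params) => console.log('console:', params.args[0].value));

// Simulated traffic from the browser: first the reply, then an unsolicited event.
router.receive(JSON.stringify({ id: 1, result: {} }));
router.receive(JSON.stringify({
  method: 'Runtime.consoleAPICalled',
  params: { type: 'log', args: [{ type: 'string', value: 'hello' }] },
}));
```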

Image Source — https://developer.chrome.com/blog/test-automation-evolution

So, what potential challenges come with using CDP for browser automation?

Browser-Specific: CDP is limited to Chromium-based browsers like Chrome and Edge, restricting cross-browser compatibility for testing non-Chromium browsers.

Fragmented API: Different browsers expose different protocols, resulting in fragmented automation experiences across browsers.

Version Specific: Because it was designed as a DevTools protocol, to be used in-process by a specific version of a specific browser, CDP makes no forwards or backwards compatibility guarantees. That is, testers need to match up the browser version and the CDP version carefully.

Lack of Standardization: CDP is not a W3C standard, making it highly dependent on the browser version. This can lead to inconsistencies, as protocol features may vary between browser releases.

WebDriver BiDi

WebDriver BiDi (Bidirectional) is a new approach to browser automation that addresses the challenges of both WebDriver Classic and CDP. Unlike WebDriver Classic, BiDi allows two-way communication between your automation scripts and the browser, which means you get immediate feedback on events like network requests, console logs, and JavaScript errors.

The big advantage of WebDriver BiDi is that it is designed to work across all major browsers, making it truly cross-browser. Because it is developed as a W3C specification, it is also more stable and future-proof than CDP, which has always been tied to specific versions of Chromium-based browsers like Chrome and Edge. With BiDi, you get the best of both worlds: the cross-browser features of WebDriver Classic and the low-level control of CDP.
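On the wire, BiDi messages are JSON objects exchanged over a WebSocket. The sketch below shows a `session.subscribe` command and classifies incoming messages by their `type` field, which is how BiDi distinguishes replies from events. `subscribeCommand` and `describeMessage` are hypothetical helper names, and the sample `log.entryAdded` payload is trimmed to a few fields for illustration.

```javascript
// Build the BiDi command that subscribes the session to the given events.
function subscribeCommand(id, events) {
  return { id, method: 'session.subscribe', params: { events } };
}

// Classify an incoming BiDi message by its `type` field.
function describeMessage(message) {
  switch (message.type) {
    case 'success':
      return `command ${message.id} succeeded`;
    case 'error':
      return `command ${message.id} failed: ${message.error}`;
    case 'event':
      return `event ${message.method}`;
    default:
      return 'unknown message';
  }
}

// Simulated traffic: the reply to the subscription, then a pushed console log entry.
const incoming = [
  { type: 'success', id: 1, result: {} },
  { type: 'event', method: 'log.entryAdded', params: { type: 'console', level: 'info', text: 'Hello' } },
];

console.log(JSON.stringify(subscribeCommand(1, ['log.entryAdded'])));
for (const message of incoming) {
  console.log(describeMessage(message));
}
```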

The Shift to Event-Driven Architecture

An event-driven architecture lets tools communicate with browsers in both directions, sending commands and receiving real-time events. This benefits browser automation in the following ways:

Immediate Feedback: Automation scripts can listen for browser events (like network requests, DOM updates, or user interactions) as they happen, allowing for more responsive and dynamic test cases.

Asynchronous Event Handling: No more polling for conditions—scripts can react to events as they happen, improving efficiency and reducing overhead.

Enhanced Debugging and Monitoring: An event-driven architecture can stream console logs, network activity, and JavaScript exceptions, making it easier for developers to debug issues.
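The contrast between polling and reacting to events can be shown with Node's built-in `EventEmitter`. This is a sketch only: `browser` is a hypothetical stand-in for a connection that pushes events, and `pollUntil` and `waitForEvent` are illustrative helper names rather than any tool's API.

```javascript
const { EventEmitter, once } = require('node:events');

// Hypothetical stand-in for a browser connection that pushes events to the test.
const browser = new EventEmitter();

// Polling style: ask again and again until the condition holds.
async function pollUntil(check, intervalMs = 5) {
  let attempts = 0;
  while (!check()) {
    attempts += 1;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return attempts;
}

// Event style: resolve when the event is pushed, with no repeated checks.
async function waitForEvent(emitter, name) {
  const [payload] = await once(emitter, name);
  return payload;
}

async function demo() {
  let ready = false;
  // Simulate the page becoming ready and logging a message after 20 ms.
  setTimeout(() => {
    ready = true;
    browser.emit('log.entryAdded', { level: 'info', text: 'page ready' });
  }, 20);

  const [attempts, entry] = await Promise.all([
    pollUntil(() => ready),
    waitForEvent(browser, 'log.entryAdded'),
  ]);
  console.log(`polling needed ${attempts} checks; the event arrived once with "${entry.text}"`);
}

demo();
```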

Embrace WebDriver BiDi with Selenium and WebdriverIO

Industry-leading automation tools like Selenium and WebdriverIO have quickly adopted the bidirectional protocol.

Selenium, one of the most widely used browser automation tools, includes BiDi support starting from version 4.1.2. This integration gives developers and testers real-time browser feedback and better insight into network activity, console and JavaScript logs, and performance metrics.

WebdriverIO, a powerful and extensible Node.js-based framework, now introduces WebDriver BiDi support in its latest release – WebdriverIO v9. With BiDi integration, WebdriverIO can automate real-time events, intercept network requests, and capture console logs, all while maintaining cross-browser compatibility.

Using WebDriver BiDi with Selenium

Now let’s move on to some practical code examples that show how you can use WebDriver BiDi with both Selenium and WebdriverIO. These examples demonstrate how you can monitor network requests and capture console logs in real time.

Listening to console events in Selenium:
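The original code sample is not reproduced here, so the following is a minimal sketch rather than the author's example. It assumes the `selenium-webdriver` npm package and a local Firefox, and uses Selenium's `LogInspector` BiDi helper. `formatConsoleEntry`, the `https://example.com` URL, and the `RUN_BIDI_DEMO` environment switch are hypothetical names added for illustration.

```javascript
// Pure helper: turn a console log entry into one printable line.
function formatConsoleEntry(entry) {
  return `[${entry.level}] ${entry.text}`;
}

async function main() {
  // Loaded lazily so the helper above works even without Selenium installed.
  const { Builder } = require('selenium-webdriver');
  const firefox = require('selenium-webdriver/firefox');
  const LogInspector = require('selenium-webdriver/bidi/logInspector');

  // BiDi has to be requested when the session is created.
  const driver = await new Builder()
    .forBrowser('firefox')
    .setFirefoxOptions(new firefox.Options().enableBidi())
    .build();

  try {
    // Register the listener before triggering any console output.
    const inspector = await LogInspector(driver);
    await inspector.onConsoleEntry((entry) => console.log(formatConsoleEntry(entry)));

    await driver.get('https://example.com'); // placeholder page (assumption)
    await driver.executeScript("console.log('Hello from the page')");
    await driver.sleep(500); // give the event a moment to arrive
    await inspector.close();
  } finally {
    await driver.quit();
  }
}

// Hypothetical switch so the browser is only launched on request.
if (process.env.RUN_BIDI_DEMO === '1') {
  main().catch((error) => {
    console.error(error);
    process.exitCode = 1;
  });
}
```

Running it with `RUN_BIDI_DEMO=1 node console-listener.js` (the file name is arbitrary) should print each console entry as the browser reports it.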

Link to full code example

Execute a script in a specific browsing context:

Link to full code example

Link to more Selenium code samples: https://github.com/harsha509/LT-Selenium-BiDi

Using WebDriver BiDi with WebdriverIO:

Listening to console events:
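As with the Selenium section, the original sample is not reproduced here. The sketch below assumes the `webdriverio` v9 package and a local Chrome, and uses the BiDi `sessionSubscribe` command together with `browser.on`. `summarizeLogEvent`, the `https://example.com` URL, and the `RUN_BIDI_DEMO` switch are hypothetical.

```javascript
// Pure helper: summarize the payload of a log.entryAdded event.
function summarizeLogEvent(event) {
  return `${event.type}/${event.level}: ${event.text}`;
}

async function main() {
  // Dynamic import works whether this file is treated as CommonJS or ESM.
  const { remote } = await import('webdriverio');
  const browser = await remote({ capabilities: { browserName: 'chrome' } });

  try {
    // Ask the browser to push log entries, then listen for them.
    await browser.sessionSubscribe({ events: ['log.entryAdded'] });
    browser.on('log.entryAdded', (event) => console.log(summarizeLogEvent(event)));

    await browser.url('https://example.com'); // placeholder page (assumption)
    await browser.execute(() => console.log('Hello from the page'));
    await browser.pause(500); // give the event a moment to arrive
  } finally {
    await browser.deleteSession();
  }
}

// Hypothetical switch so the browser is only launched on request.
if (process.env.RUN_BIDI_DEMO === '1') {
  main().catch((error) => {
    console.error(error);
    process.exitCode = 1;
  });
}
```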

Link to wdio code samples: https://github.com/harsha509/wdio-bidi-tests

Behind the scenes, WebdriverIO v9 automatically establishes a BiDi connection at the start of a session. This connection is leveraged for several key actions, including:

  • Element retrieval
  • Browser emulation
  • Viewport Configuration

Conclusion

WebDriver BiDi is shaping the future of browser automation by addressing limitations found in both WebDriver Classic and CDP. Its cross-browser compatibility, coupled with real-time, bidirectional communication, provides a more robust and efficient way to automate modern web applications. With leading tools like Selenium and WebdriverIO already integrating BiDi, teams can now capture network activities, console logs, and other real-time events seamlessly across different browsers. As the industry continues to evolve, adopting WebDriver BiDi is a logical step toward achieving faster, more responsive, and comprehensive browser automation.

Author’s Profile

Sri Harsha

Sri Harsha is a Test Automation Enthusiast with nearly a decade of experience in automation and development. His expertise lies in testing applications using JavaScript and contributing to open-source platforms. His dedication has earned him roles as a committer for projects like Selenium and WebdriverIO and a member of the Technical Leadership Committee at SeleniumHQ. He is currently working as an Engineering Manager (OSPO) at LambdaTest.
