Curing the Flaky Test Headache with AI-powered Testing Tools
Ken Hardin
Posted On: March 11, 2024
Nothing can ruin a developer’s day as quickly as learning their latest code push has resulted in a flaky test that they now have to troubleshoot. The test ran successfully on its first pass, but then reported a critical issue on a second iteration, or when it ran on another system or an integrated testing environment.
Nothing’s changed with the code, so there’s no good reason for the failure. But there’s a big red X on the testing dashboard, and developers and QA engineers are faced with a costly, frustrating bug hunt.
Flaky tests undermine every aspect of software development and testing, contributing to everything from poor-quality releases to high staff turnover. And they are a headache that virtually every software team faces weekly, if not daily.
In this post, we’ll look at the common types and causes of flaky tests, as well as ways AI testing tools can finally help resolve this pernicious and costly issue.
Everybody Hates Flaky Tests
Flaky tests – also often referred to as unreliable or non-deterministic tests – are so commonplace that smart enterprise development teams have put protocols in place for how to handle them. (More on this in a bit.) Even so, many developers and testers treat flaky tests as an unavoidable cost of doing business.
In fact, flaky tests were one of the major concerns cited in a recent survey our team at LambdaTest conducted of more than 1,600 QA professionals from around the globe. All told, respondents said they spend about 8 percent of their time on flaky tests. That’s almost as large a time commitment as the one they make to setting up and maintaining test environments (10.4 percent).
Other research has found that large corporations, including Microsoft and Google, see flakiness in their testing as often as 41 percent of the time. These tests were extremely flaky – about half of the jobs that failed ran successfully on a second attempt after being restarted manually. Our survey showed that more than 24 percent of large organizations get non-deterministic results on more than 5 percent of their tests.
And that doesn’t take into account the toll that delays and frustration can have on your team dynamics and overall efficiency.
The Types of Flaky Tests
To develop a plan for tackling flaky tests, you first need to understand the general categories in which they occur.
Random: Just like it sounds. A test fails; you manually restart it, with no changes to code or environment; and it completes successfully. This is probably the most common kind of test flakiness, and it can drive you crazy as you hunt down possible causes.
Environmental: A test works on your machine but fails on another system. Or, perhaps more troubling, code that passes locally fails in a continuous integration (CI) environment.
Branch: The test succeeds within the feature branch, but then fails when merged into main. These are slightly less maddening than purely random fails – at least you have a starting point to hunt for inconsistencies and conflicts.
Flaky Tests Hurt Every Aspect of Software Development
We can’t stress this enough – everybody hates flaky tests. This article at InfoWorld reflects the industry-wide push to find a solution to what has become just part of the daily grind for most developers.
Why are flaky tests so awful?
They drain resources: Writing, executing, and re-running tests consume a huge chunk of developers’ time – in some cases, testing takes more of their time than actually writing code. In the benchmark survey we cited earlier, 77 percent of developers said flaky tests are a time-consuming part of their work cycle that draws them away from feature development and ideation.
They delay releases: Resolving unpredictable test results can push back milestones and completely derail a project timeline. That’s good for no one.
They can really stress out developers: It’s hard to retain talent as it is. The turnover rate for software developers hovers around 60% annually. It’s incredibly demoralizing to have to go back and bug hunt after a “successful” rollout. In fact, flaky tests have been cited as the number-one issue making mobile developers nuts.
They can undermine the perceived value of the test suite and team: Like everybody else in IT, testing units have to justify the expense of their tools and talent. Compiling a lot of false “fails” in your testing program is a sure way to damage your reputation. Shopify famously lifted the pass rate on its extremely complex development pipelines to well over 90% just by re-running initially failed tests that had flaked out.
They can lead to buggy releases: If developers and testers don’t trust test results, they may get in the habit of simply pushing out code that has legitimate issues.
Why Do They Keep Happening?
So, why do carefully written tests flake out? These are the most common issues that cause non-deterministic tests.
Process timing / timeouts: Testers often write in sleep statements to allow time for the application to complete a request. If the application takes longer than expected, the test fails. This is particularly tricky if the app is calling out to an external data store or resource that may not always perform as the tester expected. And if the test environment is under heavy load, responses can slow beyond the anticipated window. (The first sketch after this list shows a more resilient waiting pattern.)
Concurrency: Tests often expect an exact sequence of events, even though the code is written to allow multiple execution orders, particularly when different threads are handling the actions. If the test can accept only one sequence, it will fail intermittently.
Test order dependency: If a test changes data stores, memory, or other aspects of the environment, running subsequent tests out of sequence will ensure failure. Your tests must be able to run independently in any order, as well as clean up after themselves to ensure stability for the next test. (The second sketch after this list shows one way to enforce this.)
Exotic code/environment conditions: Sometimes coders or environment admins create conditions (intentionally or unintentionally) that are hard to anticipate in test design, but can cause a test to fail under anomalous conditions. Everyone’s heard of the 500-mile email problem at this point – a server upgrade implemented an extremely tight timeout window that was causing emails to fail. This is rare, but it can happen.
Bad tests: Human error is always a real risk. Solid tests should state assumptions that cover all operational bases, along with measures to enforce those assumptions. Edge cases do happen, and you need to test for them.
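To make the timing problem concrete, here’s a minimal sketch in Python of the usual fix: replace a fixed sleep with a bounded polling wait. The `wait_until` helper and the `order` object in the commented usage are hypothetical names of our own, not part of any specific framework.

```python
import time

def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns truthy or `timeout` elapses.

    Unlike a fixed sleep, this only fails when the app is genuinely
    unresponsive, not merely slower than the tester guessed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout}s")

# Brittle: assumes the request always completes within 2 seconds.
#   time.sleep(2)
#   assert order.status == "confirmed"
#
# More robust: waits only as long as needed, up to an explicit ceiling.
#   wait_until(lambda: order.status == "confirmed", timeout=10)
```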
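And for the test-order problem, a second sketch – this one assuming pytest and an in-memory SQLite database – shows the standard defense: give every test fresh state and tear it down afterward, so the suite behaves the same in any order.

```python
import sqlite3
import pytest

@pytest.fixture
def db():
    """Give each test its own throwaway database.

    Because no test inherits state from the one before it, the suite
    passes or fails the same way in any execution order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    yield conn
    conn.close()  # teardown runs even if the test fails

def test_insert_user(db):
    db.execute("INSERT INTO users (name) VALUES ('ada')")
    assert db.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1

def test_table_starts_empty(db):
    # Passes whether or not test_insert_user ran first.
    assert db.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
```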
As seen above, flaky tests can be caused by various factors, including process timing, concurrency, test order dependency, and exotic code or environment conditions. These issues lead to non-deterministic behavior, making tests unreliable and frustrating for developers to deal with.
To understand engineers’ perspectives on flaky tests, watch the complete video tutorial and see how these perspectives can help you handle and resolve flaky test issues effectively.
How to Tackle Flaky Tests
As we mentioned earlier, flaky tests are such a big problem for software developers and testers that most enterprise teams have developed some protocols for handling them.
Here, Mythili Raju and Harshit Paul have laid out a framework for flaky test mitigation in our LambdaTest Learning Hub, which we strongly recommend you read.
In summary, we suggest that your team:
- Implement consistent test retry mechanisms to ensure that a “fail” is a fail (see the sketch after this list)
- Regularly maintain and update your tests
- Encourage open communication between developers and testers
- Set and monitor test performance metrics and KPIs
- Create a weighting scale to prioritize the resolution of flaky tests based on business value
- Constantly gather and analyze test performance data
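To illustrate the retry idea, here’s a minimal, hypothetical sketch in Python. The `retry_on_failure` decorator and `flake_log` list are illustrative names of our own; in a real pytest suite you would more likely reach for an off-the-shelf plugin such as pytest-rerunfailures.

```python
import functools

flake_log = []  # doubles as a raw feed for your test-metrics dashboard

def retry_on_failure(max_attempts=3):
    """Re-run a failing test before declaring a genuine fail.

    A test that fails and then passes is recorded as flaky instead of
    red, so the dashboard reflects only real regressions.
    """
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    result = test_fn(*args, **kwargs)
                    if attempt > 1:  # passed only on a rerun: flaky
                        flake_log.append((test_fn.__name__, attempt))
                    return result
                except AssertionError:
                    if attempt == max_attempts:
                        raise  # failed every attempt: a real fail
        return wrapper
    return decorator

@retry_on_failure(max_attempts=3)
def test_checkout_flow():
    ...  # your assertions here
```

Entries accumulating in `flake_log` also give you raw data for the performance metrics and prioritization weighting mentioned above.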
It’s the last initiative, analyzing testing data, where the emerging category of AI and machine learning testing tools can have an enormous impact, not only in resolving flaky tests but in preventing them from happening in the first place.
AI-Based Tools and Flaky Test Detection
AI tools are great at identifying flaky tests, allowing developers and project managers to prioritize legitimate code errors that can be remediated and advanced toward release while moving flaky test results along a different path.
AI and machine learning can also analyze test results to find the underlying issues, such as environmental factors, that can contribute to ongoing flakiness.
The major strengths of AI are:
Root Cause Analysis: In addition to flagging flaky tests, AI-based tools can parse your testing logs to recognize patterns in non-deterministic results, most often tied to environmental factors. With these insights, you can resolve the timing and resource issues that are derailing your testing efforts. (Of course, AI pattern recognition can also find recurring code errors, but that’s not really “flaky” – it’s just another way AI tools can improve your software pipeline.) The sketch after this list shows the core idea on a toy scale.
Adaptive Test Maintenance: AI tools can sort out outdated or unnecessary test cases, which are major contributors to overall “flakiness” in your testing suite.
Predictive Analytics: AI tools can help development teams avoid flaky environmental factors, based on historical data and fixed deltas you provide the system. No flaky errors, no time wasted.
Continuous Improvement: AI-based tools can support your team as they refine testing strategies for both the near and long term. Software testing should be both comprehensive and efficient, and Big Data analysis of your testing logs can keep you on that path.
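As a loose illustration of the root-cause idea, here’s a toy sketch in plain Python that mines a hypothetical CI run history for tests with mixed outcomes and points at the environment where they flake. Production AI tools ingest far richer logs and use learned models rather than simple counting; the test and runner names below are invented.

```python
from collections import defaultdict

# Hypothetical CI history: one (test, runner, outcome) tuple per run.
runs = [
    ("test_login",  "runner-a", "pass"),
    ("test_login",  "runner-b", "fail"),
    ("test_login",  "runner-b", "pass"),
    ("test_search", "runner-a", "pass"),
    ("test_search", "runner-b", "pass"),
]

counts = defaultdict(lambda: [0, 0])  # (test, runner) -> [passes, fails]
for test, runner, outcome in runs:
    counts[(test, runner)][0 if outcome == "pass" else 1] += 1

for (test, runner), (passes, fails) in sorted(counts.items()):
    if passes and fails:
        # Same test, same code, mixed outcomes: non-deterministic, and
        # concentration on one runner hints at an environmental cause.
        print(f"{test} is flaky on {runner} "
              f"({fails} of {passes + fails} runs failed)")
```

Here the mixed results on runner-b, against clean passes on runner-a, would steer your investigation toward that machine’s configuration or load.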
AI Helps Tackle Flaky Tests at the Root
Flaky tests are a serious drain on your development and testing teams’ resources and morale. A solid remediation plan, including AI-powered testing suite tools to diagnose and resolve underlying issues that cause flaky tests, is essential in keeping your projects on schedule and up to the quality standards you demand.
To conquer your flaky tests, tools like LambdaTest’s Test Intelligence help your teams to take data-driven actions in identifying, resolving, and preventing flaky tests. By leveraging machine learning and intelligent analysis, LambdaTest aims to enhance the reliability and effectiveness of automated testing, resulting in more robust software delivery.
Got questions? Drop them on the LambdaTest Community.