Do You Trust Your Front-End Tests as Much as We Do?
Consider the following scenario: a critical bug is detected in your web application overnight and a fix must be deployed quickly. How fast can it reach production during off hours?
How much do you trust your tests’ coverage? How long would it take to write a new test that verifies the fix?
The Engineering Team at Forter requires incredible agility – while we don’t have a QA team, we want to deploy any change to production immediately without sacrificing quality. To support this mindset, we began searching for an end-to-end web UI testing framework that we could trust.
During our research, we set up a few key requirements as “Must Haves”:
Technology Agnostic – Our web application stack is composed of different technologies within different tiers: we have Node.js and GraphQL running on the backend (Dockerized) while the client uses Angular and RelayJS. We want to keep our technology options open without binding our stack to any specific testing framework. For example, should we decide to use VueJS or replace GraphQL, we wouldn’t want to have to start from scratch.
Minimal Learning Curve – Any member of our Product Engineering Team should be able to quickly set up and write their first test, even without prior automation experience. We expect that adding a new test should take no more than 20 minutes.
Low-Touch Solution – We prefer a platform that supports “record-replay” testing, so we won’t have complex scripts or configuration files. For us, it’s not about codeless tests, it’s about the readability of the test and operational maintainability.
Low Maintenance (False Negative Rate) – A common failure point for many Web UI tests is false negatives caused by minor UI changes. When it comes down to it, we simply don’t want to waste our time adjusting tests for every CSS or HTML modification. That’s terribly inefficient. Fragile tests must either be fixed or re-written. We cannot afford broken windows, and we cannot afford the high costs that come with test maintenance. Ideally, the testing solution should reduce test fragility to a minimum.
Support Parallelism and High Performance – As our products evolve, more tests are needed. We still want to be able to deploy as fast as possible as we don’t have a “nightly build” or anything similar. That means if a commit passed the code review and the tests – it is ready for deployment. We expect a scalable solution that allows for running more tests without dramatically increasing build time.
Support Our CI/CD Stack – Our CI/CD is Jenkins-based, running Docker containers. We want to keep our existing stack and we were looking for a solution that could support the pre-existing framework. Also, we didn’t want to expose our testing environment to the outside, therefore we sought a platform that could run in-house.
My Initial Thoughts…
I have 8 years of experience with various testing frameworks – Selenium, Protractor, QTP, truClient, and others. I was skeptical at first, because many of these frameworks wouldn’t meet our requirements – we really had set the bar high.
Two years ago, we noticed that we kept bumping into Oren Rubin at many meetups. Oren is a well-known figure within Tel Aviv’s Front-End community, and he had just founded Testim.io. Back then, it was a small startup with a mission to automate and reduce Web UI test friction. We started a small POC with Testim, running 5 tests that would cover different complex flows.
Based on our two years of experience, this is what we have found:
Technology Agnostic – We upgraded Angular, Node.js, and other libraries with zero tests affected – as expected.
Minimal Learning Curve – Testim comes with “Record-Replay” on their in-browser studio, making it really easy and intuitive to build, edit, and run a test. The solution simply creates a test based on the actions performed by the user.
(“Record-Replay” example of the testim.io editor from their website)
The editor provides you with different types of actions to enhance the test, including: validations, waiting for an element, value extractions, sleep, and more.
Low Maintenance (False Negative Rate) – Of all the metrics, this one is my favorite. Testim has almost zero false negatives and a high tolerance for UI changes; in my experience, most of the frameworks I had used previously failed far more often. Traditional solutions save different DOM locators to detect HTML components, for example finding the “Login Button” by its CSS class or its XPath in the DOM. However, if the class name changes or another DIV is added to the login form, both selectors tend to fail.
Testim uses interesting strategies, looking for patterns in the DOM and analyzing multiple locators with the assumption that any single locator can change but the likelihood of multiple locators changing is much lower. Under the hood, Testim analyzes each test run and keeps track of locators that have changed. They maintain a per-element scoring mechanism which scores more stable locators higher, and ones that frequently change, lower. The result is a dynamic locator strategy that minimizes false negatives, which reduces test maintenance costs.
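The multi-locator idea can be sketched in plain JavaScript. To be clear, nothing below is Testim’s actual code – the locator names, weights, and the `scoreCandidate` helper are all hypothetical – but it illustrates why combining several weighted locators survives any single attribute changing:

```javascript
// Hypothetical sketch of a weighted multi-locator strategy.
// Each locator checks one property of an element; the weights stand in
// for per-locator stability scores learned from past test runs.
const locators = [
  { name: "id",    weight: 0.4, matches: (el, ref) => el.id === ref.id },
  { name: "text",  weight: 0.3, matches: (el, ref) => el.text === ref.text },
  { name: "class", weight: 0.2, matches: (el, ref) => el.className === ref.className },
  { name: "tag",   weight: 0.1, matches: (el, ref) => el.tag === ref.tag },
];

// Score a candidate element against the reference recorded at test-creation time:
// sum the weights of every locator that still matches.
function scoreCandidate(candidate, reference) {
  return locators.reduce(
    (score, loc) => score + (loc.matches(candidate, reference) ? loc.weight : 0),
    0
  );
}

// Pick the best-scoring candidate, requiring a minimum overall confidence.
function findElement(candidates, reference, threshold = 0.5) {
  const best = candidates
    .map((el) => ({ el, score: scoreCandidate(el, reference) }))
    .sort((a, b) => b.score - a.score)[0];
  return best && best.score >= threshold ? best.el : null;
}

// The recorded "Login Button": even after its CSS class is renamed,
// the id, text, and tag locators still match, so the lookup succeeds.
const reference = { id: "login", text: "Log in", className: "btn-primary", tag: "button" };
const candidates = [
  { id: "signup", text: "Sign up", className: "btn", tag: "button" },
  { id: "login", text: "Log in", className: "btn-blue", tag: "button" }, // class changed
];
console.log(findElement(candidates, reference).id); // → "login"
```

A single-locator strategy (CSS class only) would have failed here; with weighted voting, the renamed class merely lowers the score instead of breaking the test.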
Support Parallelism and High Performance – At first, we had some difficulties running the self-hosted version of Testim in parallel mode. It took a few days of hard work, together with Testim’s customer support, before we were able to run our tests in parallel. This version required better hardware to run multiple Chrome instances in parallel on the same host. With 4 CPUs and 16 GB of memory we could run three tests in parallel, while with 8 CPUs and 32 GB of memory we reached 12 parallel tests.
Support Our CI/CD Stack – A Jenkins agent starts Testim’s Docker image and uses the JunitXMLReporter to produce standard Jenkins test reports.
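For readers unfamiliar with the JUnit report format Jenkins consumes, here is an illustrative sketch. This is not Testim’s reporter – the `toJUnitXml` function and the sample results are invented for the example – but the XML shape matches the common JUnit schema Jenkins parses:

```javascript
// Hypothetical sketch: serialize test results into JUnit-style XML,
// the format Jenkins' test-report step understands.
function toJUnitXml(suiteName, results) {
  // Minimal XML escaping for attribute values.
  const escape = (s) =>
    String(s).replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");
  const failures = results.filter((r) => r.error).length;
  const cases = results
    .map((r) => {
      const open = `  <testcase name="${escape(r.name)}" time="${r.seconds}"`;
      // Failed cases carry a nested <failure> element; passing cases self-close.
      return r.error
        ? `${open}>\n    <failure message="${escape(r.error)}"/>\n  </testcase>`
        : `${open}/>`;
    })
    .join("\n");
  return [
    `<?xml version="1.0" encoding="UTF-8"?>`,
    `<testsuite name="${escape(suiteName)}" tests="${results.length}" failures="${failures}">`,
    cases,
    `</testsuite>`,
  ].join("\n");
}

// Made-up results for illustration: one pass, one failure.
const xml = toJUnitXml("ui-tests", [
  { name: "login flow", seconds: 12.3 },
  { name: "checkout flow", seconds: 40.1, error: "timeout waiting for #pay" },
]);
console.log(xml);
```

Because the output is plain JUnit XML, Jenkins can aggregate it alongside reports from any other test runner in the pipeline.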
During those two years our team grew from three to six engineers, without recruiting a dedicated QA team or automation specialist. Many new features and modules were developed to support our business requirements.
Currently, we are running more than 30 complex UI tests for every commit that is merged to the master branch, providing us with higher coverage and much better quality. Our next goal is to increase coverage and reach 60 tests. We feel that Testim’s solution is robust enough to support Forter’s growth.
Our Testim test suite takes 7 minutes on average to run, which allows swift hotfix deployments, reducing the time our customers may be affected by a bug.
While Forter has tripled its customer base in the last twelve months, the number of dashboard bugs reported by users has dropped dramatically.
3 and a Half Tips for a Successful Testim.io Evaluation
1. Stateless Application – Keep user session state minimalistic. For example, if you keep the last open tab in the session state, running multiple tests in parallel might require a different user per test.
2. Group and Parameterize Login Steps – Testim allows the reuse of previously recorded steps. Every test starts with the login step. Having parameterized user credentials, one for each role, allows building tests with different types of users in minutes.
3. Tag Your Tests – Using Testim’s tagging features, we run all tests marked ‘Production’, so we can easily toggle specific tests on or off by adding or removing the ‘Production’ tag in the Testim dashboard. While Testim tests have a low false negative rate, when one does occur you want to toggle that test off to unblock a critical deployment. We also tag our tests with feature names, which enables quick searches through the dashboard.
3.5. Test Only Stable Flows – Don’t waste time testing flows that are still WIP or likely to change soon. Sync with your Product Manager and your team members to verify that no dramatic changes are planned for the near future.