Many experts would bet that every tester and developer has felt excited about their first encounter with artificial intelligence. But romantic expectations can run into an impenetrable wall of reality: one day your team will have to test a self-learning algorithm.
In this article, you will find the experiences and challenges that a QA Lead can face, along with a guide on how to set up the testing process so that it runs as efficiently and comfortably as possible. The material will be helpful for everyone interested in introducing artificial intelligence into real-world software.
Let’s take a mobile dating app as an example. This is a hypothetical case, but it fully reflects the difficulties encountered in the project. At the heart of the imaginary app is a self-learning algorithm that compares user data and, based on their unique characteristics, gives a verdict: whether two people are right for each other, with the match estimated as a percentage.
The team expected to get access to the artificial intelligence algorithm right away and to work out from there how it functioned. However, the Product Owner had purchased the algorithm from a third party; the choice of AI solution was entirely the client’s, and the collaboration was covered by an NDA from the AI provider. Since the customer had no right to share the code or the principles of the algorithm, the AI turned out to be a black box.
For a self-learning algorithm based on artificial intelligence to work, it must first be… that’s right, taught! This had to be done manually, with many cases and input data sets. Adequate results were expected from a program that can think, but in practice the tool did not check the adequacy of the data it was fed. Something had to be done about this, too.
So there was an application that could think, but it was unclear how it made its decisions, and it still had to be understood.
User questionnaires were submitted to the system. The system analyzed this data and, guided by its AI-based algorithms (the same black box), returned a match result: either a refusal (no match detected) or a match, expressed as a percentage. Then the user stepped in: they looked at the outcome the program offered and either approved or rejected it. If it was rejected, the system had to register the user’s disagreement with its decision, remember it, and learn not to give such false answers again. But exactly how the system reached its verdict was unknown.
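To make that loop easier to picture, here is a minimal sketch of it. Everything in it is hypothetical: the MatchResult structure, the record_feedback helper, and the match_score stand-in for the third-party black box were not part of the real project and only illustrate the interaction described above.

```python
from dataclasses import dataclass

# Stand-in for the third-party black-box algorithm: we only see the inputs
# (two questionnaires) and the output (a match estimate), never the logic.
def match_score(profile_a: dict, profile_b: dict) -> float:
    """Returns a match estimate between 0.0 and 1.0 (black box)."""
    raise NotImplementedError("provided by the third-party AI vendor")

@dataclass
class MatchResult:
    profile_a: dict
    profile_b: dict
    score: float                   # percentage returned by the algorithm
    accepted: bool | None = None   # user's reaction, unknown until they respond

feedback_log: list[MatchResult] = []

def propose_match(profile_a: dict, profile_b: dict, threshold: float = 0.5):
    """Either a refusal (None) or a match with its score."""
    score = match_score(profile_a, profile_b)
    if score < threshold:
        return None                # refusal: no match detected
    return MatchResult(profile_a, profile_b, score)

def record_feedback(result: MatchResult, accepted: bool) -> None:
    # The user's rejection is exactly what the system is supposed to learn from.
    result.accepted = accepted
    feedback_log.append(result)    # later fed back into training
```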
Test Planning
To make testing an AI-based system easier, pay attention to the following aspects already at the planning stage:
- Ask the customer for as much test data as possible. The more valid input data you have, the easier it is to set up an adequate data set and start training the system.
- Put together a well-coordinated team. This may sound trivial, but this aspect can play a crucial role. AI-based systems are self-learning, so teaching your AI to do bad things when the team makes poor decisions is easy.
- Ask the Product Owner for an acceptable error tolerance, expressed as a percentage. This will save you a lot of time and nerves; a sketch of how such a tolerance can be checked follows this list.
- Learn to think outside the box. Don’t try to anticipate the actions of artificial intelligence. Try to understand its non-standard logic, and then you can easily interact with it.
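As a hedged illustration of the tolerance point above, the snippet below shows one way an agreed error tolerance could be turned into an acceptance check. The 10% figure and the expected_matches pairs are assumptions for the example, not values from the real project, and match_score is the black-box stand-in from the first sketch.

```python
# Hypothetical acceptance check: the black-box score may deviate from the
# expected value, but only within the tolerance agreed with the Product Owner.
TOLERANCE = 0.10  # assumed 10%; use the figure your Product Owner actually gives you

expected_matches = [
    # (profile_a, profile_b, expected score) prepared together with the customer
    ({"age": 25, "city": "Berlin"}, {"age": 27, "city": "Berlin"}, 0.80),
    ({"age": 30, "city": "Oslo"},   {"age": 52, "city": "Lima"},   0.20),
]

def test_scores_within_tolerance():
    for profile_a, profile_b, expected in expected_matches:
        actual = match_score(profile_a, profile_b)  # black-box call
        assert abs(actual - expected) <= TOLERANCE, (
            f"score {actual:.2f} deviates from {expected:.2f} "
            f"by more than the agreed tolerance"
        )
```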
Technical Requirements and (Non)Compliance With Them: How Should You Proceed?
Let’s go back to the example. The requirements said nothing about the algorithm by which the system makes decisions. The technical documentation contained only general UI information. There was no description of the user flow, that is, the scenario of the user’s interaction with the system; a user flow would have illustrated the order of actions needed for the application to work correctly. The plan of “the user performs action X, the system responds with action Y… and everything works” did not match what happened in reality. The formulas in the documentation reflected only the abstract principles of artificial intelligence, and the documented integration algorithm wasn’t helpful either. In practice, the team had to figure out how the integration worked by running test cases and compiling statistics from the results.
Here is the advice: when you read the development requirements, be prepared for them to have little to do with how things actually work. In that case, you will have to do many things manually.
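Here is a minimal sketch of what “running test cases and compiling statistics” can look like when the documentation does not help: feed a batch of prepared cases to the black-box integration and summarize how often its verdict matches the expected one. The JSON case format and field names are assumptions for illustration, and match_score is again the stand-in from the first sketch.

```python
import json

def run_case_batch(cases_path: str, threshold: float = 0.5) -> dict:
    """Runs prepared cases against the black box and summarizes the outcomes."""
    with open(cases_path, encoding="utf-8") as f:
        cases = json.load(f)  # assumed format: list of {"a": ..., "b": ..., "expected_match": ...}

    stats = {"total": 0, "agreed": 0, "disagreed": 0}
    for case in cases:
        score = match_score(case["a"], case["b"])   # black-box call
        predicted_match = score >= threshold
        stats["total"] += 1
        stats["agreed" if predicted_match == case["expected_match"] else "disagreed"] += 1

    stats["error_rate"] = stats["disagreed"] / max(stats["total"], 1)
    return stats
```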
Preparing for Testing
Before you start testing software with artificial intelligence elements, make sure of the following:
- You have reliable contact with the customer of the software. Only this person has a clear idea of what they want from the developed system.
- There is a specialist on the team who is responsible for integration. They will be able to influence how the software algorithm works.
- If possible, you have established contact with the software developer. Communicating with them will make it much easier to understand what is written in the code.
- You have learned the user flow. If you don’t thoroughly understand where, how, and why your customer is implementing artificial intelligence, you won’t be able to understand precisely how to test it.
Testing the product
The first problem, the absence of adequate integration, was noticeable from the start: it simply did not work. The second reason for the system’s strange behavior was a rather high error rate (in this case it reached 50%), which in no way matched the error figure specified in the technical documentation. And the third: negative checks and boundary-value testing produced the opposite of the expected results.
Such systems are poorly suited to extreme parameters because they rely on average values. Active use of boundary-value tests therefore works against the proper training of the program. Hence the conclusion: negative tests for self-learning algorithms should be kept to a minimum, one or two at most.
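As a hedged illustration of that conclusion, the sketch below builds a set of cases for training the algorithm: the bulk of them stay in a typical range, and the number of boundary or negative cases is capped explicitly. The ranges, field names, and the cap of two are assumptions, not figures from the project.

```python
import random

def build_training_cases(n: int = 200, max_negative: int = 2) -> list[dict]:
    """Mostly typical profiles, plus an explicitly capped number of extreme cases."""
    cases = []
    for _ in range(n - max_negative):
        cases.append({
            "age": random.randint(22, 45),          # assumed typical range
            "answers_filled": random.randint(15, 20),
            "expected_match": True,
        })
    # Only one or two boundary/negative cases, so they don't skew the learning.
    for _ in range(max_negative):
        cases.append({
            "age": 18,                              # lower boundary of the allowed age
            "answers_filled": 0,                    # empty questionnaire
            "expected_match": False,
        })
    return cases
```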
During testing, several vital aspects stood out. Consider these points in your own work:
- Remember: artificial-intelligence-based systems are meant to simulate a real decision-making process. So when testing them, be guided by situations that can actually happen to the user as they interact with the product.
- Learn how the external AI tool connects to your application. Find out how the server side functions and how data is indexed. This will help you avoid defects.
- Constantly train your system; it is designed to learn. Correctly select valid test data based on production server data (see the sketch after this list). Analyze the results according to your own logic, not according to dry instructions from the technical documentation. Keep in mind that negative tests should be used with caution; otherwise, you can teach the system to “think” in the wrong way.
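To make the point about selecting valid data concrete, here is a minimal sketch that filters production profiles before they are used for training, since the black-box tool does not check data adequacy itself. The field names and thresholds are assumptions for illustration.

```python
def is_valid_profile(profile: dict) -> bool:
    """Assumed sanity checks; the real criteria come from your own user flow."""
    questionnaire = profile.get("questionnaire") or []
    return 18 <= profile.get("age", 0) <= 99 and len(questionnaire) >= 10

def select_training_data(production_profiles: list[dict]) -> list[dict]:
    # The tool does not track the adequacy of the data it is fed,
    # so filtering has to happen before the data reaches it.
    return [p for p in production_profiles if is_valid_profile(p)]
```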
Before submitting the product to production, make sure the following conditions are met:
- The client knows that artificial intelligence products cannot work “out of the box” perfectly. First, they have to be taught by training the system on valid input data.
- If the Product Owner has no such data, insist on beta testing and UAT (user acceptance testing). No amount of first-class test cases can replace the participation of real people in product testing.
- Test hotfixes right away. This way, you can make sure that the developers have correctly analyzed the defects you found and made the necessary adjustments to the code. Be prepared, however, for this testing to take time.
Testing artificial intelligence software in this case brought many surprises. Too much was probably expected from the product, while one nuance was forgotten: self-learning systems always work the way you teach them. Once the team understood how this boomerang worked, it came back with good news. The customer approved the product, it went to market, and it is still being supported.