The LLM Revolution in SAP Testing


Testing encompasses a whole range of activities. At both ends of an SAP project are unit tests (developers) and user acceptance tests (key users). In between, functional testing is the main activity – accounting for around 80 percent of the costs – supplemented by non-functional tests (e.g., security, performance, and now also UX tests). SAP testing differs from testing individual applications. Although the basic concepts remain the same, the tools are usually different – SAP has now settled on Tricentis as its standard. In addition, SAP testing places a stronger focus on integration tests and data flow between applications. Data – i.e., data exchange, tables, and master data – has a much higher priority in the SAP world than in individually developed applications.
Over the past twenty years, functional testing has remained largely unchanged and has been based on a fundamental process: requirements or user stories, test cases, and test scripts. There have been numerous attempts to automate this process; remember "scriptless testing frameworks"? With the advent of agile methods and DevOps tools, new platforms emerged, whether they were called DevOps or continuous testing platforms. The industry talked about hyperautomation. Nevertheless, the process of requirements, test cases, and scripts remained central.
Huge productivity gains
Why this historical overview of testing? LLMs will fundamentally change testing. If user stories are reasonably well formulated, LLMs can generate test scripts in minutes—instead of weeks. The productivity gains are impressive, and the cost of LLMs is minimal compared to human labor (and continues to fall). Writing test scripts no longer requires test engineers—it is automated. In the future, functional testing will therefore no longer be a bottleneck in SAP projects.
A revolution is underway that will have far-reaching effects in the coming years: it will change the tool ecosystem, make software licenses and subscriptions obsolete, and significantly reduce testing efforts. LLMs are ideally suited for greenfield, GROW with SAP, or public cloud projects. Because SAP relies on standardized business processes, functional testing in such projects is a particularly good fit: these processes are well documented and easily accessible and can be used to create test artifacts.
The use of LLMs becomes more difficult in brownfield (and bluefield) environments—which, it is fair to say, account for the majority of all S/4 projects. From a testing perspective, the priority here is no longer to create new test artifacts from scratch, but to understand the existing test inventories.
Most organizations have several thousand test artifacts, including requirements, test scenarios, test cases, and scripts, although reliable figures are difficult to obtain. These artifacts are both an asset (they reflect the organization's business processes at the SAP transaction level) and a burden.
The next step, then, is to understand these artifacts: their function, their quality, and their relevance. This is by no means the first time organizations have reviewed their test inventories—the last attempt, several years ago, used NLP to identify redundant test cases. This time, however, LLMs offer reverse engineering capabilities: they can generate test cases from test scripts or user stories. In other words, LLMs can help determine which test scripts actually need to be executed, allowing organizations to focus on what's essential—rather than working through their entire test inventory.
Cost savings, but where?
This is the first step in using LLMs in quality engineering. The world of LLMs is full of opportunities—but also full of challenges. While LLMs bring enormous productivity gains, these do not necessarily lead to significant cost savings. This needs to be better understood. However, we do have some pointers: Testing is a multi-step process – beyond functional testing – and involves bottlenecks (e.g., the availability of test environments) that can affect the overall duration of the test cycle.
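The effect of such bottlenecks can be illustrated with simple arithmetic. The sketch below uses purely hypothetical figures: the phase names, the day counts, and the thirtyfold speedup are assumptions chosen for illustration, not data from any project. It also assumes, for simplicity, that the phases run strictly one after another.

```python
# Hypothetical test-cycle phases and durations in working days.
# All figures are illustrative assumptions, not measurements.
baseline_days = {
    "script_creation": 15,
    "environment_wait": 10,
    "execution_and_defect_fixing": 15,
}


def cycle_length(phases):
    """Total cycle length, assuming the phases run one after another."""
    return sum(phases.values())


def with_speedup(phases, phase, factor):
    """Return a copy of the phases with one phase shortened by `factor`."""
    updated = dict(phases)
    updated[phase] = phases[phase] / factor
    return updated


# Assumption: an LLM makes script creation 30 times faster.
llm_days = with_speedup(baseline_days, "script_creation", 30)

before = cycle_length(baseline_days)
after = cycle_length(llm_days)
reduction = 1 - after / before

print(f"Cycle: {before} -> {after} days ({reduction:.0%} shorter)")
```

With these assumed figures, a thirtyfold speedup in one phase shortens the overall cycle by only about a third, because the waiting time for test environments and the execution phase are unchanged.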
Another aspect is how test experts will use LLMs. Will they use the productivity gains to shorten test cycles? Or will they use some of the time saved for additional quality checks and root cause analysis to improve software quality? That would be a great development—because software testing and software quality have never been as closely linked as they are today.