Calibration Testing into the Test Plan

When a test scenario is defined, a set of assumptions is built into the Test Plan. For instance, the test scenario in The Test Scenario assumes that a test case can achieve satisfactory throughput, in transactions per second (TPS), with a request payload of 500,000 bytes and 200 concurrent requests. A Calibration Test identifies a service agent's optimum throughput, measured in TPS at the consumer, on the given testing hardware and software. For instance, consider the data in Payload Size, Concurrent Agents, and Transactions-Per-Second Results.

Payload Size, Concurrent Agents, and Transactions-Per-Second Results lists the two input values for the test: the size of the message sent to the service and the number of concurrent test agents. Using these values, XSTest runs a test case by instantiating one thread for each concurrent request. Each thread dynamically generates data of the defined payload size and sends it to the service as a request. The thread then receives the server's response, validates it, handles any exceptions, and logs the response as a completed transaction. The thread repeats these steps until the test case period ends. A Bar Chart of the Results Showing the Maximum Throughput Values displays the same results as a bar chart, which makes the maximum throughput values easy to see.
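The per-thread loop described above can be sketched as follows. This is a minimal illustration in Python, not XSTest's actual implementation: the names run_test_case and send_request are hypothetical, the payload is filler bytes, and the validation step is only a placeholder.

```python
import threading
import time


def run_test_case(send_request, payload_size, concurrent_agents, duration_s):
    """Run one calibration test case and return the observed throughput.

    send_request is a hypothetical callable that takes the payload bytes,
    sends them to the service, and returns the response (or raises).
    """
    counts = {"completed": 0, "errors": 0}
    lock = threading.Lock()
    deadline = time.monotonic() + duration_s

    def agent():
        # Each agent repeats the same steps until the test period ends.
        while time.monotonic() < deadline:
            payload = b"x" * payload_size  # generate data of the defined size
            try:
                response = send_request(payload)
                ok = response is not None  # placeholder validation
            except Exception:
                ok = False  # an exception counts as a failed transaction
            with lock:
                counts["completed" if ok else "errors"] += 1

    # One thread per concurrent request, as in the description above.
    threads = [threading.Thread(target=agent) for _ in range(concurrent_agents)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.monotonic() - start

    tps = counts["completed"] / elapsed if elapsed > 0 else 0.0
    return {"tps": tps, "completed": counts["completed"], "errors": counts["errors"]}


if __name__ == "__main__":
    # Stand-in service that simply echoes the payload after a short delay.
    def echo_service(payload):
        time.sleep(0.001)
        return payload

    print(run_test_case(echo_service, payload_size=1000,
                        concurrent_agents=4, duration_s=0.5))
```

Running the same function across a grid of payload sizes and agent counts would produce the kind of TPS table the calibration test relies on. Here TPS counts only validated transactions, which is one reasonable reading of "completed transaction".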

A Bar Chart of the Results Showing the Maximum Throughput Values

Payload Size, Concurrent Agents, and Transactions-Per-Second Results and Scalability Index provide information about the service under test, including:

As payload increases, TPS decreases proportionately. The test is neither saturating nor under-utilizing the server, network, or consumer. Had TPS increased, the testing level would not have been high enough; had it stayed flat or dropped sharply, the testing level would not have been low enough.
The reduction in TPS is not proportional to the increase in request size. When the network and consumer are not running at high activity levels, this points to a poor-performing request processor in the service. One possible cause is a message parsing system that does not allocate resources (memory, network socket connections, or message queues) of the correct size for the demands of the test.
TPS drops significantly more for test cases above 3000 bytes of payload. In this case, a code profiler can be used to look for a cause such as a buffer overflow or an undersized object list.

While A Bar Chart of the Results Showing the Maximum Throughput Values identifies a few things about the service, more information is needed to draw conclusions. The additional test parameter values required are listed in Parameter Values Required to Calibrate a Test.

Parameter Values Required to Calibrate a Test

In a stateless system, a service allocates the memory, CPU bandwidth, network bandwidth, and other resources needed to generate a response separately for each request. A stateless calibration test identifies resource bottlenecks. Network and CPU Utilization shows the test scenario results, including network utilization and server and consumer CPU utilization values.

Network and CPU Utilization

The results in Network and CPU Utilization indicate what is happening during the test scenario:

The test is server-bound, which prevents greater throughput (TPS). When payload sizes are less than 4000 bytes, server CPU utilization is high but not saturated. At 4000 bytes and above, the CPU is saturated.
Stateless tests require resources to handle the concurrent request load. Take away a resource, such as CPU bandwidth or free memory, to operate on larger payloads, and response times increase, lowering overall TPS.
The scale of the decline indicates a significant problem with the server. The payload size increases by a factor of 5, from 1000 to 5000, but TPS decreases by a factor of 14, from 10.376 to 0.73. In a stateless test, the TPS value should be proportional to the input.
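The arithmetic behind the last point can be checked with a short calculation. The sketch below uses only the two data points quoted above (payloads of 1000 and 5000, TPS of 10.376 and 0.73); the function name scaling_factors and the "excess" ratio are illustrative assumptions, not part of the original test tooling.

```python
def scaling_factors(payload_small, tps_small, payload_large, tps_large):
    """Compare how much the payload grew with how much TPS fell."""
    payload_factor = payload_large / payload_small  # growth in request size
    tps_factor = tps_small / tps_large              # reduction in throughput
    # Under proportional scaling the two factors match and this ratio is 1.
    excess = tps_factor / payload_factor
    return payload_factor, tps_factor, excess


payload_factor, tps_factor, excess = scaling_factors(1000, 10.376, 5000, 0.73)
print(f"payload grew {payload_factor:.1f}x, TPS fell {tps_factor:.1f}x, "
      f"excess {excess:.1f}x")
```

With these inputs the payload grows 5 times while TPS falls roughly 14 times, so throughput drops close to three times faster than proportional scaling would predict.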

Since this is a stateless test, each request should be served from an independent group of resources (threads, memory, etc.). Watching CPU and memory utilization levels is an appropriate way to identify scalability and performance thresholds.

However, this is not the case for stateful services such as database and workflow applications. Stateful services use data caches and server queues, and they typically carry session manager overhead. These items affect service CPU and memory utilization levels independently of the consumer request load.

For a scalability and performance test running in a defined software and hardware environment, a calibration test helps determine the appropriate service concurrent request levels and message payload sizes. The results show a Scalability Index for the service.