D.11.8.2: Report on QA Testing (AuthZForce+KeyRock+Wilma Combined Stress Testing)

Future Internet Core Platform

Private Public Partnership Project (PPP)

Large-scale Integrated Project (IP)

D.11.8.2: Report on QA Testing (AuthZForce+KeyRock+Wilma Combined Stress Testing)

Project acronym: FI-Core

Project full title: Future Internet - Core

Contract No.: 632893

Strategic Objective: FI.ICT-2011.1.7 Technology foundation: Future Internet Core Platform

Project Document Number: ICT-2013-FI-632893-WP11-D.11.8.2

Project Document Date: 2016-08-11

Deliverable Type and Security: PU

Author: Salvatore D’Onofrio

Contributors: Engineering Ingegneria Informatica S.p.A.

1. Introduction

1.1. Attributes of the GEri to be tested

1.2. Attributes of the GEri to be integrated in the test

1.3. Attributes of the testing tools

1.4. Developed testing tools

1.5. Non-functional metrics

2. Testing Summary

2.1. GEs overview

2.2. Tested Scenarios

2.3. Results overview

3. Test case 1

3.1. Test case description

3.2. Test results

3.2.1. Throughput

3.2.2. HTTP Responses

3.2.3. Response Times

3.2.4. Requests/second

3.2.5. Request Summary

3.2.6. Threads

3.2.7. Monitoring

4. Conclusions

1. Annex

1.1. Annex 1: Test environment

1.1.1. Attributes of the hosting machine 1 (Wilma server)

1.1.2. Attributes of the hosting machine 2 (KeyRock server)

1.1.3. Attributes of the hosting machine 3 (AuthZForce server)

1.1.4. Attributes of the hosting WEB server

1.1.5. Attributes of the testing (client) machine

1.2. Annex 2: JMeter Results references

1. Introduction

The purpose of this document is to present the results of a performance test carried out on three integrated Generic Enablers: IdM, PEP Proxy and Authorization PDP, more specifically their reference implementations named, respectively, KeyRock, Wilma and AuthZForce. For the preparation of these tests, the most commonly used methods to access a configured IdM/PDP application resource were taken into account; in particular, the tests involved the API dedicated to obtaining authorization decisions based on authorization policies, as well as the API handling authorization requests from PEPs.

For the execution of these tests a dedicated environment was set up, consisting of five virtual machines, detailed subsequently: one for the deployment of AuthZForce, one for the deployment of KeyRock, one for the deployment of Wilma, one to inject the JMeter load and collect the results generated by the Generic Enablers, and finally one to simulate, through an Apache web server, the application resource to be protected by the system. System resources such as RAM and processors of the GE machines were also monitored and collected during the test execution.

1.1. Attributes of the GEri to be tested:

Attribute / Value
Generic Enabler / Authorization PDP
Chapter / Security
GEri name (implementation) tested / AuthZForce
GEri version tested / 5.3.0
GEri owner / Orange
Organisation performing the test / ENGINEERING ING. INF. SPA
Docker file link / N/A

Attribute / Value
Generic Enabler / IdM Identity Management
Chapter / Security
GEri name (implementation) tested / KeyRock
GEri version tested / 5.3
GEri owner / UPM Universidad Politécnica de Madrid
Organisation performing the test / ENGINEERING ING. INF. SPA
Docker file link / N/A

Attribute / Value
Generic Enabler / PEP Proxy
Chapter / Security
GEri name (implementation) tested / Wilma
GEri version tested / 5.3
GEri owner / UPM Universidad Politécnica de Madrid
Organisation performing the test / ENGINEERING ING. INF. SPA
Docker file link / N/A

1.2. Attributes of the GEri to be integrated in the test:

The tests described here do not involve any interaction with further GEris.

1.3. Attributes of the testing tools:

Attribute / Value
Load Test application / Apache JMeter v3.0
System monitoring tool / NMON
System Data analyser / NMON Visualizer (Version 2015-10-21)
Java (JRE used for JMeter and NMON Visualizer) / OpenJDK Runtime Environment (IcedTea 2.6.3) (7u91-2.6.3-0ubuntu0.14.04.1)
Java (JDK and JRE used for developing and running data provider/consumer mock-up components) / N/A

1.4. Developed testing tools

No custom tools were developed in addition to the JMeter test plans, but an Apache web server was set up in a dedicated VM (described in the annex as “Web Server”) in order to simulate the back-end (dummy) service configured for the proxy under test.

The web server was configured to reply with a valid index.html file to the URLs requested during the test.
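The actual Apache configuration and the protected URLs are not reproduced here; purely as an illustrative sketch (hostname and port are assumptions, not the tested setup), an equivalent dummy back-end could be simulated with a few lines of TypeScript on Node:

    // Minimal stand-in for the Apache dummy back-end: it answers every request
    // with HTTP 200 and a static index page, which is all the PEP Proxy needs
    // from the protected service during the stress test.
    import { createServer } from "node:http";

    createServer((req, res) => {
      res.writeHead(200, { "Content-Type": "text/html" });
      res.end("<html><body>dummy back-end index.html</body></html>");
    }).listen(8080); // port 8080 is an assumption for this sketch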

1.5. Non-functional metrics

The next table shows the metrics that were measured during these tests.

Class / Metric id / Metric / Measured (Yes/No)
Performance and stability / MPS01 / Response times for <n> concurrent threads / Yes
MPS02 / Number of responses per second / Yes
MPS03 / Number of Bytes per second (Throughput) / Yes
MPS04 / HTTP response codes / Yes
MPS05 / Error rate / Yes
MPS06 / Maximum threads number handled / Yes
MPS07 / CPU usage / Yes
MPS08 / Memory usage / Yes
MPS09 / Network traffic / Yes
Scalability and elasticity testing / MSE01 / Scalability / Yes
MSE02 / Elasticity / No
Failover management and availability / MFA01 / Failover / No
MFA02 / Availability / No
Security / MSC01 / Security / Yes

2. Testing Summary

A dedicated FIWARE Lab environment was configured to host the implementations of these GEs (Generic Enablers). Through this environment, based on the OpenStack cloud operating system, it was possible to easily create and configure all the virtual machines needed for the stress tests. In particular, the environment consisted of five virtual machines: three dedicated to the deployment of AuthZForce (Authorization PDP), Wilma (PEP Proxy) and KeyRock (Identity Management), one dedicated to the Web Server, and one needed to inject the JMeter load and collect the results generated by the GEs. Usage of system resources such as RAM and processors of the GE machines was monitored and recorded during each test execution directly on each VM hosting a GE implementation. All the details about the HW configuration (CPUs, RAM etc.), the O.S. of each VM and the required installed SW tools are given in the Annex of this document.

2.1. GEs overview

Refer to the stand-alone GE test reports for an overview of the GEs involved.

2.2. Tested Scenarios

The test exercises all GEs simultaneously; the main steps are described below (a schematic sketch of the authorization decision call in step 3 follows the list):

  1. The requester (JMeter) sends to the PEP Proxy GE an access request with a valid token in the HTTP header, as expected by the proxy itself.
  2. When the PEP Proxy receives the aforementioned access request, it extracts the access token from its header and sends it to the IdM GE for validation. If the token is valid, the IdM GE returns the validation result and other token-related information, such as information about the authenticated user. If the token turns out to be invalid, the request is not authenticated and is therefore denied.
  3. The PEP Proxy sends to the Authorization PDP API an XACML authorization decision request that contains this IdM-issued information together with further information about the access request, such as the requested resource ID and the action ID (HTTP method); for example, within the above PDP API request, the PEP Proxy encloses the URL requested for the resource, the HTTP method as the action ID, and the authenticated user attributes. The Authorization PDP GE computes the authorization decision (Permit or Deny) and returns it to the PEP.
  4. If the PDP's decision is Permit, the PEP forwards the API request to the protected service and forwards the response back to the requester. If the decision is Deny, the PEP denies the request, for instance by replying with an HTTP 403 (Forbidden) response.
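To make step 3 concrete, the following sketch shows the general shape of a XACML 3.0 decision request as a PEP could submit it to the AuthZForce PDP REST endpoint. It is a schematic illustration only: the host, domain identifier and attribute set are assumptions, not the exact payload Wilma generates.

    // Schematic XACML 3.0 decision request (step 3 above), posted to the PDP
    // endpoint of an AuthZForce authorization domain; run as an ES module (Node 18+).
    const xacmlRequest = `<?xml version="1.0" encoding="UTF-8"?>
    <Request xmlns="urn:oasis:names:tc:xacml:3.0:core:schema:wd-17"
             CombinedDecision="false" ReturnPolicyIdList="false">
      <Attributes Category="urn:oasis:names:tc:xacml:1.0:subject-category:access-subject">
        <Attribute AttributeId="urn:oasis:names:tc:xacml:2.0:subject:role" IncludeInResult="false">
          <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">BundleTestAppRole</AttributeValue>
        </Attribute>
      </Attributes>
      <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:resource">
        <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id" IncludeInResult="false">
          <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">BundleTestAppPermissionResource</AttributeValue>
        </Attribute>
      </Attributes>
      <Attributes Category="urn:oasis:names:tc:xacml:3.0:attribute-category:action">
        <Attribute AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id" IncludeInResult="false">
          <AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">GET</AttributeValue>
        </Attribute>
      </Attributes>
    </Request>`;

    // Host and domain ID below are placeholders for the tested deployment.
    const pdpResponse = await fetch(
      "http://authzforce-host:8080/authzforce-ce/domains/<domain-id>/pdp",
      { method: "POST", headers: { "Content-Type": "application/xml" }, body: xacmlRequest },
    );
    // The PDP replies with a XACML <Response> whose <Decision> is Permit or Deny.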

Test period / Number of concurrent requests (threads)
30 minutes / Gradual increase from 1 to 50 during the first 5 minutes, then fixed at 50 for the remaining 25 minutes

2.3. Results overview

The results collected and analyzed during these tests led to the following main conclusions:

1) The error rate detected in all the GEs was 0%.

2) The measured response time had a good average value of 464 ms (min: 24 ms, max: 3963 ms), but with more than 30 concurrent threads, response times increased significantly, up to almost 4 s.

3) The measured load showed a good average value of about 99 transactions/sec, which remained stable even as the number of concurrent threads increased.

4) The GEs never crashed: the system handled the load tests without failures.

3. Test case 1

3.1. Test case description

In this scenario the test ran for 30 minutes. In the first part (the first 5 minutes), the number of concurrent users (threads) was variable and incremental: the test started with 1 running thread and reached 50 running threads after 5 minutes, i.e. one new thread every 6 seconds (useful to identify the point of maximum throughput). In the second part, the number of concurrent requests was kept steady at 50 running threads in order to observe the system behavior under a stable input load.

Note that each thread, once it has completed its current transaction, immediately sends a new request to the system.

In order to implement this test scenario the following precondition steps were executed:

Authenticated Requester:

The requester must have been previously authenticated by the IdM GE using the OAuth2 flow, obtaining an access token from the IdM GE as a result.

Register Application:

From the KeyRock GUI, a new application must be registered with the following attributes:

Attribute Name / Attribute Value
Name / BundleTestApp
Description / Bundle Test Application
URL /
Callback URL /

Manage Roles:

A new Role must be added in the Application with the following attributes:

Attribute Name / Attribute Value
Name / BundleTestAppRole

A new Permission must be added in the Application with the following attributes:

Attribute Name / Attribute Value
Permission Name / BundleTestAppPermission
Description / Bundle Test Application Permission
HTTP action / GET
Resource / BundleTestAppPermissionResource

The Permission BundleTestAppPermission must be assigned to the Role BundleTestAppRole.

Authorizations:

In the Application User Authorization section of the KeyRock GUI, the Role BundleTestAppRole must be assigned to the current user.

Application Authorization String:

From the Application Info section, get the following OAuth2 Credentials attributes:

  • Client ID
  • Client Secret

Use the obtained values to create the Base64 encoding of the string ‘<Client ID>:<Client Secret>’ (an online utility was used to perform this encoding).

The obtained value is the Application Authorization String.
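As a minimal sketch of this encoding step (the credential values are placeholders; any Base64 tool produces the same result):

    // Application Authorization String = Base64('<Client ID>:<Client Secret>')
    const clientId = "<Client ID>";         // placeholder from the Application Info section
    const clientSecret = "<Client Secret>"; // placeholder from the Application Info section
    const appAuthString = Buffer.from(`${clientId}:${clientSecret}`).toString("base64");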

Application User Authorization Code:

In order to create the application user authorization code, a request like the following one must be prepared and invoked within a browser (the account credentials may be requested), and subsequently authorized.

<ClientID>&state=xyz&redirect_uri=
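The request above is truncated in this report; its general shape is that of a standard OAuth2 authorization-code request, sketched below with the host and redirect URI left as explicit placeholders (they are not the values actually used in the test):

    // OAuth2 authorization-code request opened in a browser against KeyRock;
    // every bracketed value and the host are placeholders.
    const authorizeUrl =
      "http://keyrock-host/oauth2/authorize" +
      "?response_type=code" +
      "&client_id=<ClientID>" +
      "&state=xyz" +
      "&redirect_uri=<redirect_uri>";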

If all is OK, the Apache page will open. The value of the code attribute present in the redirect URL, which is the Application User Authorization Code, must then be extracted.

Application Token:

Obtain the Application Token using the following HTTP API on KeyRock:

ID / GE API method / Operation / Type / Payload / Max. Concurrent Threads
1 / oauth2/token / Application Token Creation / POST / Header:
Content-Type: application/x-www-form-urlencoded
Authorization: Basic <Application Authorization String>
Data:
grant_type=authorization_code&code=<Application User Authorization Code>&redirect_uri= / 1
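A minimal sketch of this call, assuming a placeholder KeyRock host and a standard OAuth2 JSON token response (the redirect URI remains a placeholder, as it is elided in this report):

    // Exchange the Application User Authorization Code for the Application Token.
    const appAuthString = "<Application Authorization String>"; // placeholder, built earlier
    const tokenResponse = await fetch("http://keyrock-host/oauth2/token", { // placeholder host
      method: "POST",
      headers: {
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": `Basic ${appAuthString}`,
      },
      body: new URLSearchParams({
        grant_type: "authorization_code",
        code: "<Application User Authorization Code>", // placeholder
        redirect_uri: "<redirect_uri>",                // placeholder
      }).toString(),
    });
    // Assuming a standard OAuth2 token response, the Application Token is:
    const applicationToken = (await tokenResponse.json()).access_token;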

Wilma configuration for KeyRock access:

On KeyRock, from the Application Info section, a new PEP Proxy must be registered and the following attributes obtained:

  • Username
  • Password

Use the obtained values to configure the following attributes on Wilma (file config.js); an illustrative fragment follows the list:

  • config.username
  • config.password
  • config.app_host = 'apache'
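Illustratively, the resulting fragment of Wilma's config.js would then look as follows (the credential values are placeholders obtained from KeyRock):

    // Fragment of Wilma's config.js after registering the PEP Proxy on KeyRock.
    config.username = "<proxy username from KeyRock>"; // placeholder
    config.password = "<proxy password from KeyRock>"; // placeholder
    config.app_host = "apache"; // hostname of the VM running the dummy back-end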

HTTP vs HTTPS

Since AuthZForce was configured with HTTP rather than HTTPS, a change was made in Wilma's azf.js file in order to force the connection to HTTP.

ns5 versus ns6

A change was made in Wilma's azf.js file to allow Wilma to accept response payloads using XML namespace version 5 (ns5) instead of version 6 (ns6).

Wilma caching disabled

A change was made in Wilma's idm.js file to prevent caching and force Wilma to always interact with the KeyRock and AuthZForce systems.

Test API:

Once the above preconditions are fulfilled, it is possible to perform the authorization checks using the following Wilma HTTP API (a sketch of this request follows the table):

ID / GE API method / Operation / Type / Payload / Max. Concurrent Threads
1 / /BundleTestAppPermissionResource / Get Authorization / GET / Header:
x-auth-token: Application Token / 50
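A minimal sketch of the request each JMeter thread issues (the Wilma host is a placeholder):

    // The load-test request: fetch the protected resource through the PEP Proxy,
    // presenting the Application Token in the x-auth-token header.
    const applicationToken = "<Application Token>"; // placeholder, obtained in the preconditions
    const res = await fetch("http://wilma-host/BundleTestAppPermissionResource", { // placeholder host
      headers: { "x-auth-token": applicationToken },
    });
    // Expected results in this scenario: HTTP 200 (authorized) or HTTP 401 (token rejected).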

3.2. Test results

Duration
Start time / 15:40:02
Stop time / 16:10:02
Total time (Stop – Start) / 30 minutes
Total sent requests / 177.966
Total threads created / 50
Threads creation rate / 1 thread per 6 sec, up to 50 simultaneous threads
Threads ending rate / Simultaneously

3.2.1. Throughput:

The following graphs, summarised in the tables below (one per node, in the order listed), show the network traffic measured with the nmon utility on the Wilma, KeyRock and AuthZForce nodes during the test session.

Wilma node:

Throughput (Bytes) / Read / Write
Total / 2.748.060.000 / 2.963.817.000
Average (per second) / 1.526.700 / 1.646.565

KeyRock node:

Throughput (Bytes) / Read / Write
Total / 111.216.845 / 147.470.400
Average (per second) / 61.787 / 81.928

AuthZForce node:

Throughput (Bytes) / Read / Write
Total / 388.859.904 / 179.197.747
Average (per second) / 216.033 / 99.554
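As a consistency check, each per-second average equals the corresponding total divided by the 1800-second test duration; for example, 2.748.060.000 bytes read / 1800 s ≈ 1.526.700 bytes/s on the Wilma node.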

The following graph shows the bytes exchanged between JMeter and the Wilma application.

Throughput (Bytes) / Total
Total / 2.144.668.266
Average (per second) / 1.191.539,299

The following graph shows the number of bytes exchanged between JMeter and AuthZForce for this test scenario.

3.2.2. HTTP Responses:

This test scenario accepts two kinds of HTTP result codes: 200 (OK) and 401 (Unauthorized), but the latter never occurred.

HTTP Responses (number) / HTTP 200 / HTTP 401
Total / 177.966 / 0
Average (per second) / 98,875 / 0

3.2.3. Response Times:

The chart below, along with the related tables, shows that the average response time is around 464 ms, which can be considered acceptable. However, several high response-time peaks (1 s < RT < 4 s) were detected (about 17% of all responses) once the input load exceeded 30 threads; this threshold and the values it derives from are not visible in the chart due to the sampling interval, but they are shown in the tables below, which group the results by number of started threads in order to identify when the system starts to underperform.

Response times (milliseconds) / Operation 1
Minimum / 24
Average / 464
Maximum / 3.963

Test results up to 9 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 0 / 0
>= 2 sec < 3 sec / 0 / 0
>= 1 sec < 2 sec / 0 / 0
< 1 sec / 4.599 / 100
Total / 4.599 / 100

Test results up to 19 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 0 / 0
>= 2 sec < 3 sec / 0 / 0
>= 1 sec < 2 sec / 2 / 0,033563
< 1 sec / 5.957 / 99,96644
Total / 5.959 / 100

Test results up to 29 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 0 / 0
>= 2 sec < 3 sec / 0 / 0
>= 1 sec < 2 sec / 15 / 0,250543
< 1 sec / 5.972 / 99,74946
Total / 5.987 / 100

Test results up to 39 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 0 / 0
>= 2 sec < 3 sec / 0 / 0
>= 1 sec < 2 sec / 369 / 6,18816
< 1 sec / 5.594 / 93,81184
Total / 5.963 / 100

Test results up to 49 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 0 / 0
>= 2 sec < 3 sec / 1 / 0,016776
>= 1 sec < 2 sec / 979 / 16,42342
< 1 sec / 4.981 / 83,55981
Total / 5.961 / 100

Test results up to 50 started threads

Response time range / Number of Transactions / Percentage of Total Transactions
>= 3 sec / 83 / 0,05552
>= 2 sec < 3 sec / 1.253 / 0,838144
>= 1 sec < 2 sec / 27.681 / 18,51609
< 1 sec / 120.480 / 80,59025
Total / 149.497 / 100

3.2.4. Requests/second:

The average number of transactions per second is just under 100.

The table below summarises the obtained results:

Requests/second (number) / Operation 1
Average (per second) / 98,875

3.2.5. Request Summary:

From the error-rate point of view, this test can be considered 100% successful.

Requests summary (number) / Operation 1
Successful / 177.966
Failed / 0
% Error / 0 %

3.2.6. Threads:

The corresponding graph shows the number of simultaneous requests growing over the test period.

3.2.7. Monitoring:

CPU and memory usage were also monitored during the test; the results are described in the following sections.

3.2.7.1. CPU usage:

As shown by the “Total Request” graph, the maximum performance point was reached very close to the test start time (about 1 minute later). As the input load increased (new threads being added), the system reached its limit at 10-12 concurrent threads; as shown by the charts below, this happened because KeyRock quickly reached 100% CPU usage, leaving no free resources for new threads, whereas Wilma and AuthZForce showed low CPU usage.

3.2.7.2. Memory usage:

Memory usage confirmed what was already observed during the single-system performance tests: all systems showed normal memory usage and no leaks were detected.

The graphs below show the measured results.

4. Conclusions

The results collected and analyzed during this test session show a system with good performance: the average response time of 464 ms allowed the system to handle an average of about 99 transactions per second.

In terms of concurrent users (threads), the system can guarantee an average response time of at most 450 ms when the number of concurrent threads stays below 47. This threshold can easily be identified by comparing the earlier “threads” chart with the “response times” chart, both of which show the same linear growth trend.

Regarding CPU, Wilma remains stable below 20-25% usage, AuthZForce likewise remains stable below 20-25% usage, while KeyRock quickly reaches 100% usage.

The point of maximum throughput was found at 8-10 concurrent running threads, with 100% CPU usage on KeyRock; threads started beyond that point did not yield better system performance but only added delay to the response times.

To conclude, the findings can be summarised in the following points:

1) The error rate detected in all the GEs was 0%.

2) The measured response time had a good average value of 464 ms (min: 24 ms, max: 3963 ms), but with more than 30 concurrent threads, response times increased significantly, up to almost 4 s.

3) The measured load showed a good average value of about 99 transactions/sec, which remained stable even as the number of concurrent threads increased.