Talk on Fast and Reliable Software at the MF Summit on 20-21 June 2017 in Düsseldorf

Come and visit the Micro Focus Summit in Düsseldorf on 20-21 June 2017. I will talk about fast and reliable software on 21 June at 11:00 am. Registration, agenda, and more details are available under this link.

Why Companies Invest in Load and Performance Tests

Over the past decade, companies large and small have started integrating load and performance testing into their development processes. There are many good reasons for this evolution. In this post, I will outline why testing of performance requirements has become so popular.

Naturally, load and performance testing dramatically reduces the risk of severe outages or system failures. Even if hundreds of your software testers conduct thousands of manual tests on a new application, this is no guarantee that the system can handle the expected load pattern in production. A typical application works fine under normal load or data volume conditions. Depending on design and implementation decisions, end-to-end response times often rise once the new system has to handle a large number of concurrent user requests.

Our professional user community expects high-speed applications, and with the growing service portfolio, your users sit more and more in the driver's seat. The response time of a business application determines whether you retain your customers or lose commercial revenue. Optimized high-speed applications make customers happy, and satisfied users will often return. Therefore, spending on performance testing is an excellent investment.

Companies invest plenty of money in search engine optimization and neglect an even more important aspect: the actual speed of their application. Maybe this is still not well known, but search engines rate down applications with low speed or poor page design. Obviously, it is better to improve the response time before making investments in search engine optimization.

Hardware sizing is often guesswork with a sad ending. Oversized infrastructure is a waste of money, and a shortage of system resources can have a severe impact on user experience. Load and performance tests eliminate these trial-and-error exercises. Simply simulating the expected load on the new system and verifying that system resource utilization stays within the recommended boundaries is enough to make an appropriate investment in IT infrastructure.

Our online business relies on image and reputation. Major players such as Amazon or Apple understood that slow-loading or unresponsive applications leave a negative impression and damage their reputation as market leaders. Continuous performance tests help to detect hotspots proactively and protect your users from the nasty experience of unreliable applications.

Organizations working in firefighting mode have understood that endless war-room sessions are frustrating for customers and IT staff alike. You can't tune and eliminate performance hotspots in production because this kills your credibility. Users will forget occasional failures in your product, but they will always remember that the application is slow.

Stop wasting your money and integrate performance testing into your development process.

Mastering Load and Performance Tests in 7 Steps

The best way to avoid a bad user experience is to integrate load and performance testing into your development process. In this post, I will shine a light on the steps involved, which you can use right away in your next performance test project.

Step 1 – Performance Requirements

Business analysts specify their performance expectations for the new application. They outline current and future load patterns, including transaction rates, user volumes, and response time expectations. Make sure that you specify speed expectations as 90th percentiles instead of average values, because averages hide the slow outliers your users actually notice.
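To make the difference tangible, here is a minimal sketch in plain JavaScript; the sample response times are invented for illustration. A handful of slow outliers barely move the average, while the 90th percentile makes them visible.

// Minimal sketch: average vs. 90th percentile for a set of sample response times (values invented)
var responseTimes = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.6, 4.5, 9.0]; // seconds

var average = responseTimes.reduce(function (sum, t) { return sum + t; }, 0) / responseTimes.length;

var sorted = responseTimes.slice().sort(function (a, b) { return a - b; });
var p90 = sorted[Math.ceil(0.9 * sorted.length) - 1]; // value below which 90% of the requests fall

console.log('average:         ' + average.toFixed(2) + ' s'); // 2.28 s - looks acceptable
console.log('90th percentile: ' + p90.toFixed(2) + ' s');     // 4.50 s - roughly one in ten users waits at least this long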

Step 2 – Test Plan

Performance engineers review the non-functional requirements and develop an appropriate performance test plan which contains performance requirements, the scope of the test, responsibilities, test data, environments, load patterns, relevant use cases, scheduled test runs, and success criteria. Agree on status reporting and defect tracking with the test and development teams involved. Make sure that the application owner approves the test approach before you start with the implementation.

Step 3 – Test Environment

Test teams agree on the activities running in the shared environment. Load and performance tests are executed in exclusive slots to avoid the negative impact of background activity. The sizing of your performance test environment can be a critical factor. Therefore, make sure that you execute your tests in an environment sized like production.

Step 4 – Implement Test

Engineers decide on the simulation approach and develop the necessary test scripts. Depending on the technology of your application under test and the goal of your performance test, you will implement protocol-level, headless-browser, or real-browser-based scripts. Some load testing solutions provide a script recorder which allows convenient test script creation without any programming skills. The tester just executes the use case manually while the recorder captures the interactions and creates the test script. With open source tools, the script implementation often requires programming skills.

Step 5 – Configure and Execute Test

The application owner enables system resource monitoring while test staff compile their scenarios. They set up component-level tests to run on a daily basis and end-to-end test scenarios to run in the fully integrated, production-like environment. Don't forget to inform the teams involved about your test execution slots to avoid unexpected downtimes during your test.

Step 6 – Result Analysis

After the test execution, test staff collect all results, including response times, log files, and system resource metrics. Start from the end user's perspective and verify that response times are within the expected boundaries. Continue with horizontal and vertical analysis to identify the performance hotspots. Based on my experience, in more than 90% of executed tests bottlenecks are found, defects need to be raised, and root-cause analysis starts.

Step 7 – Tuning

If the performance is not acceptable, corrective actions must be taken. Engineers lay out their test approach and discuss actual results with developer, infrastructure, database, network, and architecture teams. This virtual team decides on tuning measures, implements corrective actions, and repeats the test until performance is acceptable.

That's it. The earlier you start testing your performance requirements, the easier it is to fix identified issues. Be proactive and eliminate critical hotspots before they affect your business users. Good performance – happy customers – growing revenue!

Implement your Automated Web Page Design Analysis

In recent years, static websites have largely disappeared, and with the rise of technology, companies provide their services more and more online. The former plain web pages have been replaced with content-rich and dynamic websites. Frequent changes in content and website layout can have a high impact on end-to-end response times.

In this post, I will give you a simple approach to implementing an automated web page design analysis based on the open source tools PhantomJS and YSlow.

Setup Details

The good thing is that all components required for this automated web page design analysis are open source and free. You will need the following tools:

  • PhantomJS – a headless WebKit browser and automation solution
  • YSlow – page design best practices
  • Atom – a powerful editor for easy scripting and test execution

First of all, install PhantomJS on your environment. There is a detailed installation description on their website which you can use right away.

Secondly, download YSlow for PhantomJS and customize the yslow.js file. Open the yslow.js file in any editor, add the line var system = require('system'); to the top of the file, and replace all occurrences of phantomjs.args with system.args.

Thirdly, install the Atom editor and enable the run command. Atom is an extremely powerful editor with tons of plugins. I used its run command to execute command-line scripts.

Finally, test the installation with the command below. In Atom, after you have enabled the run command, open a new window, insert the command phantomjs yslow.js --help, and press Ctrl-R.

Run the Analysis

PhantomJS and YSlow are powerful tools and provide many features which you can use right away for your automated web page design analysis. Personally speaking, I recommend starting with the basic command and working your way through the more advanced features.

Basic

In this mode, you will get a high-level page design analysis which consists of the size of your page, the overall score, the number of requests, and the page load time. Execute the command below on the machine you configured before.

phantomjs yslow.js --info basic --format plain http://focusaps.com

The picture below contains the output of this command. It shows that the given website has an overall page design score of 76 out of 100, has a size of 1.5 MB and a load time of 3.2 seconds.

[Screenshot: YSlow basic output in plain format]

Detailed

The detailed mode provides more insights into the weak areas of your website. It also supports predefined thresholds and the TAP output format, which is understood by many tools such as Jenkins. Run the command below on your machine.

phantomjs yslow.js --info grade --format tap --threshold C http://focusaps.com

You will get the following output including relevant tuning hints which you can share with your developers.

[Screenshot: YSlow grade output in TAP format]

I believe you now have many integration ideas for the page design analysis. Automation is that easy. Add the automated checks to your build process, your testing procedures, and your daily checks on production environments. You will see that this really helps to identify deviations proactively.
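As an illustration, here is a minimal sketch of such a build check in Node.js. It assumes phantomjs and the patched yslow.js are available on the build machine and that YSlow's JSON output carries the overall score in the field o; the URL and the threshold are examples you would replace with your own.

// Minimal sketch: fail a build step when the YSlow page design score drops below a threshold
var execFile = require('child_process').execFile;

var MIN_SCORE = 80; // example threshold, align it with your own baseline

execFile('phantomjs', ['yslow.js', '--info', 'basic', '--format', 'json', 'http://focusaps.com'],
    function (error, stdout) {
        if (error) {
            console.error('YSlow run failed: ' + error.message);
            process.exit(1);
        }
        var result = JSON.parse(stdout);      // assumption: "o" holds the overall score (0-100)
        if (result.o < MIN_SCORE) {
            console.error('Page design score ' + result.o + ' is below the threshold of ' + MIN_SCORE);
            process.exit(1);                  // non-zero exit code marks the build step as failed
        }
        console.log('Page design score ' + result.o + ' is acceptable');
    });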

Web Page Design Analysis is not a one-off Exercise

Small things matter most, and this is not only true for day-to-day activities. Minor changes in application configuration can have a significant impact on end-to-end user experience. In this post, I will give you insights into the nature of such changes and some simple steps towards proactive detection of speed degradations.

Changes and their Impact

Frequent modifications in the look and feel of websites are very much appreciated. Nowadays, websites are not only used for advertisement or to gain commercial revenue. Companies try to stand out from the crowd and design websites that underline their image. Navigation has become easier since web designers understood that the number of clicks required to buy a product is essential for their business.

Very common web page design failures are the absence of compression, large images, blocking JavaScript, and videos in autoplay mode. Software test professionals are extremely familiar with these nasty pitfalls. They detect and eliminate them during the QA stages. However, once the new business application has been deployed to production, nobody cares about the impact of these minor changes, and slowly the speed of the formerly quick-loading website is gone. Suddenly, frustrated users abandon their shopping trip on your site, and revenue declines.

Automated detection of slowdowns

You can avoid the frustrating scenario above. It's not rocket science and probably easier than you imagine. Quality assurance does not end after deployment of the new website to production. You won't have a test plan for your live system, but an automated health and performance monitoring solution with meaningful test cases is required.

There are great cloud-based or locally hosted monitoring platforms out there which you can use. Replace reactivity with a proactive health monitoring solution. Automation is a great help when it comes to doing repetitive things such as the periodic execution of your monitoring test cases. Set up performance and availability boundaries and let alerts go out if your website slows down for whatever reason.
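For illustration, here is a minimal Node.js sketch of such a periodic check. The URL, threshold, and interval are examples; a real setup would feed the alert into your notification channel instead of the console.

// Minimal sketch: periodic availability and response time check with a simple alert
var https = require('https');

var URL = 'https://www.example.com/';
var THRESHOLD_MS = 3000;              // alert when the page takes longer than 3 seconds
var INTERVAL_MS = 5 * 60 * 1000;      // run the check every five minutes

function checkOnce() {
    var start = Date.now();
    https.get(URL, function (res) {
        res.on('data', function () {});            // drain the body so timing covers the full download
        res.on('end', function () {
            var elapsed = Date.now() - start;
            if (res.statusCode !== 200 || elapsed > THRESHOLD_MS) {
                console.error('ALERT: ' + URL + ' returned ' + res.statusCode + ' in ' + elapsed + ' ms');
            } else {
                console.log('OK: ' + URL + ' responded in ' + elapsed + ' ms');
            }
        });
    }).on('error', function (err) {
        console.error('ALERT: ' + URL + ' is not reachable (' + err.message + ')');
    });
}

checkOnce();
setInterval(checkOnce, INTERVAL_MS);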

It's good to know that the speed of a website is below expectations, but this helps you little if nobody dives deeper, understands the cause, and fixes the issue. As mentioned above, there are minor adjustments that can impact the end-to-end response time. Based on my experience, a good way to detect the real problem behind a slowdown is to run the QA checks used in the testing stages on production as well.

Actionable insights

As a performance engineer, I verify the page design of new applications in pre-production. Google and Yahoo provide powerful tools which make this analysis quite easy. The good thing is that those solutions detect issues and provide more insights into the actual root cause, such as disabled caching, large images, or blocking JavaScript. It makes much sense for your health monitoring solution to also check the page speed score of your web pages on a regular basis.

Recently, during some research for another paper, I became aware that automated page design analysis is almost free. In my next post, I will outline how you could implement your own automated page design monitoring solution based on PhantomJS, netsniff.js, and YSlow.

Spotlight on the Top 3 Performance Test Types

Performance testing has meanwhile become a fundamental step in many software development projects. "Test early and repeat often" also holds true for load and performance testing. It's not a one-time shot, and there are some pitfalls involved. In this post, I will outline the three most frequently used performance test types.

Component Speed Tests

In recent years, software development methods have moved in the agile direction. Short release sprints are essential. Developers and test engineers automate their quality assurance and performance checks. Typically, they implement service-based performance tests on the protocol level, or they run real-browser-based performance checks to compare end-to-end response times with agreed performance boundaries.
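As a minimal sketch, such an end-to-end check could look like the PhantomJS script below; the URL and the boundary are examples, and a non-zero exit code lets a CI job flag the violation.

// Minimal sketch: measure page load time and compare it with an agreed performance boundary
"use strict";
var page = require('webpage').create(),
    start = Date.now(),
    BOUNDARY_MS = 4000; // example boundary agreed with the business

page.open('http://focusaps.com', function (status) {
    var elapsed = Date.now() - start;
    if (status !== 'success') {
        console.log('FAIL: page did not load');
        phantom.exit(1);
    } else if (elapsed > BOUNDARY_MS) {
        console.log('FAIL: ' + elapsed + ' ms exceeds the boundary of ' + BOUNDARY_MS + ' ms');
        phantom.exit(1);
    } else {
        console.log('PASS: page loaded in ' + elapsed + ' ms');
        phantom.exit(0);
    }
});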

Objectives
+ Repeatability
+ Automated interface and end-to-end performance checks
+ Compare response times with agreed thresholds

Load Tests

Load tests are the ideal setting for the verification of non-functional requirements. One reason is that response times can be verified under reproducible conditions. Another is that these tests allow verification of speed thresholds. Realistic response time measurement is essential in load test scenarios. Therefore, test engineers use headless or real-browser-based user simulation for their load test settings.

Objectives
+ Reproducible load simulation
+ Verification of response time thresholds
+ Identify bottlenecks under production like load conditions
+ Realistic end-to-end test scenarios

Stress Tests

Consider a stress test if you have to prove the reliability of your application under peak load conditions. In this type of test, you mainly specify the maximum number of users and the duration of the ramp-up and the steady-state load on your application. The goal is to identify the breaking points of your application under test. Often this type of test is not 1:1 reproducible because the simulated load varies with the actual response time of the application under test.

Objectives
+ Prove scalability and stability
+ Simulate peak load conditions
+ Exact reproducibility is not relevant

Choosing the wrong type of performance test can be a critical mistake and put your trustworthiness at risk. I highly recommend reviewing the goal of your performance test before you start with its implementation.

Guideline for Choosing the Optimal User Simulation Approach

Testing and monitoring platforms provide a broad range of user simulation methods such as protocol-based, headless-browser, and real-browser-based. In this post, I will outline the main aspects of each, followed by a comparison matrix which you can use to choose an appropriate simulation approach.

Protocol-level Simulation

Protocol-level or HTTP-based testing was very popular in the early days of our digital age. With the rise of rich web client technology, this proven simulation approach has become more and more outdated.

A typical HTTP-based test driver executes service requests and parses responses. Modern web 2.0 applications contain many client-side scripts, which are totally ignored and not measured in this type of test execution. In the worst case, complex use cases cannot be simulated on the protocol level at all due to missing client-side generated IDs.

Complex protocol-based use cases can be difficult to implement. A performance engineer needs to deal with cookies, session IDs, and other dynamic parameters. Depending on the technology of your system under test, web form names often change once a new version has been deployed, which will cause the HTTP-based script to fail.

// Sample SilkPerformer protocol-level script
transaction TMain
var
  hContext: number;
begin
  WebPageUrl("http://lab3/st/", "Greetings");
  WebPageStoreContext(hContext);
  WebPageLink("Join the experience!", " - New Visitor");
  WebPageSubmit("Continue", CONTINUE001, "Main menu");
  WebPageLink("Products", "ShopIt - Products");
  WebPageLink(NULL, "ShopIt - Product", 3);
  WebPageSubmit("Search", SEARCH001, " - Search", 0, NULL, hContext);
end TMain;

dclform
  CONTINUE001:
    "name"            := "jack",
    "New-Name-Button" := "Continue";
  SEARCH001:
    "search"          := "boot";

All in all, protocol-level scripts are perfect for service-based tests in continuous integration environments, for uptime monitoring, and for stress tests.

Headless Browser Simulation

With the rise of web 2.0 technologies, the testing and monitoring business was faced with serious challenges. Rich browser applications could no longer be tested or simulated on the protocol level due to the missing client-side logic during script replay. Therefore, several headless browsers have been introduced, such as HtmlUnit, PhantomJS, or SlimerJS. They are often built on top of existing engines such as WebKit, the engine behind Safari.

Headless browsers have most of the advantages of real browsers, and they run faster without the heavy GUI. Many test automation, monitoring, and performance testing platforms use headless browsers because they allow realistic user simulation.

Some tool providers have built their own headless browser engines and ran into maintenance pitfalls: they have to keep pace with new browser versions. It's highly recommended to use a freely available headless browser kit because a broad community works on improvements.

// Sample PhantomJS script
"use strict";
var page = require('webpage').create(),
    server = 'http://posttestserver.com/post.php?dump',
    data = 'universe=expanding&answer=42';

page.open(server, 'post', data, function (status) {
    if (status !== 'success') {
        console.log('Unable to post!');
    } else {
        console.log(page.content);
    }
    phantom.exit();
});

Finally, headless browsers are ideal for test automation, performance testing, and SLA monitoring.

Real Browser Simulation

Web 2.0 applications are full of JavaScript, Flash, Ajax, and CSS. Without a full browser, it's not possible to measure the actual end-to-end response times of the whole web page. Real browser monitoring allows you to verify the site's functionality and performance as perceived by the end user.

A typical real browser monitoring solution collects loading times of images, JavaScript, CSS, and more. Often it provides waterfall charts, which visualize the load times of those components.

One of the biggest disadvantages is the footprint. A single real-browser script requires much more CPU and memory on the load injection or monitoring machine. Therefore, real-browser-based simulation is not recommended for stress test scenarios.

// Sample SilkPerformer real-browser-based script
transaction TMain
begin
  BrowserStart(BROWSER_MODE_DEFAULT, 800, 600);

  // navigate to the login site
  BrowserNavigate("http://demo.com/TestSite/LoginForm.html");
  // set the authentication for the secure site
  BrowserSetText("//INPUT[@name='user']", "BasicAuthUser");
  BrowserSetPassword("//INPUT[@name='pwd']", Decrypt3DES("Ax7/X9sk1kIfHlbAZ434Pq4="));
  // submit the form
  BrowserClick("//INPUT[@value='Submit Query']", BUTTON_Left);
end TMain;

All in all, real browser simulation is useful for acceptance testing and SLA monitoring. Don't use it for performance or stress testing because the resource footprint is too high.

Comparison Matrix

Obviously, there are good reasons for protocol-based, headless, or real-browser-based user simulation. The table below provides some guidance for choosing the appropriate approach.

Criteria (HTTP / Headless Browser / Real Browser):

  • Realistic user simulation: No / Yes, but with some limitations / Yes
  • Easy script creation: No, depends on application complexity / Yes / Yes
  • Robust script replay: Yes / Yes, sometimes tricky tweaks required / Yes
  • Easy script maintainability: No / Yes, debugging could be tricky / Yes
  • Multi-browser support: Yes / Yes, but with some limitations / Yes
  • Low footprint on the load injection machine: Yes / Yes / No
  • Good for continuous integration: Yes / No / No
  • Good for performance tests: Yes / Yes / No
  • Good for stress tests: Yes / No / No
  • Good for uptime monitoring: Yes / Yes / No
  • Good for SLA monitoring: No / Yes / Yes

Keep up the good work and share your experience with me.

Transform Performance Metrics to Actionable Insights

Digital services are eating the world. Billions of websites provide almost everything you can imagine, and the competition is on the rise. End users are more and more in the driver's seat because they decide which service they will use. Digital customers often abandon services that provide a bad user experience. Therefore, excellent application performance has become a top priority for many organizations. After outlining both performance metrics and the term actionable insights, I will illustrate how you can transform the former into the latter.

According to Techopedia, actionable insights are

“analytics result that provides enough data to make an informed decision.”

When it comes to performance engineering, specialists collect metrics which help them to understand the root cause of hotspots. Decades ago, it was very common for slowdowns to be solved with additional hardware. Those times are gone. Nowadays, engineers have to deal with questions such as:

  • Why is the number of frustrated users on the rise?
  • Why is page speed score below the target?
  • Why does service response time exceed allowed thresholds?
  • Why is the average size of our web page > 1 MB?
  • Why is our system not scalable?
  • Why is our application running out of memory?

When it comes to answering those questions, you can follow a trial-and-error approach or derive tuning recommendations from your collected performance metrics. I highly recommend the latter because the former will result in endless war-room sessions and trial-and-error exercises.

What are the typical performance metrics?

1. User layer. Modern real user monitoring solutions provide powerful insights into the activities performed by end users in your applications. They often use a JavaScript injection approach and add small functions to your web pages, which allows detailed capturing of last-mile performance figures such as actions performed, client errors, actual bandwidth, client times, rendering time, and much more. A minimal sketch of such an injected snippet is shown after this list.

2. Service layer. User interactions lead to service calls, and whenever a slowdown arises you should be able to identify the time being spent in your middleware. Typically, those service-based metrics include response times, throughput, error rates, exceptions, third-party service call figures, heap statistics, and much more.

3. System layer. Network, CPU, memory, and IO metrics are also critical factors for our applications. If they frequently rise above acceptable thresholds, they can quickly degrade overall application performance.
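To illustrate the idea from the user layer, here is a minimal sketch of the kind of snippet a real user monitoring agent injects into a page. It relies on the standard Navigation Timing API; the collector endpoint /rum/collect is purely hypothetical.

// Minimal sketch: capture last-mile timings in the browser and send them to a collector
window.addEventListener('load', function () {
    setTimeout(function () {            // wait one tick so loadEventEnd is populated
        var t = window.performance.timing;
        var metrics = {
            page: window.location.pathname,
            networkTime: t.responseEnd - t.requestStart,       // request plus download
            domProcessing: t.domComplete - t.domLoading,       // parsing and rendering
            totalLoadTime: t.loadEventEnd - t.navigationStart  // what the user actually waited
        };
        // fire-and-forget beacon to the (hypothetical) monitoring backend
        navigator.sendBeacon('/rum/collect', JSON.stringify(metrics));
    }, 0);
});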

Extensive data analytics and data processing

Almost every business application collects some log file data, but if you intend to nail down performance hotspots, the information provided in log files is most likely not useful. You will need each and every user transaction to be taken into account, including your horizontal components such as web servers, application servers, and databases, as well as third-party services. Obviously, you will deal with big data volumes. Monitoring and testing platforms are nowadays well equipped to manage millions of such transactions.

Take a proper course of action

Forward-thinking players have started implementing performance analytics engines. Their vision is clearly the reduction of operational effort, and their mission is to make everyone act as a performance engineer. In the past, specialists were responsible for correlating those insights. Such experts are aware of many slowdown patterns and identify them quickly.

In the past, experienced engineers transformed performance metrics into actionable insights. However, in recent years automation has slowly been taking over. Algorithms now decide whether the user experience is acceptable and point out the root cause of performance slowdowns. Such artificial performance advisers are still at an early stage, but there is immense potential.

The need for Continuous Performance Optimization

Nowadays, information technology is at the heart of every business. Outages or slowdowns in critical software components often impact the whole organization.

According to research from Aberdeen, performance is still an afterthought. A minority of 3% identify the source of delays, and just 9% perform root-cause analysis of application problems. Response time measurement is also widely ignored: only one in five organizations collects response time metrics. It seems that continuous performance optimization is largely neglected. In this post, I will explain some reasons for this surprising development and give you advice on how to avoid these pitfalls.

Improve and forget is bad advice

Successful businesses learned years ago that slow-responding applications are both a nightmare for support teams and a frustrating experience for users. Naturally, they walked through a valley of tears for some time and struggled with response time issues.

However, they transformed their software development pipeline towards performance and considered it from day one. Non-functional requirements exist from day one, and the whole delivery team knows how to examine them in the design, implementation, and testing stages. And yet, once their new products have passed all performance tests and are deployed to production, performance suddenly degrades.

Close the loop

Performance engineering is a continuous process. A perfectly designed and conducted load and performance test mitigates the risk that the new application cannot handle the simulated load under certain data constraints. Both the workload and the data volume can change quickly in production. So, did you design performance tests that address all those uncertainties?

One of the pitfalls in our performance engineering space is neglecting continuous performance optimization. Certainly, you keep performance considerations in mind during application design and testing. You eliminate response time hotspots in pre-production stages. The problem is often that, due to changes in the environment, data, or user activities, user experience degrades and nobody is aware of this development.

Continuous performance engineering requires a closed-loop approach. Start early in the life cycle, repeat the measurements regularly, and extend performance reviews into production environments. It makes much sense to share performance metrics collected in production with developers and testers because they can support your troubleshooting and adjust their performance testing approach to the current situation.

All things considered, performance is more a journey than a destination.

Performance Testing in a Dynamic World

In many fields such as finance, engineering, or politics, groundbreaking changes are ongoing. However, our human ability to adapt to new situations will help us to deal with these disruptions.

In this post, I will shine a light on challenges in software engineering, more specifically on load and performance testing in a dynamic environment.

What are the difficulties we are facing?

For many years we planned performance tests well in advance. Requirements engineers documented non-functional aspects. Software developers designed and implemented the new system with these requirements in mind. Finally, testing teams verified and validated the requirements and handed the new product over to operations teams.

This stage-by-stage approach is disappearing more and more in an agile world. Nowadays, a single team is responsible for design, implementation, test, and operation of the new product. Excellent collaboration is fundamental to the success of teams operating in this mode. When it comes to load and performance testing, the biggest hurdles are time constraints, the frequency of changes, and a system under test that is often only partially available.

What are the pillars of a dynamic performance testing approach?

First of all, you need to work on your application and environment monitoring. If you are not able to capture all transactions in development or production stages, you'll lose too much time on troubleshooting. Ideally, you integrate real user, application performance, and component monitoring, and you share all metrics with your project members.

Secondly, implement and continually execute service-based performance tests. Even if your new system is not completely integrated, it makes sense to evaluate the response times of your new services under multi-user load conditions. Provide the results of those tests in online dashboards and grant access to the whole team. Set thresholds for your most important performance metrics, such as throughput, error rate, and response time, and clearly communicate any violation.
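As a rough illustration, the Node.js sketch below fires a handful of parallel requests against a service and checks the error rate and 90th percentile response time against thresholds. Endpoint, user count, and limits are examples; a real load testing tool gives you far more control over ramp-up and reporting.

// Minimal sketch: a tiny multi-user service check with thresholds for error rate and response time
var https = require('https');

var ENDPOINT = 'https://api.example.com/health'; // example endpoint
var VIRTUAL_USERS = 25;
var MAX_P90_MS = 500;
var MAX_ERROR_RATE = 0.01;

var times = [], errors = 0, finished = 0;

function evaluate() {
    if (++finished < VIRTUAL_USERS) { return; }
    times.sort(function (a, b) { return a - b; });
    var p90 = times.length ? times[Math.ceil(0.9 * times.length) - 1] : Infinity;
    var errorRate = errors / VIRTUAL_USERS;
    console.log('90th percentile: ' + p90 + ' ms, error rate: ' + (errorRate * 100).toFixed(1) + ' %');
    if (p90 > MAX_P90_MS || errorRate > MAX_ERROR_RATE) {
        process.exit(1); // threshold violated - surface it on your dashboard or fail the build
    }
}

function oneRequest() {
    var start = Date.now();
    https.get(ENDPOINT, function (res) {
        res.on('data', function () {});
        res.on('end', function () {
            if (res.statusCode !== 200) { errors++; }
            times.push(Date.now() - start);
            evaluate();
        });
    }).on('error', function () { errors++; evaluate(); });
}

for (var i = 0; i < VIRTUAL_USERS; i++) { oneRequest(); }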

Finally, don't forget end-to-end performance tests of the fully integrated application. While service-based tests are required to find issues in early stages, the end-to-end test in a close-to-production environment is the final validation and absolutely required.

Don’t forget that performance engineering is more a journey than a destination.