Transform Performance Metrics to Actionable Insights

Digital services are eating the world. Billions of websites provide almost everything you can imagine, and the competition is on the rise. End users are increasingly in the driving seat because they decide which service they will use. Digital customers often abandon services which provide a bad user experience. Therefore, excellent application performance has become a top priority for many organizations. After outlining both performance metrics and the term actionable insights, I will illustrate how you can transform the former into the latter.

According to Techopedia, actionable insights are

“analytics result that provides enough data to make an informed decision.”

When it comes to performance engineering, specialists collect metrics which help them understand the root cause of hotspots. Decades ago, it was very common to solve slowdowns with additional hardware. Those times are gone. Nowadays, engineers have to deal with questions such as:

  • Why is the number of frustrated users on the rise?
  • Why is page speed score below the target?
  • Why does service response time exceed allowed thresholds?
  • Why is the average size of our web page > 1 MB?
  • Why is our system not scalable?
  • Why is our application running out of memory?

When it comes to answering those questions, you can follow a trial-and-error approach or derive the tuning recommendation from your collected performance metrics. I highly recommend the latter because the former will result in endless war rooms and trial-and-error exercises.

What are the typical performance metrics?

1. User layer. Modern real user monitoring solutions provide powerful insights into the activities performed by end users on your applications. They often use a JavaScript injection approach and add small functions to your web pages, which allows detailed capturing of last-mile performance figures such as actions performed, client errors, actual bandwidth, client times, rendering time and much more.

2. Service layer. User interactions lead to service calls, and whenever a slowdown arises you should be able to identify the time being spent in your middleware. Typically, those service-based metrics include response times, throughput, error rates, exceptions, 3rd party service call figures, heap statistics and much more.

3. System layer. Network, CPU, memory and IO metrics are also critical factors for our applications. If they are often above acceptable thresholds, they can quickly influence overall application performance.

Extensive data analytics and data processing

Almost every business application collects some log file data, but if you intend to nail down performance hotspots, the information provided in log files is most likely not useful. You will need to take each and every user transaction into account, including your horizontal components such as web servers, application servers, and databases as well as 3rd party services. Obviously, you will deal with big data volumes. Monitoring and testing platforms are nowadays well equipped to manage millions of such transactions.
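As a minimal sketch of what such an analysis can look like, the following Python snippet sums the time spent per tier across all captured transactions and reports the tier with the largest share. The transaction records, tier names and timings are made up for illustration; a real monitoring platform will deliver this data in its own format.

```python
from collections import defaultdict

# Hypothetical per-transaction timings in milliseconds, one value per tier.
transactions = [
    {"id": "t1", "web": 20, "app": 120, "db": 400, "third_party": 60},
    {"id": "t2", "web": 25, "app": 110, "db": 900, "third_party": 50},
    {"id": "t3", "web": 15, "app": 100, "db": 300, "third_party": 70},
]


def time_share_by_tier(transactions):
    """Return each tier's share (0..1) of the total time over all transactions."""
    totals = defaultdict(float)
    for tx in transactions:
        for tier, millis in tx.items():
            if tier != "id":  # skip the transaction identifier
                totals[tier] += millis
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {tier: total / grand_total for tier, total in totals.items()}


shares = time_share_by_tier(transactions)
# The tier with the largest share is the first candidate for a closer look.
hotspot = max(shares, key=shares.get)
print(hotspot, round(shares[hotspot], 2))
```

With millions of transactions you would run the same aggregation inside your monitoring or analytics platform rather than in a script, but the idea stays the same: look at all transactions across all tiers, not at a few log lines.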

Take a proper course of action

Forward-thinking players have started implementing performance analytics engines. Their vision is clearly the reduction of operational effort, and their mission is to enable everyone to act as a performance engineer. In the past, specialists were responsible for the correlation of those insights. Such experts are aware of many slowdown patterns, and their radar identifies those quickly.

In the past, experienced engineers transformed performance metrics into actionable insights. However, in recent years automation has slowly been taking over. Algorithms decide immediately whether the user experience is acceptable and point out the root cause of performance slowdowns. Such artificial performance advisers are still in their early stages, but there is immense potential.
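One simple way to let an algorithm judge user experience is an Apdex-style score. This is my own illustration, not necessarily what any particular product uses: response times up to a threshold T count as satisfied, up to 4T as tolerating, and anything slower as frustrated. The sample values, the threshold and the target of 0.85 are assumptions.

```python
def apdex(response_times, threshold):
    """Apdex-style score: (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one response time sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical response times in seconds and a threshold T of 0.5 s.
samples = [0.3, 0.4, 0.6, 1.2, 2.5, 0.2]
score = apdex(samples, threshold=0.5)

# Hypothetical acceptance target; choose one that fits your service.
TARGET = 0.85
acceptable = score >= TARGET
print(f"score={score:.2f} acceptable={acceptable}")
```

A score like this answers the question "is the user experience acceptable?" automatically. Pointing out the root cause still requires correlating it with service and system metrics.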

The need for Continuous Performance Optimization

Nowadays, information technology is at the heart of every business. Outages or slowdowns in critical software components often impact the whole organization.

According to research from Aberdeen, performance is still an afterthought. A minority of 3% identify the source of delays and just 9% perform root-cause analysis of application problems. Response time measurement is also widely ignored: only one in five organizations collects response time metrics. It seems that continuous performance optimization is largely neglected. In this post, I will explain some reasons for this surprising development and give you advice on how to avoid those pitfalls.

Improve and forget is bad advice

Successful businesses learned years ago that slow-responding applications are both a nightmare for support teams and a frustrating experience for users. Naturally, they walked through a valley of tears for some time and struggled with response time issues.

However, they transformed their software development pipeline towards performance and often considered performance from day one. Non-functional requirements exist from the start, and the whole construction team knows how to examine them in the design, implementation and testing stages. Yet once their new products have passed all performance tests and been deployed into production, performance suddenly degrades.

Close the loop

Performance engineering is a continuous process. A perfectly designed and conducted load and performance test mitigates the risk that the new application cannot handle the simulated load under certain data constraints. However, both the workload and the data volume can change quickly in production. So, did you design performance tests which addressed all those uncertainties?

One of the pitfalls in our performance engineering space is neglecting continuous performance optimization. Certainly, you keep performance considerations in mind during application design and test. You eliminate response time hotspots in pre-production stages. The problem is often that, due to changes in the environment, data or user activities, user experience degrades and nobody is aware of this poor development.

Continuous performance engineering requires a closed-loop approach. Start early in the life cycle, repeat the measurements regularly and extend performance reviews into production environments. It makes a lot of sense to share performance metrics collected in production with developers and testers because they can support your troubleshooting and adjust their performance testing approach to the current situation.
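Closing the loop can start very small. The sketch below compares the 95th percentile response time measured in production with the baseline from your last performance test and flags a regression. The sample values, the nearest-rank percentile method and the 20% tolerance are assumptions chosen for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def has_regressed(baseline, production, pct=95, tolerance=0.20):
    """True if the production percentile exceeds the baseline by more than the tolerance."""
    return percentile(production, pct) > percentile(baseline, pct) * (1 + tolerance)


# Hypothetical response times in milliseconds.
baseline_ms = [200, 210, 220, 230, 240, 250, 260, 270, 280, 300]    # last load test
production_ms = [210, 230, 250, 270, 300, 320, 350, 400, 450, 520]  # current production

if has_regressed(baseline_ms, production_ms):
    print("p95 response time degraded, share the metrics with developers and testers")
```

Run on a schedule, a check like this makes sure somebody notices when the environment, data or user behavior has drifted away from what was tested.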

All things considered, performance is more a journey than a destination.

Performance Testing in a Dynamic World

In many fields such as finance, engineering or politics, groundbreaking changes are under way. However, our human ability to adapt to new situations will help us deal with these disruptions.

In this post, I will shine a light on challenges in software engineering, more specifically, on load and performance testing in a dynamic environment.

What are the difficulties we are facing?

Over many years we’ve planned performance tests in advance. Requirement engineers documented non-functional aspects. Software developers designed and implemented the new system with the requirements in mind. Finally, testing teams verified and validated the requirements and handed the new product over to operation teams.

This stage-by-stage approach is disappearing more and more in an agile world. Nowadays, a single team is responsible for design, implementation, test and operation of the new product. Excellent collaboration is fundamental to the success of teams operating in this mode. When it comes to load and performance testing, the biggest hurdles are time constraints, the frequency of changes and a system under test that is often only partially available.

What are the pillars of a dynamic performance testing approach?

First of all, you need to work on your application and environment monitoring. If you are not able to capture all transactions in development or production stages, you’ll lose too much time with troubleshooting. Ideally, you integrate real user, application performance, and component monitoring and share all metrics with your project members.

Secondly, implement and continually execute service-based performance tests. Even if your new system is not completely integrated, it makes sense to evaluate response times of your new services under multi-user load conditions. Provide results of those tests in online dashboards and grant access to the whole team. Set thresholds for your most important performance metrics such as throughput, error rate and response time, and clearly communicate any violation.

Finally, don’t forget end-to-end performance tests of the fully integrated application. While service-based tests are required to find issues in early stages, the E2E test in a production-like environment is the final validation and absolutely required.
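The threshold check mentioned for service-based tests can be sketched in a few lines of Python. The result structure, the service name and the threshold values are hypothetical; in practice the numbers come from your load testing tool and the limits from your non-functional requirements.

```python
from dataclasses import dataclass


@dataclass
class ServiceResult:
    """Summary of one service-based performance test run (hypothetical structure)."""
    service: str
    throughput_rps: float    # requests per second
    error_rate: float        # fraction of failed requests, 0..1
    p95_response_ms: float   # 95th percentile response time in milliseconds


# Hypothetical thresholds; derive real ones from your requirements.
THRESHOLDS = {
    "min_throughput_rps": 50.0,
    "max_error_rate": 0.01,
    "max_p95_response_ms": 800.0,
}


def violations(result, thresholds):
    """Return a human-readable message for every violated threshold."""
    found = []
    if result.throughput_rps < thresholds["min_throughput_rps"]:
        found.append(f"{result.service}: throughput {result.throughput_rps} rps below minimum")
    if result.error_rate > thresholds["max_error_rate"]:
        found.append(f"{result.service}: error rate {result.error_rate:.2%} above maximum")
    if result.p95_response_ms > thresholds["max_p95_response_ms"]:
        found.append(f"{result.service}: p95 response time {result.p95_response_ms} ms above maximum")
    return found


run = ServiceResult("order-service", throughput_rps=42.0, error_rate=0.002, p95_response_ms=950.0)
for message in violations(run, THRESHOLDS):
    print(message)  # in a pipeline, a non-empty list could fail the build
```

Executed after every test run, such a check turns the dashboard figures into a clear pass or fail signal that the whole team can see.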

Don’t forget that performance engineering is more a journey than a destination.

How do you Manage Security Risks in Open Source?

Open source is at the heart of almost every application. If you have ever developed a new application from scratch, chances are very high that you built it on open source. In this post, I will outline security risks related to open source and give you a mitigation approach.

Reasons for open source

According to Gartner, 99% of mission-critical application portfolios within Global 2000 companies contain open source components. The complexity of our services is increasing. Users expect easy-to-use and responsive applications. At the same time, IT costs must be reduced. One approach to deal with these growing expectations and limited resources is building new applications on open source libraries, which help developers to speed up their construction time.

Implementing critical functions such as encryption or asynchronous processing can be both time-consuming and challenging because there are many pitfalls involved. One is the in-depth knowledge a particular topic requires, which quickly leads to many hours of research. Another is that the self-made component may be erroneous. Therefore, many developers avoid reinventing the wheel and prefer open source components.

Risks

Your applications consist largely of open source libraries. I assume that you have a robust security test concept in place which also includes secure code scans according to industry standards. But are you also aware of the risks introduced by your open source components?

A static application security testing solution is unable to identify vulnerabilities without the actual source code. Typically, you don’t have the source of the open source libraries used in your business applications at hand, and your code scan solution will not point out any vulnerabilities within them.

Another often ignored risk is the license terms of your open source components. While those libraries are free, neglecting to comply with their requirements may result in business and technical risks.

Mitigations

First of all, you should be aware of all open source libraries used across your applications and development projects. This open source inventory is essential because whenever a breach occurs, you can quickly identify the affected application and apply a bugfix.

Secondly, regularly check your open source libraries for known vulnerabilities. Whenever you are using outdated or vulnerable components, you should consider upgrading to a fixed version.

Finally, track which open source licenses apply to your applications, including those of their dependencies.
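The three steps above can be sketched with a tiny inventory check in Python. Everything in it is hypothetical: the application and library names, the advisory list and the set of approved licenses are placeholders, and the version comparison only handles plain dotted numbers such as 1.2.3. A real setup would pull advisories from a vulnerability database or a dedicated scanning tool.

```python
def parse_version(text):
    """Turn a plain dotted version such as '1.2.3' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical open source inventory across applications.
INVENTORY = [
    {"app": "shop", "library": "libfoo", "version": "1.2.0", "license": "MIT"},
    {"app": "shop", "library": "libbar", "version": "4.0.1", "license": "GPL-3.0"},
    {"app": "billing", "library": "libfoo", "version": "1.2.3", "license": "MIT"},
]

# Hypothetical advisories: versions below 'fixed_in' are considered vulnerable.
ADVISORIES = [{"library": "libfoo", "fixed_in": "1.2.3"}]

# Hypothetical set of licenses your organization has approved.
APPROVED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def vulnerable_entries(inventory, advisories):
    """Inventory entries whose version is older than the fixed version."""
    fixed = {a["library"]: parse_version(a["fixed_in"]) for a in advisories}
    return [
        entry for entry in inventory
        if entry["library"] in fixed
        and parse_version(entry["version"]) < fixed[entry["library"]]
    ]


def license_review_needed(inventory, approved):
    """Inventory entries whose license is not on the approved list."""
    return [entry for entry in inventory if entry["license"] not in approved]


for entry in vulnerable_entries(INVENTORY, ADVISORIES):
    print(f"upgrade {entry['library']} in {entry['app']}")
for entry in license_review_needed(INVENTORY, APPROVED_LICENSES):
    print(f"review license {entry['license']} of {entry['library']} in {entry['app']}")
```

Because the inventory records which application uses which library, a newly published advisory immediately tells you which applications are affected.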

There are several secure code scan platforms out there which also provide an integrated solution for open source secure code analysis. Personally, I recommend using the Checkmarx Application Security Testing (CxSAST) solution.