Application Performance Monitoring and how to use your monitoring metrics
One of the biggest players in the application performance monitoring (APM) industry recently went public. As the company is widely regarded as the market leader, this move came as no surprise. About two weeks ago the same company announced another killer feature to be integrated into its cloud-native monitoring suite. In my experience, APM is key for any company that wants to deliver better IT services to its customers.
In this post I’ll share with you some outstanding use cases for application performance monitoring.
The vast majority of our applications are designed to provide a better user experience and simplify complex processes. We’re all facing strong competition, so capturing the attention of our customers is a growing challenge. Outstanding products combined with the best presentation certainly make a great start – but we all know how quickly things can turn sour. Once user experience suffers or user behavior changes, you need to be alerted so that you can implement the required improvements. Don’t just wait for customer feedback, because it often comes too late. Also, research shows that only 10 percent of our clients actually take the time to report a negative user experience. The others just stop using the services.
- Use the latest user-experience metrics such as Visually Complete or Speed Index
- Act immediately if the user experience drops
- Remember that geographical location can have a massive impact
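To make the two metrics above more concrete, here is a minimal sketch of how they can be derived from visual-progress samples. Speed Index is commonly defined as the area above the visual-completeness curve, so a page that paints most of its content early scores better even if it finishes at the same time. The function names and the sample data are illustrative assumptions, not the output of any particular APM product.

```python
def speed_index(samples):
    """Integrate (1 - visual completeness) over time, in milliseconds.

    samples: (time_ms, completeness) pairs sorted by time, with
    completeness between 0.0 and 1.0. Completeness is treated as a
    step function that holds its value until the next sample.
    """
    total = 0.0
    prev_time, prev_complete = 0.0, 0.0
    for time_ms, complete in samples:
        # Area of the "still incomplete" part of the page for this interval.
        total += (time_ms - prev_time) * (1.0 - prev_complete)
        prev_time, prev_complete = time_ms, complete
        if complete >= 1.0:
            break
    return total


def visually_complete(samples):
    """Return the first timestamp at which the page is fully painted, or None."""
    for time_ms, complete in samples:
        if complete >= 1.0:
            return time_ms
    return None


# Hypothetical measurements: both pages finish painting at 2000 ms,
# but the first one shows 80 percent of its content after 500 ms.
fast_paint = [(500, 0.8), (2000, 1.0)]
slow_paint = [(1500, 0.1), (2000, 1.0)]

if __name__ == "__main__":
    for name, samples in (("fast_paint", fast_paint), ("slow_paint", slow_paint)):
        print(name, visually_complete(samples), round(speed_index(samples)))
```

Both pages report the same Visually Complete time, yet the Speed Index of the second one is more than twice as high, which is why it’s worth tracking both.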
Hardware is now relatively cheap compared to the investments we make in the latest business applications. The flip side of this development is that we often forget to keep resource utilization in mind when programming new apps. We simply skip the design phase and start coding the features. When such an application is deployed to your customers, you’ll be surprised at how slowly it loads. You’ll need more hardware than planned, and scalability may fall far short of your expectations. The alternative is to continuously monitor all service flows and benchmark capacity requirements during the development, performance-testing and production stages, which gives your team the all-important transparency about the actual needs.
- Avoid both over-sizing and under-sizing
- Size your stack from an application perspective
- Don’t use hardware to compensate for poor design
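As a rough illustration of sizing from an application perspective, the sketch below applies Little’s law (requests in flight = arrival rate × time in the system) to measured values. The function name, the 70 percent utilization target and all numbers are assumptions chosen for the example; real sizing should be based on your own benchmarks.

```python
import math


def instances_needed(peak_rps, avg_service_time_s, workers_per_instance,
                     target_utilization=0.7):
    """Estimate the instance count for a measured peak load.

    All inputs are assumed to come from monitoring or load-test data.
    """
    if min(peak_rps, avg_service_time_s, workers_per_instance) <= 0:
        raise ValueError("inputs must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Little's law: average number of requests being processed at once.
    in_flight = peak_rps * avg_service_time_s
    # Leave headroom so the workers are not planned at 100 percent.
    usable_workers = workers_per_instance * target_utilization
    return math.ceil(in_flight / usable_workers)


if __name__ == "__main__":
    # Hypothetical service: 400 requests/s at peak, 16 workers per instance.
    print(instances_needed(400, 0.25, 16))  # slow code path, 250 ms per request
    print(instances_needed(400, 0.05, 16))  # after tuning, 50 ms per request
```

With these example numbers the slow code path needs nine instances and the tuned one needs two, which shows how quickly hardware ends up compensating for design.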
No one wants to read auto-generated emails such as incident-alert notifications. They make tedious reading and are often simply ignored. Why waste valuable time when there’s a much better strategy? By using AI and machine learning, you can analyze the issues, automatically detect the root cause of incidents, and present this information to your engineers. There’s no reason to keep sending out a notification for every single incident; use the available technology instead and let the algorithm do the analysis and simplification. Don’t spend your costly human resources on such tasks. In fact, nobody is capable of manually checking all the dependencies, which can number several million in our complex work environments.
Automated problem detection and root-cause analysis are here to stay. These tools completely change the way we investigate issues: they reduce the alert storms generated on a daily basis and free your employees for more creative tasks such as innovation and automation.
- Use AI and machine learning to automatically detect the root cause of problems
- Reduce automated alert storms dramatically by utilizing this problem reporting
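The idea of collapsing an alert storm into a single problem can be illustrated with a deliberately simple heuristic: given a dependency map, report only those alerting components that have no alerting dependency further down the call chain. This is a toy sketch and not how any commercial AI engine works; the component names are hypothetical and the dependency graph is assumed to be acyclic.

```python
def root_causes(depends_on, alerting):
    """Return alerting components with no alerting dependency beneath them.

    depends_on: dict mapping a component to the components it calls.
    alerting:   iterable of components that currently raise an alert.
    """
    alerting = set(alerting)

    def has_alerting_dependency(component):
        # Walk all direct and indirect dependencies of the component.
        seen = set()
        stack = list(depends_on.get(component, ()))
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            if dep in alerting:
                return True
            stack.extend(depends_on.get(dep, ()))
        return False

    return sorted(c for c in alerting if not has_alerting_dependency(c))


# Hypothetical service map for illustration only.
DEPENDS_ON = {
    "web": ["checkout", "search"],
    "checkout": ["payments", "db"],
    "search": ["db"],
    "payments": ["db"],
}

if __name__ == "__main__":
    # Four alerts fire, but only one component is the likely origin.
    print(root_causes(DEPENDS_ON, ["web", "checkout", "search", "db"]))
```

In this example four alerts are reduced to one problem located at the database, which is the kind of simplification your engineers should receive instead of four separate emails.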
Drawing a data-flow diagram for applications gives you a better idea of the volume of requests that need to be processed by a certain component. Such flow diagrams are essential, but does the code always reflect such a static dependency mapping? Imagine a scenario where an application reads all customers who have purchased a certain airline flight from a database. The developer has several ways of coding this requirement. In the worst case, the code reads the records one by one from the database. Obviously, this laborious process has a massive impact on the overall user experience, and the more customers are listed in the database, the longer the processing takes.
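The airline scenario can be sketched with an in-memory SQLite database. The schema, the flight code and the function names are made up for illustration; the point is only the number of database round trips each variant produces.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with hypothetical demo data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bookings (customer_id INTEGER, flight TEXT);
    """)
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [(i, f"customer-{i}") for i in range(1, 6)])
    conn.executemany("INSERT INTO bookings VALUES (?, ?)",
                     [(1, "XY123"), (2, "XY123"), (3, "XY123"), (4, "ZZ999")])
    return conn


def passengers_row_by_row(conn, flight):
    """Worst case: one query for the bookings, then one query per customer."""
    ids = [row[0] for row in conn.execute(
        "SELECT customer_id FROM bookings WHERE flight = ? ORDER BY customer_id",
        (flight,))]
    queries = 1
    names = []
    for customer_id in ids:
        row = conn.execute("SELECT name FROM customers WHERE id = ?",
                           (customer_id,)).fetchone()
        names.append(row[0])
        queries += 1
    return names, queries


def passengers_single_query(conn, flight):
    """Set-based alternative: a single join returns the same result."""
    rows = conn.execute(
        "SELECT c.name FROM customers c "
        "JOIN bookings b ON b.customer_id = c.id "
        "WHERE b.flight = ? ORDER BY c.id", (flight,)).fetchall()
    return [row[0] for row in rows], 1


if __name__ == "__main__":
    conn = build_demo_db()
    print(passengers_row_by_row(conn, "XY123"))
    print(passengers_single_query(conn, "XY123"))
```

Both functions return the same three passengers, but the first needs four queries and grows by one query per additional customer, while the second always needs one. A service-flow view makes this difference visible even when the whiteboard shows a single arrow to the database.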
Automated Design Validation puts you in the driver’s seat. Use APM and transaction-tracing tools to create a service flow, and then check whether the whiteboard drawing matches the actual situation. In many cases you’ll find that something went wrong or that far more requests are being processed than expected. Make sure you fix such issues; otherwise they’ll hit you during the load and performance testing.
- Flow diagrams and dependency mapping are both important
- Whiteboard drawings alone are not to be trusted
- Automated design validation using real service flows provides all the facts
Although load generation and response-time validation are an essential starting point, they won’t give you insight into the real hotspots in your IT services. Those who try to use system-resource monitoring to find the cause of an application slowdown soon learn that this route leads to a dead end. Running a performance test over and over again won’t uncover a hotspot either. Because business services are now widely distributed across hundreds of components, even log-file monitoring is no longer effective in detecting the cause of a bottleneck.
The only effective working method is to have a full-stack monitoring solution in place that traces all transactions across all components and hosts involved. No matter which element is causing the slowdown, you’ll see how the requests flow through your application and which layer is causing the hotspot. You’ll need to start the horizontal analysis from an end-user’s perspective and follow it to the service or infrastructure component that’s taking the most time. Proceed with the vertical analysis to see what needs to be done in your application stack.
- Full-stack monitoring and transaction tracing are key for any tuning exercise
- Start with horizontal analysis to identify the problematic layer
- Proceed with vertical analysis to identify what needs to be improved in your stack
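The horizontal analysis described above can be approximated on a single trace by computing each span’s self time (its duration minus the time spent in its child spans) and summing it per service. The span format and service names below are assumptions for this sketch and do not follow any specific tracing product.

```python
from collections import defaultdict


def self_time_by_service(spans):
    """Sum the self time of all spans per service.

    Each span is a dict with the keys "id", "parent", "service" and
    "duration_ms"; "parent" is None for the root span.
    """
    child_total = defaultdict(float)
    for span in spans:
        if span["parent"] is not None:
            child_total[span["parent"]] += span["duration_ms"]

    totals = defaultdict(float)
    for span in spans:
        # Children may run in parallel, so clamp the self time at zero.
        self_ms = max(span["duration_ms"] - child_total[span["id"]], 0.0)
        totals[span["service"]] += self_ms
    return dict(totals)


def hotspot(spans):
    """Return the service that contributes the most self time."""
    totals = self_time_by_service(spans)
    return max(totals, key=totals.get)


# Hypothetical trace of one end-user request.
TRACE = [
    {"id": "a", "parent": None, "service": "browser", "duration_ms": 1200},
    {"id": "b", "parent": "a", "service": "web-frontend", "duration_ms": 1000},
    {"id": "c", "parent": "b", "service": "booking-service", "duration_ms": 900},
    {"id": "d", "parent": "c", "service": "database", "duration_ms": 750},
    {"id": "e", "parent": "b", "service": "cache", "duration_ms": 20},
]

if __name__ == "__main__":
    print(self_time_by_service(TRACE))
    print(hotspot(TRACE))
```

Here the end user waits 1200 ms, and following the request across the layers shows that most of that time is spent in the database. That is where the vertical analysis of the stack would start.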
One of the often overlooked use cases for APM is real-time insight into the dynamics of your ongoing operations. People might expect reliable IT services practically out of the box, but the first thing to look at is how a business application creates value for your company. Imagine your team is in charge of a trading portal and you agree with them on charging a fee per transaction. You could wait for the month-end or quarter-end reporting to see whether your team is on track. By then it may be too late to react, so you should consider a more advanced approach based on business analytics.
It’s all about how quickly you make the required decisions. Real-time business analytics support your decision-making process and place you at the head of the pack. Viewing all the business figures in your monitoring cockpit enables you to make the decisions that best support your business.
- Use APM to provide real-time information on the dynamics of your business
- Chart business figures on information radiators
- Let APM support your decision-making process through business analytics
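For the trading-portal example, a real-time business figure could be as simple as the fee revenue collected in a sliding time window. The class below is a minimal sketch under the assumptions that fees are recorded in cents and that events arrive in time order; the class name and all numbers are hypothetical.

```python
from collections import deque


class RollingFeeRevenue:
    """Track the fees collected within the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self._events = deque()  # (timestamp_s, fee_cents), oldest first
        self._total_cents = 0

    def record(self, timestamp_s, fee_cents):
        """Register the fee of one completed transaction."""
        self._events.append((timestamp_s, fee_cents))
        self._total_cents += fee_cents
        self._evict(timestamp_s)

    def total(self, now_s):
        """Return the fee revenue inside the window that ends at `now_s`."""
        self._evict(now_s)
        return self._total_cents

    def _evict(self, now_s):
        # Drop events that have fallen out of the window.
        while self._events and self._events[0][0] <= now_s - self.window_s:
            _, old_fee = self._events.popleft()
            self._total_cents -= old_fee


if __name__ == "__main__":
    revenue = RollingFeeRevenue(window_s=60)
    revenue.record(0, 150)
    revenue.record(30, 150)
    revenue.record(70, 200)
    print(revenue.total(70))  # the event at 0 s has left the window
    print(revenue.total(95))  # the event at 30 s has left as well
```

Charting such a value next to a target line on an information radiator shows within minutes, rather than at quarter-end, whether the team is on track.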
Do you have questions about APM? I’ll be happy to demonstrate how APM can simplify your daily work and leave you free to focus on innovation and optimization.
Happy Performance Engineering!