Reaching the Summit of Web Performance with OpenSpeedMonitor – Part 2

In Part 1 of the interview, Uwe Beßle and Nils Kuhn of iteratec gave us an overview of the OpenSpeedMonitor, a tool for measuring the performance of websites. In this second part, we have a look at how the OpenSpeedMonitor/WebPagetest environment is used at OTTO.

Oliver: How do we use the OpenSpeedMonitor at OTTO, and how does it help our agile development teams?

Nils and Uwe: The OpenSpeedMonitor is part of the WPT measurement environment that iteratec built specifically for OTTO. We take care of its operation on an ongoing basis.

Figure 1: WPT measurement environment for OTTO

Alongside the OpenSpeedMonitor, there are three WebPagetest servers for regular monitoring measurements and for individual analysis measurements. Each WebPagetest server controls a number of agents and holds the results of each individual measurement. For the measurements themselves, there are six agents with popular desktop browsers (Firefox, Chrome and Internet Explorer) and six mobile agents (three iOS and three Android devices).

With this equipment we can regularly measure key pages of the OTTO website and run precisely the same measurements against the most important test environments. During a Sprint, the teams can then test and analyse at an early stage how the latest developments affect page-load performance.

For the regular performance measurements of the OTTO website, a separate team evaluates the results on a day-to-day basis. If the measurements show any abnormalities, the agile development teams are brought on board so that they can analyze any deterioration in performance metrics right from the start.

Another important part of our monitoring is continuous benchmarking of the OTTO website against its main market competitors.

Oliver: In the Lhotse project, we completed a performance sprint as a way of accomplishing our ambitious performance targets. Could you tell us briefly how the sprint went, how the monitor helped us, and what impact it had on ongoing development?  

Nils and Uwe: The performance sprint was certainly a learning experience, and I would not have missed it for anything. For one thing, it is an impressive testament to what can be achieved through agile approaches in software development, especially in a project as large as Lhotse. Thanks again to all the teams that delivered such remarkable performance improvements across all vertical systems within just a few days.

Figure 2: Daily performance improvements during the performance sprint

The success would have been impossible without the careful groundwork beforehand – from elaborating the parameters of the Customer Satisfaction Index (CSI), establishing a dedicated WebPagetest measurement environment for OTTO, to taking the OpenSpeedMonitor live for regular measurements. Thanks to the CSI, we knew what goals we needed to achieve on performance. And thanks to our regular measurements, we always knew exactly where we stood on any one aspect, and where to find the greatest potential for optimising the client performance of our pages.

As a result, within just a few days we were able to prepare the kick-off for the Performance Sprint, at which we presented clear-cut objectives, problem analysis and solution pathways to the teams. The teams could then go about their work knowing precisely what goals they were working toward. Highly detailed metrics were delivered on an ongoing daily basis during the Sprint, enabling us to keep close tabs on progress and chart improvements in accessible visualised formats, such as with before/after videos.

However, the performance sprint was also instructive for us in another respect. The need to conduct a performance sprint in the first place demonstrated that there is little use in measuring performance regularly if no one actually evaluates the results every single day and takes action when something goes wrong. We had the measurements established long before the performance sprint, and all teams had access to the data. Nevertheless, in the weeks prior to the performance sprint, more and more performance degradations had crept in. We learned from this, and today we evaluate data on a regular basis and are proactive in resolving problems.
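To illustrate what a daily evaluation of this kind could look like, here is a minimal sketch in Python. It is not how the OpenSpeedMonitor works internally; the function name `find_regressions`, the seven-day baseline, the 10 % tolerance and the sample load times are all assumptions made up for the example. The idea is simply to compare each day's measured load time with the median of the preceding days and flag days that are clearly slower.

```python
from statistics import median


def find_regressions(daily_load_times_ms, baseline_days=7, tolerance=0.10):
    """Return the indices of days that look like performance regressions.

    A day is flagged when its load time exceeds the median of the
    preceding `baseline_days` days by more than `tolerance` (a fraction).
    All parameters are hypothetical defaults chosen for illustration.
    """
    flagged = []
    for day in range(baseline_days, len(daily_load_times_ms)):
        # Median of the trailing window, excluding the day under test.
        baseline = median(daily_load_times_ms[day - baseline_days:day])
        if daily_load_times_ms[day] > baseline * (1 + tolerance):
            flagged.append(day)
    return flagged


# Invented daily load times in milliseconds for one measured page.
load_times = [2100, 2080, 2120, 2090, 2110, 2100, 2095, 2105, 2400, 2450]
print(find_regressions(load_times))  # prints [8, 9]
```

A median baseline is used here because it is less sensitive to a single outlier measurement than a mean. In practice the window and tolerance would need tuning per page and browser.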

Oliver: How much effort should companies such as OTTO be investing in optimising the performance of their websites?

Nils and Uwe: That’s the million-dollar question, and it is not an easy one to answer. If I under-invest and tolerate very poor page performance, that can completely sideline my site or shop. If I over-invest, I am at the other end of the curve, where I can squander any amount of money on optimising performance without any significant improvement in site use by customers and users.

As a basic rule, it’s good to look beyond your own site and see what others are doing. If my performance is considerably worse than that of my main competitors, I have to invest to catch up.

As long as I’m more or less on par with my main competitors, it’s generally worth investing more, because at this level there is usually a linear correlation between effort, performance and customer buying.

If I am already way ahead of my competitors, it is generally not worth upping the investment in performance optimisation. Regular measurements are still necessary so as to maintain this lead.

The other important benchmark is customer expectations, which we use as the basis for our Customer Satisfaction Index (CSI). As the last customer survey has shown, these expectations are rising.

Figure 3: Increasing customer performance expectations from 2012 to 2014

Oliver: How do the measurements differ from Real User Monitoring (RUM)? Is RUM not sufficient to monitor performance? After all, only a real browser can reflect the reality of the customer.

Nils and Uwe: Yes, this question has already been the subject of much, sometimes intense, debate. But the way we see it, this is the wrong question to ask. It is not a matter of choosing either synthetic monitoring or real user monitoring. To monitor the performance of bespoke software in a responsible way, you need both, and at OTTO we do both. The next graphic shows a comparison of our synthetic and our real user monitoring.

Figure 4: Comparison between Real User Monitoring Median (green) and synthetic WPT monitoring (red and yellow)

So the real question is deciding how each should be deployed. What should synthetic monitoring be used for, and where should we apply real user monitoring?

Real user monitoring is the only right way to monitor how customers of the OTTO website actually experience page load times. Any attempt to recreate the diversity of real-life customer scenarios using nothing more than synthetic monitoring is doomed to fail.

But if you want to analyse why performance is good or bad, and whether a change is due to software changes, spam congestion on the internet or new versions of the Firefox browser, real user monitoring alone will leave you high and dry. You need synthetic monitoring to deliberately control and investigate the many factors that have an impact on performance. In addition, synthetic monitoring makes it possible to capture far more detailed and extensive data:

  • Detailed data on each individual request
  • Screenshots and even videos at 60 fps documenting how the displayed page content is gradually built
  • Request and response headers
  • Values for CPU utilization
  • etc.

In short, synthetic monitoring can tell us whether we have done everything right in developing and running the OTTO website, and where there is potential for improvement. Real user monitoring can then serve to investigate what is actually hitting the mark with customers, and how.

Oliver: Is the software already advanced enough to measure correlations between the performance of websites and their economic KPIs?

Nils and Uwe: No, this won’t work with OpenSpeedMonitor because these are all synthetic measurements. This is more a matter for real user monitoring. You have to correlate your RUM metrics with your business KPIs, e.g. in your online analytics tool. This will generate quite interesting results.
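As a rough illustration of what correlating RUM metrics with a business KPI can mean, the following Python sketch computes the Pearson correlation coefficient between a daily RUM median load time and a daily conversion rate. The function name `pearson` and all figures are invented for the example; a real analysis would use data exported from the RUM and analytics tools and would have to account for confounding factors such as campaigns or seasonality.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Invented sample data: one value per day.
rum_median_load_ms = [1800, 2000, 2200, 2400, 2600]
conversion_rate_pct = [3.4, 3.3, 3.0, 2.9, 2.6]

r = pearson(rum_median_load_ms, conversion_rate_pct)
print(f"correlation between load time and conversion rate: {r:.2f}")
```

A value close to -1 in such a sketch would indicate that slower days tend to coincide with lower conversion. It does not by itself show that load time is the cause.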

This interview will be continued:

  • Part 3: What’s next? – OpenSpeedMonitor Roadmap