Machine Learning and Advanced Statistics for Performance Monitoring

Language: English (translated)

Performance is playing an increasingly important role in the IT sector. Performance monitoring helps to determine whether a network is stable enough, whether an application is fast enough, whether users can be expected to be satisfied with a service, and to answer many similar questions. The key ingredient for monitoring is the right set of metrics. To create promising metrics that might become valuable Key Performance Indicators (KPIs), one needs data, deep domain knowledge, and suitable mathematics. In the days of analog storage, collecting data might have been a problem, but this is no longer the case: nowadays data are collected and stored everywhere, all the time. This large volume of (big) data can be used in several ways. Sophisticated methods such as anomaly detection make it possible to deal with problems and get to their root causes even when the sheer amount of available data would take a human administrator relying on common-practice methods too long to analyze. The goal of this talk is to show the role of advanced statistics and machine learning, compared to current common practice, when extracting insights from multiple data sources. Advantages in visualization, bottleneck detection, and alarm creation are presented via practical examples from the performance monitoring field. Open source tools such as the SciPy stack, scikit-learn, InfluxDB, and Grafana can each accomplish only a minor part of the overall task, but their combination in particular has huge potential in this field.
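
To make the idea of anomaly detection for alarm creation more concrete, here is a minimal sketch and not the method presented in the talk. It flags outliers in a series of response times using the modified z-score, which is based on the median and the median absolute deviation (MAD). The function name `mad_anomalies`, the threshold of 3.5, and the sample data are assumptions chosen for illustration; only the Python standard library is used.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; the threshold of 3.5 is a
    commonly used rule of thumb, not a value taken from the talk.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # No spread around the median: the score is undefined, flag nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score when the data are roughly normally distributed.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up response times in milliseconds with one obvious spike.
response_ms = [102, 98, 101, 99, 103, 97, 100, 450, 101, 99, 102, 98]
anomalies = mad_anomalies(response_ms)
print(anomalies)
```

Because the median and the MAD are barely affected by a single spike, the outlier does not inflate its own baseline, which is one reason such robust statistics can be preferable to fixed thresholds or mean-based rules when creating alarms.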