Log Data Analytics

Overview

  • Helpful skills: data analysis, server administration, logging frameworks

Description

Over the last year, we reworked the logging stack of our telemetry setup. We now run an ELK stack on Kubernetes, with Kibana accessible at telemetry.terasology.org. Kibana offers many ways to draw information from the collected warning and error logs, but at the moment we use only its "Discover" mode, if at all.

Your Project

Introducing Analytical Views on Terasology Log Data

As part of your proposal, you should familiarize yourself with the ELK (Elasticsearch, Logstash, Kibana) stack and with logging in Terasology. On the ELK side, focus on how Logstash parses incoming logs and how to query logs stored in Elasticsearch, e.g. using the Kibana Query Language (KQL). Your proposal should suggest ways to analyze the collected log data meaningfully, for instance dashboard elements that aggregate data, or views that relate pieces of information to each other to detect patterns and anomalies and to identify possible causalities. It should also cover any changes that might be required within Terasology, for instance to the log format or to the selection of data that is sent.
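To make "aggregating data" more concrete, here is a minimal Python sketch of the kind of roll-up a dashboard element could show: the number of events at a given level per hour. The sample events and the field names (`timestamp`, `level`, `logger`) are assumptions made for illustration and do not reflect the actual Terasology log format. The second function builds a request body for an Elasticsearch terms aggregation; the aggregation name `top_values` is arbitrary, and the field you pass in depends on your index mapping.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample events; the field names are assumptions for this sketch.
EVENTS = [
    {"timestamp": "2021-03-01T10:05:00", "level": "ERROR", "logger": "rendering"},
    {"timestamp": "2021-03-01T10:40:00", "level": "ERROR", "logger": "network"},
    {"timestamp": "2021-03-01T10:50:00", "level": "WARN", "logger": "rendering"},
    {"timestamp": "2021-03-01T11:15:00", "level": "ERROR", "logger": "rendering"},
]


def count_per_hour(events, level):
    """Count events of the given level, bucketed by the hour they occurred in."""
    counts = Counter()
    for event in events:
        if event["level"] != level:
            continue
        moment = datetime.fromisoformat(event["timestamp"])
        bucket = moment.replace(minute=0, second=0, microsecond=0)
        counts[bucket.isoformat()] += 1
    return dict(counts)


def terms_aggregation_body(field, size=10):
    """Build a search body that returns only the most frequent values of a field."""
    return {
        "size": 0,  # no individual hits, only the aggregation result
        "aggs": {"top_values": {"terms": {"field": field, "size": size}}},
    }


if __name__ == "__main__":
    print(count_per_hour(EVENTS, "ERROR"))
    print(terms_aggregation_body("level"))
```

In Kibana the same roll-up would normally be configured as a visualization rather than coded by hand; the sketch is only meant to show what the resulting numbers represent.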

Helpful Links

Set up Elasticsearch and Kibana https://www.elastic.co/guide/en/elastic-stack-get-started/current/get-started-elastic-stack.html

Overview of Logstash (video) https://www.elastic.co/webinars/getting-started-logstash?baymax=rtp&elektra=docs&storm=sidebar1

Set up Logstash https://www.elastic.co/guide/en/logstash/current/installing-logstash.html

Directory structure for different installations https://www.elastic.co/guide/en/logstash/7.11/dir-layout.html

Familiarise yourself with the various Kibana visualizations https://www.elastic.co/guide/en/kibana/current/dashboard.html

At this point, you should be able to create a basic pipeline that takes input from a file and outputs it to Elasticsearch, and to view the result in Kibana on a simple dashboard that you've created.
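In the actual stack, the parsing step is done by Logstash (typically with its grok filter). The following Python sketch only illustrates the idea behind such a pipeline: turn a raw log line into structured fields, then serialize the result in the newline-delimited JSON shape that the Elasticsearch `_bulk` endpoint accepts. The line layout (`time [thread] LEVEL logger - message`), the sample line, and the index name `terasology-logs` are assumptions made for this example, not the format Terasology actually emits.

```python
import json
import re

# Assumed line layout: "HH:mm:ss.SSS [thread] LEVEL logger - message".
LINE_PATTERN = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+) - "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def to_bulk_body(documents, index):
    """Serialize documents as NDJSON: one action line, then one source line each."""
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(document))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample line, made up for this sketch.
    sample = "10:15:30.123 [main] WARN  o.t.engine.Example - Something looks off"
    parsed = parse_line(sample)
    print(parsed)
    print(to_bulk_body([parsed], "terasology-logs"), end="")
```

Lines that do not match the pattern return `None`, so a real pipeline would need a policy for them, for example tagging them as unparsed instead of dropping them silently.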

Once you're familiar with the ELK side of things, you might want to familiarise yourself with the in-game telemetry system: https://github.com/Terasology/TutorialTelemetry https://github.com/Terasology/TutorialTelemetry/wiki https://github.com/GabrielXia/telemetry/wiki

Your project will include:

  • producing test data
  • configuring Kibana, Elasticsearch and Logstash
  • designing and configuring a suitable log format
  • integrating more types of log information into the collection for analysis
  • GDPR and Data Privacy and Protection (DPP) compliance
  • lifecycle policy management
  • right to be forgotten (figuring out a way to delete the logs of a particular user)
  • anonymizing information
  • tying sessions together by generating a random UUID for each session
  • visualizations that provide interesting insights that might be useful to the community; ask people working on different aspects of the game what would benefit them
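Several of the items above (test data, per-session UUIDs, hiding user identity, and deleting one user's logs) can be tried out together in a small Python sketch. Everything named here is hypothetical: the field names `session_id` and `user_hash`, the salt, and the helper functions are assumptions for illustration. Note also that a keyed hash is pseudonymization rather than full anonymization, so whether it is sufficient is a question the project would still need to answer.

```python
import hashlib
import hmac
import random
import uuid

# Hypothetical secret; in practice it would be kept out of the source code.
SECRET_SALT = b"replace-me"


def pseudonymize(user_id, salt=SECRET_SALT):
    """Replace a user identifier with a keyed SHA-256 hash (pseudonymization)."""
    return hmac.new(salt, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def make_session_events(user_id, count, rng):
    """Generate synthetic events that share one random session UUID."""
    session_id = str(uuid.uuid4())
    user_hash = pseudonymize(user_id)
    return [
        {
            "session_id": session_id,
            "user_hash": user_hash,
            "level": rng.choice(["WARN", "ERROR"]),
            "message": f"synthetic event {i}",
        }
        for i in range(count)
    ]


def forget_user_query(user_id):
    """Build a query body matching every document of one user.

    Assumes `user_hash` is mapped as a keyword field so a term query matches
    it exactly; the body could then be sent to a delete-by-query request.
    """
    return {"query": {"term": {"user_hash": pseudonymize(user_id)}}}


if __name__ == "__main__":
    events = make_session_events("alice", 3, random.Random(0))
    for event in events:
        print(event)
    print(forget_user_query("alice"))
```

Because the hash is deterministic for a given salt, all sessions of one user stay linkable for deletion requests, while the raw identifier never has to be stored with the logs.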

Basic experience with the ELK stack or other logging / analytics frameworks is recommended. Knowledge about DPP is beneficial.