Institute for Program Structures and Data Organization (IPD), Chair Prof. Böhm

Practical Course: Analysis of Complex Data Sets


In this lab course, participants are expected to apply in practice the knowledge they have gained in the lecture "Big Data Analytics" or elsewhere. Students will work with software tools that are popular in data analysis and use them in real-world use cases. The first part of the lab course lets students familiarize themselves with the preprocessing of raw data and the analysis steps of the KDD process. Students are expected to learn how to obtain the best possible results for a given use case using commercially available tools. The second part of the course focuses on one specific analysis step and its weaknesses. Students will be confronted with problems whose solutions have not yet been described in the scientific literature and will learn to design solutions to them. In addition, students are expected to work in teams in order to complete their assignments successfully.


In the lab course "Analyzing Big Data", the theoretical knowledge from the lecture "Big Data Analytics" is deepened using popular software tools. The course consists of two blocks: one covering the current state of the art, and one going beyond it with research questions that are currently open. In the first block, a use case for knowledge extraction and data exploration in a company is worked through, following the KDD process. This sheds light on different data mining paradigms. The focus is on clustering, classification, and the computation of frequent itemsets and association rules. In the second block, one step of the KDD process and the corresponding weaknesses of the state of the art are studied. Participants are made aware of these open problems and are taught how to develop their own solutions to them. Both the use case and the open research questions are worked on in teams.