Institute for Program Structures and Data Organization (IPD), Chair Prof. Böhm

Big Data Analytics


Recording and Stream

Further information and all documents can be found in the ILIAS course of the lecture.


Techniques for analysing large data sets attract great interest among users. The spectrum is broad: it includes traditional sectors such as banking and insurance, newer players, especially internet companies and operators of novel information services and social media, as well as the natural and engineering sciences. In all cases, users want to maintain an overview of very large, sometimes distributed data sets, to extract interesting relationships from the data with as little effort as possible, and to systematically compare expected system behaviour with actual system behaviour. The lecture covers data preparation as a prerequisite for fast and efficient analysis as well as modern techniques for the analysis itself.


At the end of the course, participants should understand why the concepts of data analysis are needed and be able to explain this. They should be able to assess and compare different approaches to managing and analysing large data sets with regard to their effectiveness and applicability. Participants should know which problems in the lecture's subject area are currently open and should have gained insight into the state of research in this field.


  • Data Mining: Practical Machine Learning Tools and Techniques (3rd edition): Ian H. Witten, Eibe Frank, Mark A. Hall, Morgan Kaufmann Publishers 2011
  • Data Mining: Concepts and Techniques (3rd edition): Jiawei Han, Micheline Kamber, Jian Pei, Morgan Kaufmann Publishers 2011
  • Knowledge Discovery in Databases: Martin Ester, Jörg Sander, Springer 2000