Data Science (in the Real World)

23.09.2020 15:00 - 17:00 | Language: English | Register free of charge for session

Target group:

People interested in data science (researchers, practitioners, …)


Data is the new oil. Like crude oil straight from the well, raw data cannot be used directly to fuel machine learning algorithms. It must be carefully refined, and the useful information must be separated from irrelevant noise. In this installment of the workshop series, we shed light on the importance of data quality assessment, data preprocessing, and knowledge transfer between domain experts and data scientists. In addition, we will discuss a selection of pitfalls and even paradoxical data science results. We will not only acknowledge their existence but also offer practical advice on how to handle situations such as skewed datasets, for example when a dataset contains only a few examples of a potentially undesired phenomenon.

After the event you will know about:

  • Data quality concerns & KPIs

  • Code & data books

  • Paradoxical data science results

  • Data cleaning & preprocessing

  • Data validation

  • Model debugging

  • Non-linear correlation & correlation does not imply causation

  • Feature selection & outlier detection

  • Machine learning with skewed & imbalanced data


Roman Kern

Research Area Manager Knowledge Discovery

Oliver Pimas

Big Data Lab

Matthias Böhm

Research Area Manager Data Management