Data Science (in the Real World)

23.09.2020 | Language: English

Research and technology trends in Big Data and AI

We are preparing our fourth COMET application and are looking for partners to jointly explore the potential of these key technologies.
Learn more about the advantages and opportunities of a COMET partnership and secure your access to funded top-level research.

Target group:

People interested in data science (researchers, practitioners, …)


Data is the new oil. Like crude oil straight from the well, raw data cannot directly fuel machine learning algorithms. It first needs to be carefully refined, so that the useful information is separated from irrelevant noise. In this installment of the workshop series, we shed light on the importance of data quality assessment, data preprocessing, and knowledge transfer between domain experts and data scientists. We will also discuss a selection of pitfalls and even paradoxical data science results. Beyond acknowledging that they exist, we aim to provide practical advice on handling situations such as skewed datasets, for example, when a dataset contains only a few examples of a potentially undesired phenomenon.

After the event, you will know about:

  • Data quality concerns & KPIs

  • Code & data books

  • Paradoxical data science results

  • Data cleaning & preprocessing

  • Data validation

  • Model debugging

  • Non-linear correlation & why correlation does not imply causation

  • Feature selection & outlier detection

  • Machine learning with skewed & imbalanced data


Roman Kern

Research Area Manager Knowledge Discovery

Oliver Pimas

Big Data Lab

Mario Lovric

Knowledge Discovery