Schrunner Stefan, Geiger Bernhard, Zernig Anja, Kern Roman
2020
Classification has been tackled by a large number of algorithms, predominantly following a supervised learning setting. Surprisingly little research has been devoted to the problem setting where a dataset is only partially labeled, possibly including instances of entirely unlabeled classes. Algorithmic solutions suited to such problems are especially important in practical scenarios where labeling data is prohibitively expensive or the understanding of the data is lacking, including cases where only a subset of the classes is known. We present a generative method to address the problem of semi-supervised classification with unknown classes, following a Bayesian perspective. In detail, we apply a two-step procedure based on Bayesian classifiers and exploit information from both a small set of labeled data and a larger set of unlabeled training data, allowing for the case that the labeled dataset does not contain samples from all classes present. This represents a common practical application setup, where the labeled training set is not exhaustive. We show in a series of experiments that our approach outperforms state-of-the-art methods tackling similar semi-supervised learning problems. Since our approach yields a generative model, which aids the understanding of the data, it is particularly suited for practical applications.
Santos Tiago, Schrunner Stefan, Geiger Bernhard, Pfeiler Olivia, Zernig Anja, Kaestner Andre, Kern Roman
2019
Semiconductor manufacturing is a highly innovative branch of industry, where a high degree of automation has already been achieved. For example, devices tested to be outside of their specifications in the electrical wafer test are automatically scrapped. In this paper, we go one step further and analyze test data of devices still within the limits of the specification, by exploiting the information contained in the analog wafermaps. To that end, we propose two feature extraction approaches with the aim of detecting patterns in the wafer test dataset. Such patterns might indicate the onset of critical deviations in the production process. The studied approaches are: 1) classical image processing and restoration techniques in combination with sophisticated feature engineering and 2) a data-driven deep generative model. The two approaches are evaluated on both a synthetic and a real-world dataset. The synthetic dataset has been modeled based on real-world patterns and characteristics. We found that both approaches provide similar overall evaluation metrics. Our in-depth analysis helps to choose one approach over the other, depending primarily on data availability, as well as on available computing power and the required interpretability of the results.
Geiger Bernhard, Schrunner Stefan, Kern Roman
2019
Schrunner and Geiger have contributed equally to this work.
Schrunner Stefan, Bluder Olivia, Zernig Anja, Kaestner Andre, Kern Roman
2017
In the semiconductor industry, it is of paramount importance to check whether a manufactured device fulfills all quality specifications and is therefore suitable for being sold to the customer. The occurrence of specific spatial patterns within the so-called wafer test data, i.e. analog electric measurements, might point to production issues. However, the shape of these critical patterns is unknown. In this paper, different kinds of process patterns are extracted from wafer test data by an image processing approach using Markov Random Field models for image restoration. The goal is to develop an automated procedure to identify visible patterns in wafer test data to improve pattern matching. This step is a necessary precondition for a subsequent root-cause analysis of these patterns. The developed pattern extraction algorithm yields a more accurate discrimination between distinct patterns, resulting in an improved pattern comparison relative to the original dataset. In a next step, pattern classification will be applied to improve the production process control.