Publications

Here you will find scientific publications authored by Know-Center staff.

2010

Kern Roman, Seifert Christin, Granitzer Michael

A Hybrid System for German Encyclopedia Alignment

International Journal on Digital Libraries, Springer, 2010

Journal
Collaboratively created on-line encyclopedias have become increasingly popular. Especially in terms of completeness they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and started an initiative to merge their corpora to create a single, more complete encyclopedia. The crucial step in this merging process is the alignment of articles. We have developed a two-step hybrid system to provide highly accurate alignments with low manual effort. First, we apply an information retrieval based, automatic alignment algorithm. Second, the articles with a low confidence score are revised using a manual alignment scheme carefully designed for quality assurance. Our evaluation shows that a combination of weighting and ranking techniques utilizing different facets of the encyclopedia articles allows us to effectively reduce the number of necessary manual alignments. Further, the setup of the manual alignment turned out to be robust against inter-indexer inconsistencies. As a result, the developed system empowered us to align four encyclopedias with high accuracy and low effort.
2010

Seifert C., Granitzer Michael

User-based active learning

International Conference on Data Mining Workshops (Workshop on Visual Analytics and Knowledge Discovery), Fan, W., Hsu, W., Webb, G. I., Liu, B., Zhang, C., Gunopulos, D., Wu, X., IEEE, 2010

Conference
Active learning has been proven a reliable strategy to reduce manual efforts in training data labeling. Such strategies incorporate the user as oracle: the classifier selects the most appropriate example and the user provides the label. While this approach is tailored towards the classifier, more intelligent input from the user may be beneficial. For instance, given only one example at a time users are hardly able to determine whether this example is an outlier or not. In this paper we propose user-based visually-supported active learning strategies that allow the user to do both, selecting and labeling examples given a trained classifier. While labeling is straightforward, selection takes place using an interactive visualization of the classifier’s a-posteriori output probabilities. By simulating different user selection strategies we show that user-based active learning outperforms uncertainty-based sampling methods and yields a more robust approach on different data sets. The obtained results point towards the potential of combining active learning strategies with results from the field of information visualization.
2010

Kienreich Wolfgang, Seifert C.

An Application of Edge Bundling Techniques to the Visualization of Media Analysis Results

IV2010: International Conference on Information Visualization, IEEE Computer Society Press, 2010

Conference
The advent of consumer-generated and social media has led to a continuous expansion and diversification of the media landscape. Media consumers frequently find themselves assuming the role of media analysts in order to satisfy personal information needs. We propose to employ Knowledge Visualization methods in support of complex media analysis tasks. In this paper, we describe an approach which depicts semantic relationships between key political actors using node-link diagrams. Our contribution comprises a force-directed edge bundling algorithm which accounts for semantic properties of edges, a technical evaluation of the algorithm and a report on a real-world application of the approach. The resulting visualization fosters the identification of high-level edge patterns which indicate strong semantic relationships. It has been published by the Austrian Press Agency APA in 2009.
2010

Sabol Vedran, Granitzer Michael, Seifert C.

Classifier Hypothesis Generation Using Visual Analysis Methods

NDT: Networked Digital Technologies, Springer, 2010

Conference
Classifiers can be used to automatically dispatch the abundance of newly created documents to recipients interested in particular topics. Identification of adequate training examples is essential for classification performance, but it may prove to be a challenging task in large document repositories. We propose a classifier hypothesis generation method relying on automated analysis and information visualisation. In our approach visualisations are used to explore the document sets and to inspect the results of machine learning methods, allowing the user to assess the classifier performance and adapt the classifier by gradually refining the training set.
2010

Beham Günter, Lindstaedt Stefanie, Ley Tobias, Kump Barbara, Seifert C.

MyExperiences: Visualizing Evidence in an Open Learner Model

Adjunct Proceedings of the 18th Conference on User Modeling, Adaptation, and Personalization, Posters and Demonstrations, Bohnert, B., Quiroga, L. M., 2010

Journal
When inferring a user’s knowledge state from naturally occurring interactions in adaptive learning systems, one has to make complex assumptions that may be hard to understand for users. We suggest MyExperiences, an open learner model designed for these specific requirements. MyExperiences is based on some of the key design principles of information visualization to help users understand the complex information in the learner model. It further allows users to edit their learner models in order to improve the accuracy of the information represented there.
2010

Sabol Vedran, Kienreich Wolfgang, Seifert C.

Integrating Node-Link-Diagrams and Information Landscapes: A Path-Finding Approach

Poster and Demo at EuroVis 2010, 2010

Conference
2010

Sabol Vedran, Kienreich Wolfgang, Seifert C.

Stress Maps: Analysing Local Phenomena in Dimensionality Reduction Based Visualizations

European Symposium Visual Analytics Science and Technology (EuroVAST), 2010

Conference
2010

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Efficient Cross-Domain Classification of Weblogs

International Journal of Intelligent Computing Research (IJICR), Vol. 1, Issue 2, Infonomics Society, 2010

Journal
Text classification is one of the core applications in data mining due to the huge amount of uncategorized textual data available. Training a text classifier results in a classification model that reflects the characteristics of the domain it was learned on. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to accurately classify unlabeled weblogs into commonly agreed upon newspaper categories using labeled data from the news domain. The labeled news and the unlabeled blog corpus are highly dynamic and grow hourly with a topic drift, so the classification needs to be efficient. Our approach is to apply a fast novel centroid-based text classification algorithm, the Class-Feature-Centroid Classifier (CFC), to perform efficient cross-domain classification. Experiments showed that this algorithm achieves accuracy comparable to k-Nearest Neighbour (k-NN) and Support Vector Machines (SVM), yet at linear time cost for training and classification. We investigate the classifier performance and generalization ability using a special visualization of classifiers. The benefit of our approach is that the linear time complexity enables us to efficiently generate an accurate classifier, reflecting the topic drift, several times per day on a huge dataset.