Publications

Here you will find scientific publications authored by Know-Center staff members.

2017

Seifert Christin, Bailer Werner, Orgel Thomas, Gantner Louis, Kern Roman, Ziak Hermann, Petit Albin, Schlötterer Jörg, Zwicklbauer Stefan, Granitzer Michael

Ubiquitous Access to Digital Cultural Heritage

Journal on Computing and Cultural Heritage (JOCCH) - Special Issue on Digital Infrastructure for Cultural Heritage, Part 1, Roberto Scopigno, ACM, New York, NY, US, 2017

Journal
The digitization initiatives in the past decades have led to a tremendous increase in digitized objects in the cultural heritage domain. Although digitally available, these objects are often not easily accessible for interested users because of the distributed allocation of the content in different repositories and the variety in data structure and standards. When users search for cultural content, they first need to identify the specific repository and then need to know how to search within this platform (e.g., usage of specific vocabulary). The goal of the EEXCESS project is to design and implement an infrastructure that enables ubiquitous access to digital cultural heritage content. Cultural content should be made available in the channels that users habitually visit and be tailored to their current context without the need to manually search multiple portals or content repositories. To realize this goal, open-source software components and services have been developed that can either be used as an integrated infrastructure or as modular components suitable to be integrated in other products and services. The EEXCESS modules and components comprise (i) Web-based context detection, (ii) information retrieval-based, federated content aggregation, (iii) metadata definition and mapping, and (iv) a component responsible for privacy preservation. Various applications have been realized based on these components that bring cultural content to the user in content consumption and content creation scenarios. For example, content consumption is realized by a browser extension generating automatic search queries from the current page context and the focus paragraph and presenting related results aggregated from different data providers. A Google Docs add-on allows retrieval of relevant content aggregated from multiple data providers while collaboratively writing a document. These relevant resources can then be included in the current document as a citation, an image, or a link (with preview) without having to leave the current writing task for an explicit search in various content providers' portals.
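A minimal sketch of the federated content aggregation step described above: the same query, derived from the user's current page context, is sent to several providers and the result lists are merged. The provider URLs, parameter names, and response fields below are hypothetical placeholders, not the actual EEXCESS partner APIs.

    # Sketch of federated content aggregation. Provider endpoints and the
    # JSON response shape are assumed for illustration only.
    import requests

    PROVIDERS = {
        "provider_a": "https://example.org/api/search",       # hypothetical endpoint
        "provider_b": "https://example.net/cultural/query",   # hypothetical endpoint
    }

    def federated_search(query, max_per_provider=10):
        """Send the same query to every provider and merge the result lists."""
        merged = []
        for name, url in PROVIDERS.items():
            try:
                resp = requests.get(url, params={"q": query, "limit": max_per_provider},
                                    timeout=5)
                resp.raise_for_status()
                for item in resp.json().get("results", []):    # assumed response shape
                    merged.append({"provider": name,
                                   "title": item.get("title"),
                                   "uri": item.get("uri")})
            except requests.RequestException:
                continue   # skip unreachable providers
        # naive merging: keep provider order; a real system would re-rank results
        return merged

    # the query terms would be generated from the focus paragraph of the page
    print(federated_search("baroque architecture Vienna"))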
2014

Granitzer Michael, Veas Eduardo Enrique, Seifert C.

Linked Data Query Wizard: A Novel Interface for Accessing SPARQL Endpoints

LDOW, 2014

Conference
In an interconnected world, Linked Data is more important than ever before. However, it is still quite difficult to access this new wealth of semantic data directly without having in-depth knowledge about SPARQL and related semantic technologies. Also, most people are currently used to consuming data as 2-dimensional tables. Linked Data is by definition always a graph, and not that many people are used to handling data in graph structures. Therefore we present the Linked Data Query Wizard, a web-based tool for displaying, accessing, filtering, exploring, and navigating Linked Data stored in SPARQL endpoints. The main innovation of the interface is that it turns the graph structure of Linked Data into a tabular interface and provides easy-to-use interaction possibilities by using metaphors and techniques from current search engines and spreadsheet applications that regular web users are already familiar with.
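A short sketch of the core idea, assuming a public SPARQL endpoint (DBpedia here) and the SPARQLWrapper and pandas libraries: graph-structured query results are flattened into a spreadsheet-like table. This illustrates the principle only and is not the Query Wizard's own code.

    # Sketch: fetch Linked Data from a SPARQL endpoint and show it as a flat table.
    from SPARQLWrapper import SPARQLWrapper, JSON
    import pandas as pd

    endpoint = SPARQLWrapper("https://dbpedia.org/sparql")   # any endpoint works
    endpoint.setQuery("""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        PREFIX dbr: <http://dbpedia.org/resource/>
        SELECT ?city ?population WHERE {
            ?city a dbo:City ;
                  dbo:country dbr:Austria ;
                  dbo:populationTotal ?population .
        } LIMIT 20
    """)
    endpoint.setReturnFormat(JSON)

    bindings = endpoint.query().convert()["results"]["bindings"]
    rows = [{var: b[var]["value"] for var in b} for b in bindings]
    table = pd.DataFrame(rows)   # graph results flattened into a 2-dimensional table
    print(table.head())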
2014

Stegmaier Florian, Seifert Christin, Kern Roman, Höfler Patrick, Bayerl Sebastian, Granitzer Michael, Kosch Harald, Lindstaedt Stefanie, Mutlu Belgin, Sabol Vedran, Schlegel Kai

Unleashing semantics of research data

Specifying Big Data Benchmarks, Springer, Berlin, Heidelberg, 2014

Book
Research depends to a large degree on the availability and quality of primary research data, i.e., data generated through experiments and evaluations. While the Web in general and Linked Data in particular provide a platform and the necessary technologies for sharing, managing and utilizing research data, an ecosystem supporting those tasks is still missing. The vision of the CODE project is the establishment of a sophisticated ecosystem for Linked Data. Here, the extraction of knowledge encapsulated in scientific research papers, along with its public release as Linked Data, serves as the major use case. Further, Visual Analytics approaches empower end users to analyse, integrate and organize data. During these tasks, specific Big Data issues arise.
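As a small illustration of the use case, the sketch below uses rdflib to express one evaluation result extracted from a paper as Linked Data triples. The namespace, property names, and DOI are invented for illustration and do not reflect the CODE project's actual vocabulary.

    # Sketch: one extracted research fact published as RDF triples with rdflib.
    from rdflib import Graph, Namespace, Literal, URIRef
    from rdflib.namespace import RDF, XSD

    EX = Namespace("http://example.org/code/")   # hypothetical vocabulary

    g = Graph()
    obs = URIRef("http://example.org/code/observation/1")
    g.add((obs, RDF.type, EX.EvaluationResult))
    g.add((obs, EX.fromPaper, URIRef("https://doi.org/10.1000/example")))  # placeholder DOI
    g.add((obs, EX.metric, Literal("F1")))
    g.add((obs, EX.value, Literal(0.87, datatype=XSD.double)))

    print(g.serialize(format="turtle"))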
2011

Kern Roman, Seifert Christin, Zechner Mario, Granitzer Michael

Vote/Veto Meta-Classifier for Authorship Identification

CLEF 2011: Proceedings of the 2011 Conference on Multilingual and Multimodal Information Access Evaluation (Lab and Workshop Notebook Papers), Amsterdam, The Netherlands, 2011

For the PAN 2011 authorship identification challenge we have developed a system based on a meta-classifier which selectively uses the results of multiple base classifiers. In addition, we also performed feature engineering based on the given domain of e-mails. We present our system as well as results on the evaluation dataset. Our system performed second and third best in the authorship attribution task on the large data sets, and ranked in the middle for the small data set in the attribution task and in the verification task.
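A simplified sketch of a vote/veto-style meta-classifier, assuming scikit-learn base learners: the meta level accepts a prediction only when enough base classifiers agree and otherwise abstains. The actual selection rule of the PAN 2011 system is described in the paper and is not reproduced here.

    # Simplified vote/veto sketch: accept a vote only with sufficient agreement.
    from collections import Counter
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import LinearSVC

    def train_base_classifiers(texts, labels):
        vec = TfidfVectorizer()
        X = vec.fit_transform(texts)
        models = [MultinomialNB(), LogisticRegression(max_iter=1000), LinearSVC()]
        for m in models:
            m.fit(X, labels)
        return vec, models

    def vote_veto_predict(vec, models, text, min_agreement=2):
        X = vec.transform([text])
        votes = Counter(m.predict(X)[0] for m in models)
        label, count = votes.most_common(1)[0]
        if count >= min_agreement:     # enough base classifiers agree: accept
            return label
        return None                    # "veto": abstain / defer to a fallback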
2011

Seifert Christin, Ulbrich Eva Pauline, Granitzer Michael

Word Clouds for Efficient Document Labeling

The Fourteenth International Conference on Discovery Science (DS 2011), Lecture Notes in Computer Science, Springer, 2011

Conference
In text classification the amount and quality of training data is crucial for the performance of the classifier. The generation of training data is done by human labelers - a tedious and time-consuming task. We propose to use condensed representations of text documents instead of the full-text document to reduce the labeling time for single documents. These condensed representations are key sentences and key phrases and can be generated in a fully unsupervised way. The key phrases are presented in a layout similar to a tag cloud. In a user study with 37 participants we evaluated whether document labeling with these condensed representations can be done faster and equally accurately by the human labelers. Our evaluation shows that the users labeled word clouds twice as fast but as accurately as full-text documents. While further investigations for different classification tasks are necessary, this insight could potentially reduce costs for the labeling process of text documents.
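A minimal sketch of generating such a condensed representation in a fully unsupervised way, here simply by ranking terms with TF-IDF; the paper's key-phrase and key-sentence extraction is more elaborate, and the function and parameter names are assumptions.

    # Sketch: rank a document's terms by TF-IDF to get word-cloud candidates.
    from sklearn.feature_extraction.text import TfidfVectorizer

    def key_phrases(documents, doc_index, top_k=15):
        vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
        X = vec.fit_transform(documents)
        terms = vec.get_feature_names_out()
        row = X[doc_index].toarray().ravel()
        ranked = sorted(zip(terms, row), key=lambda t: t[1], reverse=True)
        # the weight can later drive the font size in the cloud layout
        return [(term, round(weight, 3)) for term, weight in ranked[:top_k] if weight > 0]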
2011

Granitzer Michael, Kienreich Wolfgang, Seifert Christin

Visualizing Text Classification Models with Voronoi Word Clouds

Proceedings of the 15th International Conference on Information Visualisation (IV), 2011

Conference
2010

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Efficient Cross-Domain Classification of Weblogs

International Journal of Intelligent Computing Research (IJICR), Vol.1, Issue 2, Infonomics Society, 2010

Journal
Text classification is one of the core applications in data mining due to the huge amount of uncategorized textual data available. Training a text classifier results in a classification model that reflects the characteristics of the domain it was learned on. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to accurately classify unlabeled weblogs into commonly agreed upon newspaper categories using labeled data from the news domain. The labeled news and the unlabeled blog corpus are highly dynamic and growing hourly with a topic drift, so the classification needs to be efficient. Our approach is to apply a fast novel centroid-based text classification algorithm, the Class-Feature-Centroid Classifier (CFC), to perform efficient cross-domain classification. Experiments showed that this algorithm achieves an accuracy comparable to k-Nearest Neighbour (k-NN) and Support Vector Machines (SVM), yet at linear time cost for training and classification. We investigate the classifier performance and generalization ability using a special visualization of classifiers. The benefit of our approach is that the linear time complexity enables us to efficiently generate an accurate classifier, reflecting the topic drift, several times per day on a huge dataset.
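A sketch of the cross-domain setting, assuming scikit-learn: train a centroid-based classifier on labeled news articles and apply it to unlabeled blog posts. NearestCentroid is used as a simple stand-in; the Class-Feature-Centroid weighting of inter- and intra-class term distributions is defined in the cited paper and is not reproduced here.

    # Sketch: centroid-based cross-domain classification (news -> blogs).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import NearestCentroid

    def train_on_news(news_texts, news_labels):
        vec = TfidfVectorizer(stop_words="english")
        X = vec.fit_transform(news_texts)       # single pass over the corpus
        clf = NearestCentroid()                 # centroids give linear training cost
        clf.fit(X, news_labels)
        return vec, clf

    def classify_blogs(vec, clf, blog_texts):
        return clf.predict(vec.transform(blog_texts))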
2010

Sabol Vedran, Kienreich Wolfgang, Seifert C.

Integrating Node-Link-Diagrams and Information Landscapes: A Path-Finding Approach

Poster and Demo at EuroVis 2010, 2010

Conference
2010

Kern Roman, Seifert Christin, Granitzer Michael

A Hybrid System for German Encyclopedia Alignment

International Journal on Digital Libraries, Springer, 2010

Journal
Collaboratively created on-line encyclopedias have become increasingly popular. Especially in terms of completeness they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and started an initiative to merge their corpora to create a single, more complete encyclopedia. The crucial step in this merging process is the alignment of articles. We have developed a two-step hybrid system to provide highly accurate alignments with low manual effort. First, we apply an information retrieval-based, automatic alignment algorithm. Second, the articles with a low confidence score are revised using a manual alignment scheme carefully designed for quality assurance. Our evaluation shows that a combination of weighting and ranking techniques utilizing different facets of the encyclopedia articles allows us to effectively reduce the number of necessary manual alignments. Further, the setup of the manual alignment turned out to be robust against inter-indexer inconsistencies. As a result, the developed system empowered us to align four encyclopedias with high accuracy and low effort.
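A sketch of the first, automatic alignment step under simplifying assumptions: articles from both encyclopedias are compared as TF-IDF vectors, the most similar counterpart is taken as a candidate alignment, and low-confidence matches are routed to manual revision. The threshold value and the weighting facets of the actual system are not reproduced here.

    # Sketch: IR-based article alignment with a confidence threshold.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def align(articles_a, articles_b, threshold=0.5):
        vec = TfidfVectorizer()
        A = vec.fit_transform(articles_a)
        B = vec.transform(articles_b)
        sims = cosine_similarity(A, B)
        automatic, manual = [], []
        for i, row in enumerate(sims):
            j = row.argmax()
            if row[j] >= threshold:
                automatic.append((i, int(j), float(row[j])))   # confident alignment
            else:
                manual.append(i)     # low confidence: hand over to the manual scheme
        return automatic, manual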
2010

Seifert C., Granitzer Michael

User-based active learning

International Conference on Data Mining Workshops (Workshop on Visual Analytics and Knowledge Discovery), Fan, W., Hsu, W., Webb, G. I., Liu, B., Zhang, C., Gunopulos, D., Wu, X., IEEE, 2010

Conference
Active learning has been proven a reliable strategy to reduce manual efforts in training data labeling. Such strategies incorporate the user as an oracle: the classifier selects the most appropriate example and the user provides the label. While this approach is tailored towards the classifier, more intelligent input from the user may be beneficial. For instance, given only one example at a time, users are hardly able to determine whether this example is an outlier or not. In this paper we propose user-based, visually supported active learning strategies that allow the user to do both, selecting and labeling examples given a trained classifier. While labeling is straightforward, selection takes place using an interactive visualization of the classifier's a-posteriori output probabilities. By simulating different user selection strategies we show that user-based active learning outperforms uncertainty-based sampling methods and yields a more robust approach on different data sets. The obtained results point towards the potential of combining active learning strategies with results from the field of information visualization.
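For context, a sketch of the uncertainty-sampling baseline the paper compares against, assuming a scikit-learn classifier: the learner queries the unlabeled example it is least confident about. In the user-based strategies of the paper, this automatic argmin selection is replaced by a choice the user makes in a visualization of the a-posteriori probabilities.

    # Sketch: uncertainty sampling (least-confident selection) as a baseline.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def least_confident_index(clf, X_unlabeled):
        proba = clf.predict_proba(X_unlabeled)
        confidence = proba.max(axis=1)        # a-posteriori of the predicted class
        return int(np.argmin(confidence))     # most uncertain example to label next

    def active_learning_step(X_labeled, y_labeled, X_unlabeled):
        clf = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
        return clf, least_confident_index(clf, X_unlabeled)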
2010

Kienreich Wolfgang, Seifert C.

An Application of Edge Bundling Techniques to the Visualization of Media Analysis Results

IV2010: International Conference on Information Visualization, IEEE Computer Society Press, 2010

Conference
The advent of consumer-generated and social media has led to a continuous expansion and diversification of the media landscape. Media consumers frequently find themselves assuming the role of media analysts in order to satisfy personal information needs. We propose to employ Knowledge Visualization methods in support of complex media analysis tasks. In this paper, we describe an approach which depicts semantic relationships between key political actors using node-link diagrams. Our contribution comprises a force-directed edge bundling algorithm which accounts for semantic properties of edges, a technical evaluation of the algorithm and a report on a real-world application of the approach. The resulting visualization fosters the identification of high-level edge patterns which indicate strong semantic relationships. It has been published by the Austrian Press Agency APA in 2009.
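A very reduced illustration of the bundling idea only, assuming matplotlib: edges belonging to one semantic group are drawn as quadratic Bézier curves routed through a shared control point so they visually merge into a bundle. The paper's force-directed, semantics-aware algorithm is considerably more involved and is not shown here.

    # Simplified edge-bundling illustration with quadratic Bezier curves.
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.path import Path
    from matplotlib.patches import PathPatch

    def bundle(ax, edges, strength=0.8):
        """edges: list of ((x0, y0), (x1, y1)) pairs in one semantic group."""
        midpoints = np.array([[(a[0] + b[0]) / 2, (a[1] + b[1]) / 2] for a, b in edges])
        center = midpoints.mean(axis=0)                 # shared bundle point
        for (a, b), m in zip(edges, midpoints):
            ctrl = m + strength * (center - m)          # pull the edge toward the bundle
            path = Path([a, tuple(ctrl), b],
                        [Path.MOVETO, Path.CURVE3, Path.CURVE3])
            ax.add_patch(PathPatch(path, fill=False, alpha=0.6))

    fig, ax = plt.subplots()
    bundle(ax, [((0, 0), (1, 1)), ((0, 0.2), (1, 0.8)), ((0, 0.4), (1, 0.6))])
    ax.set_xlim(-0.1, 1.1); ax.set_ylim(-0.1, 1.1)
    plt.show()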
2010

Sabol Vedran, Granitzer Michael, Seifert C.

Classifier Hypothesis Generation Using Visual Analysis Methods

NDT: Networked Digital Technologies, Springer, 2010

Conference
Classifiers can be used to automatically dispatch the abundance of newly created documents to recipients interested in particular topics. Identification of adequate training examples is essential for classification performance, but it may prove to be a challenging task in large document repositories. We propose a classifier hypothesis generation method relying on automated analysis and information visualisation. In our approach visualisations are used to explore the document sets and to inspect the results of machine learning methods, allowing the user to assess the classifier performance and adapt the classifier by gradually refining the training set.
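A rough sketch of the surrounding inspect-and-refine loop under simplifying assumptions: documents are projected to 2D (TruncatedSVD here) so predictions and confidences can be plotted and inspected; examples the user corrects would then be added to the training set. The paper's visual analysis tooling is far richer than this.

    # Sketch: project documents to 2D and expose predictions for visual inspection.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.linear_model import LogisticRegression

    def project_and_predict(train_texts, train_labels, pool_texts):
        vec = TfidfVectorizer(stop_words="english")
        X_train = vec.fit_transform(train_texts)
        X_pool = vec.transform(pool_texts)
        clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
        coords = TruncatedSVD(n_components=2).fit_transform(X_pool)   # 2D layout
        proba = clf.predict_proba(X_pool)
        return coords, clf.classes_[proba.argmax(axis=1)], proba.max(axis=1)

    # coords/predictions/confidences would feed a scatter plot; documents the user
    # relabels there are appended to the training set and the loop repeats.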
2010

Beham Günter, Lindstaedt Stefanie, Ley Tobias, Kump Barbara, Seifert C.

MyExperiences: Visualizing Evidence in an Open Learner Model

Adjunct Proceedings of the 18th Conference on User Modeling, Adaptation, and Personalization, Posters and Demonstrations, Bohnert, B., Quiroga, L. M., 2010

Conference
When inferring a user's knowledge state from naturally occurring interactions in adaptive learning systems, one has to make complex assumptions that may be hard to understand for users. We suggest MyExperiences, an open learner model designed for these specific requirements. MyExperiences is based on some of the key design principles of information visualization to help users understand the complex information in the learner model. It further allows users to edit their learner models in order to improve the accuracy of the information represented there.
2010

Sabol Vedran, Kienreich Wolfgang, Seifert C.

Stress Maps: Analysing Local Phenomena in Dimensionality Reduction Based Visualizations

European Symposium Visual Analytics Science and Technology (EuroVAST), 2010

Conference
2009

Granitzer Michael, Rath Andreas S., Kröll Mark, Ipsmiller D., Devaurs Didier, Weber Nicolas, Lindstaedt Stefanie, Seifert C.

Machine Learning based Work Task Classification

Journal of Digital Information Management, 2009

Journal
Increasing the productivity of a knowledge worker via intelligent applications requires the identification of a user's current work task, i.e. the current work context a user resides in. In this work we present and evaluate machine learning-based work task detection methods. By viewing a work task as a sequence of digital interaction patterns of mouse clicks and key strokes, we present (i) a methodology for recording those user interactions and (ii) an in-depth analysis of supervised classification models for classifying work tasks in two different scenarios: a task-centric scenario and a user-centric scenario. We analyze different supervised classification models, feature types and feature selection methods on a laboratory as well as a real-world data set. Results show satisfactory accuracy and high user acceptance by using relatively simple types of features.
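A sketch of such a pipeline under illustrative assumptions: a recorded interaction session (window titles plus counts of clicks and key strokes) is turned into a simple feature vector and a supervised classifier is trained on labeled work tasks. The event format and feature names are assumptions, not the recording methodology of the paper.

    # Sketch: featurize interaction sessions and train a work-task classifier.
    import numpy as np
    from scipy.sparse import hstack, csr_matrix
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def featurize(sessions, vectorizer=None):
        """sessions: list of dicts like
           {"window_titles": "mail inbox report.docx", "clicks": 12, "keystrokes": 240}"""
        titles = [s["window_titles"] for s in sessions]
        if vectorizer is None:
            vectorizer = TfidfVectorizer()
            T = vectorizer.fit_transform(titles)
        else:
            T = vectorizer.transform(titles)
        counts = np.array([[s["clicks"], s["keystrokes"]] for s in sessions], dtype=float)
        return hstack([T, csr_matrix(counts)]), vectorizer

    def train_task_classifier(sessions, task_labels):
        X, vec = featurize(sessions)
        clf = LogisticRegression(max_iter=1000).fit(X, task_labels)
        return clf, vec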
2009

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Automated Blog Classification: A Cross Domain Approach

Proc. of IADIS International Conference WWW/Internet, 2009

Conference
2009

Lex Elisabeth, Seifert C.

A Visualization to Investigate and Give Feedback to Classifiers

Poster and Demo at the EuroVis 2009, 2009

Conference
2009

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Cross-Domain Classification: Trade-Off between Complexity and Accuracy

Proceedings of the 4th International Conference for Internet Technology and Secured Transactions (ICITST) 2009, 2009

Text classification is one of the core applications in data mining due to the huge amount of uncategorized digital data available. Training a text classifier generates a model that reflects the characteristics of the domain. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to accurately classify unlabeled blogs into commonly agreed-upon newspaper categories using labeled data from the news domain. The labeled news and the unlabeled blog corpus are highly dynamic and growing hourly with a topic drift, so a trade-off between accuracy and performance is required. Our approach is to apply a fast novel centroid-based algorithm, the Class-Feature-Centroid Classifier (CFC), to perform efficient cross-domain classification. Experiments showed that this algorithm achieves an accuracy comparable to k-NN and is slightly better than Support Vector Machines (SVM), yet at linear time cost for training and classification. The benefit of this approach is that the linear time complexity enables us to efficiently generate an accurate classifier, reflecting the topic drift, several times per day on a huge dataset.
2009

Lex Elisabeth, Seifert C.

A Novel Visualization Approach for Data-Mining-Related Classification

Proceedings of the 13th International Conference on Information Visualisation (IV09), IEEE Computer Society, 2009

Conference
2009

Granitzer Michael, Zechner Mario, Seifert C.

Context based Wikipedia Linking

Advances in Focused Retrieval: 7th International Workshop of the Initiative for the Evaluation of XML Retrieval (INEX 2008), Geva, S., Kamps, J., Trotman, A., Springer, 2009

Conference
2008

Kump Barbara, Kienreich Wolfgang, Granitzer Gisela, Granitzer Michael, Seifert C.

On the beauty and usability of tag clouds

Proceedings of the 12th International Conference on Information Visualization (IV2008), London, UK, July 9-11, 2008, IEEE Computer Society Press, 2008

Conference
2008

Kienreich Wolfgang, Lex Elisabeth, Seifert C.

APA Labs: An Experimental Web-Based Platform for the Retrieval and Analysis of News Articles

Proceedings of the First International Conference on the Applications of Digital Information and Web Technologies (ICADIWT08), 2008

Conference
2008

Lex Elisabeth, Kienreich Wolfgang, Granitzer Michael, Seifert C.

A generic framework for visualizing the news article domain and its application to real-world data

Journal of Digital Information Management, 2008

Journal
2008

Granitzer Michael, Seifert C., Zechner Mario

Context Resolution Strategies for Automatic Wikipedia Linking

INEX 2008 Pre-Proceedings, Dagstuhl, Germany, Geva, S., Kamps, J., Trotman, A. (Eds.), 2008

Conference