Publications

Here you will find scientific publications written by Know-Center staff members

2010

Erol S., Granitzer Michael, Happ S., Jantunen S., Jennings B., Koschmider A., Nurcan S., Rossi D., Schmidt R.

Combining BPM and Social Software: Contradiction or Chance?

Journal of Software Maintenance and Evolution: Research and Practice, John Wiley & Sons, Ltd., 2010

Journal
2010

Granitzer Michael, Sabol Vedran, Onn K., Lukose D.

Ontology Alignment - A Survey with Focus on Visually Supported Semi-Automatic Techniques

Future Internet, MDPI AG, 2010

Journal
2010

Shahzad Syed K., Granitzer Michael

Ontological Framework Driven GUI Development

Proceedings of I-KNOW, 2010

Conference
The user experience of any software or website consists of elements ranging from the conceptual to the concrete level. These elements of user experience assist in the design and development of user interfaces. Ontologies, on the other hand, provide a framework for a computable representation of user interface elements and the underlying data. This paper discusses strategies for introducing ontologies at different user interface layers adapted from the elements of user experience. These layers range from abstract levels (e.g. user needs/application objectives) to concrete levels (e.g. the application user interface) in terms of data representation. The proposed ontological framework enables device-independent, semi-automated GUI construction, which we demonstrate with a personal information management example.
2010

Granitzer Michael, Kienreich Wolfgang, Sabol Vedran, Lex Elisabeth

Knowledge Relationship Discovery and Visually Enhanced Access for the Media Domain

Medien-Wissen-Bildung. Explorationen visualisierter und kollaborativer Wissensräume, Innsbruck University Press, 2010

Conference
Technological advances and paradigmatic changes in the utilization of the World Wide Web have transformed the information seeking strategies of media consumers and invalidated traditional business models of media providers. We discuss relevant aspects of this development and present a knowledge relationship discovery pipeline to address the requirements of media providers and media consumers. We also propose visually enhanced access methods to bridge the gap between complex media services and the information needs of the general public. We conclude that a combination of advanced processing methods and visualizations will enable media providers to take the step from content-centered to service-centered business models and, at the same time, will help media consumers to better satisfy their personal information needs.
2010

Kern Roman, Seifert Christin, Granitzer Michael

A Hybrid System for German Encyclopedia Alignment

International Journal on Digital Libraries, Springer, 2010

Journal
Collaboratively created on-line encyclopedias have become increasingly popular. Especially in terms of completeness they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and started an initiative to merge their corpora to create a single, more complete encyclopedia. The crucial step in this merging process is the alignment of articles. We have developed a two-step hybrid system to provide highly accurate alignments with low manual effort. First, we apply an information retrieval based, automatic alignment algorithm. Second, the articles with a low confidence score are revised using a manual alignment scheme carefully designed for quality assurance. Our evaluation shows that a combination of weighting and ranking techniques utilizing different facets of the encyclopedia articles allows us to effectively reduce the number of necessary manual alignments. Further, the setup of the manual alignment turned out to be robust against inter-indexer inconsistencies. As a result, the developed system empowered us to align four encyclopedias with high accuracy and low effort.
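The retrieval-based first step can be illustrated with a small sketch. The following is a hypothetical, simplified reconstruction (plain TF-IDF cosine matching with a confidence threshold, not the authors' actual system); the threshold value is invented for illustration. Source articles whose best match scores below the threshold are queued for manual review.

```python
import math
from collections import Counter

def tfidf(docs):
    """TF-IDF vectors with smoothed IDF computed over the whole corpus."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def align(src_articles, tgt_articles, threshold=0.3):
    """Return (automatic_alignments, manual_queue): each source article is
    matched to its best-scoring target; low-confidence pairs go to review."""
    vecs = tfidf(src_articles + tgt_articles)
    src, tgt = vecs[:len(src_articles)], vecs[len(src_articles):]
    auto, manual = [], []
    for i, sv in enumerate(src):
        best_score, best_j = max((cosine(sv, tv), j) for j, tv in enumerate(tgt))
        (auto if best_score >= threshold else manual).append((i, best_j, best_score))
    return auto, manual
```

The two-step structure mirrors the abstract: the automatic pass handles confident matches, and only the `manual` queue needs human effort.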
2010

Granitzer Michael, Sabol Vedran, Kienreich Wolfgang, Lukose Dickson, Onn Kow Weng

Visual Analyses on Linked Data - An Opportunity for both Fields

The 2011 STI Semantic Summit, Riga, Latvia, 2010

2010

Seifert C., Granitzer Michael

User-based active learning

International Conference on Data Mining Workshops (Workshop on Visual Analytics and Knowledge Discovery), Fan, W., Hsu, W., Webb, G. I., Liu, B., Zhang, C., Gunopulos, D., Wu, X., IEEE, 2010

Conference
Active learning has been proven a reliable strategy to reduce manual effort in training data labeling. Such strategies incorporate the user as an oracle: the classifier selects the most appropriate example and the user provides the label. While this approach is tailored towards the classifier, more intelligent input from the user may be beneficial. For instance, given only one example at a time, users are hardly able to determine whether this example is an outlier or not. In this paper we propose user-based, visually supported active learning strategies that allow the user to do both: select and label examples given a trained classifier. While labeling is straightforward, selection takes place using an interactive visualization of the classifier's a-posteriori output probabilities. By simulating different user selection strategies we show that user-based active learning outperforms uncertainty-based sampling methods and yields a more robust approach on different data sets. The obtained results point towards the potential of combining active learning strategies with results from the field of information visualization.
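The uncertainty-based sampling baseline that the paper compares against can be sketched in a few lines. This is a generic illustration of margin-based uncertainty sampling, not the authors' code; the posterior values are assumed to come from some previously trained probabilistic classifier.

```python
def uncertainty(posteriors):
    """Margin-based uncertainty: 1 - (p_best - p_second_best).
    Close to 1 means the classifier is undecided between two classes."""
    p = sorted(posteriors, reverse=True)
    return 1.0 - (p[0] - p[1])

def select_query(pool_posteriors):
    """Index of the unlabeled pool example the oracle should label next:
    the one with the least confident posterior distribution."""
    return max(range(len(pool_posteriors)),
               key=lambda i: uncertainty(pool_posteriors[i]))
```

The user-based strategies of the paper replace `select_query` with a human choice made over a visualization of exactly these posteriors.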
2010

Lex Elisabeth, Khan I., Bischof H., Granitzer Michael

Assessing the Quality of Web Content

Proceedings of the ECML/PKDD Discovery Challenge 2010, Online, 2010

Conference
2010

Sabol Vedran, Granitzer Michael, Muhr M.

Scalable Recursive Top-Down Hierarchical Clustering Approach with implicit Model Selection for Textual Data Sets

IEEE Computer Society: 7th International Workshop on Text-based Information Retrieval in Proceedings of the 21st International Conference on Database and Expert Systems Applications (DEXA 10), IEEE, 2010

Conference
Automatic generation of taxonomies can be useful for a wide area of applications. In our application scenario a topical hierarchy should be constructed reasonably fast from a large document collection to aid browsing of the data set. The hierarchy should also be used by the InfoSky projection algorithm to create an information landscape visualization suitable for explorative navigation of the data. We developed an algorithm that applies a scalable, recursive, top-down clustering approach to generate a dynamic concept hierarchy. The algorithm recursively applies a workflow consisting of preprocessing, clustering, cluster labeling and projection into 2D space. Besides presenting and discussing the benefits of combining hierarchy browsing with visual exploration, we also investigate the clustering results achieved on a real world data set.
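The recursive top-down scheme can be illustrated with a minimal sketch. This is a hypothetical simplification (a deterministic 2-means bisection on dense vectors, recursing until clusters are small), not the scalable implementation or the model-selection criterion described in the paper.

```python
def bisect(points):
    """2-means with deterministic farthest-point initialization."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    centroids = [list(points[0]),
                 list(max(points, key=lambda p: dist(p, points[0])))]
    assign = [0] * len(points)
    for _ in range(20):
        for i, p in enumerate(points):
            assign[i] = 0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
        for c in (0, 1):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(d) / len(members) for d in zip(*members)]
    return [[i for i, a in enumerate(assign) if a == c] for c in (0, 1)]

def topdown(points, indices=None, max_size=3):
    """Recursively bisect until each cluster has at most max_size members;
    returns a nested list of point indices representing the hierarchy."""
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= max_size:
        return indices
    halves = bisect([points[i] for i in indices])
    if any(not h for h in halves):
        return indices  # degenerate split: stop recursing
    return [topdown(points, [indices[j] for j in h], max_size) for h in halves]
```

In the paper's workflow, each recursion level would additionally label the clusters and project them into 2D for the landscape visualization.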
2010

Kern Roman, Zechner Mario, Granitzer Michael, Muhr M.

External and Intrinsic Plagiarism Detection Using a Cross-Lingual Retrieval and Segmentation System: Lab Report for PAN at CLEF 2010

2nd International Competition on Plagiarism Detection, 2010

Conference
We present our hybrid system for the PAN challenge at CLEF 2010. Our system performs plagiarism detection for translated and non-translated externally as well as intrinsically plagiarized document passages. Our external plagiarism detection approach is formulated as an information retrieval problem, using heuristic post-processing to arrive at the final detection results. For the retrieval step, source documents are split into overlapping blocks which are indexed via a Lucene instance. Suspicious documents are similarly split into consecutive overlapping boolean queries which are performed on the Lucene index to retrieve an initial set of potentially plagiarized passages. For performance reasons, queries might get rejected via a heuristic before actually being executed. Candidate hits gathered via the retrieval step are further post-processed by performing sequence analysis on the passages retrieved from the index with respect to the passages used for querying the index. By applying several merge heuristics, bigger blocks are formed from matching sequences. German and Spanish source documents are first translated using word alignment on the Europarl corpus before entering the above detection process. For each word in a translated document, several translations are produced. Intrinsic plagiarism detection is done by finding major changes in style, measured via word suffixes, after the documents have been partitioned by a linear text segmentation algorithm. Our approach led us to the third overall rank with an overall score of 0.6948.
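The overlapping-block splitting used for the retrieval step can be sketched as follows. This is an illustrative simplification: it replaces the Lucene index with naive vocabulary-overlap matching, and the block size, step, and overlap threshold are invented parameters, not the ones used in the actual system.

```python
def overlapping_blocks(tokens, size=50, step=25):
    """Split a token list into overlapping blocks of `size` tokens,
    starting every `step` tokens; returns (start_offset, block) pairs."""
    return [(start, tokens[start:start + size])
            for start in range(0, max(1, len(tokens) - size + step), step)]

def candidates(query_block, index_blocks, min_overlap=0.5):
    """Source blocks sharing at least `min_overlap` of the query vocabulary:
    a naive stand-in for the boolean queries against the Lucene index."""
    q = set(query_block)
    hits = []
    for start, block in index_blocks:
        overlap = len(q & set(block)) / len(q)
        if overlap >= min_overlap:
            hits.append((start, overlap))
    return hits
```

The real system would then run sequence analysis over the returned offsets and merge adjacent hits into larger plagiarized passages.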
2010

Kern Roman, Granitzer Michael

German Encyclopedia Alignment Based on Information Retrieval Techniques

ECDL 2010: Research and Advanced Technology for Digital Libraries, 2010

Conference
Collaboratively created online encyclopedias have become increasingly popular. Especially in terms of completeness they have begun to surpass their printed counterparts. Two German publishers of traditional encyclopedias have reacted to this challenge and decided to merge their corpora to create a single, more complete encyclopedia. The crucial step in this merge process is the alignment of articles. We have developed a system to identify corresponding entries from different encyclopedic corpora. The base of our system is the alignment algorithm, which incorporates various techniques developed in the field of information retrieval. We have evaluated the system on four real-world encyclopedias with a ground truth provided by domain experts. A combination of weighting and ranking techniques has been found to deliver a satisfying performance.
2010

Kern Roman, Granitzer Michael, Muhr M.

KCDC: Word Sense Induction by Using Grammatical Dependencies and Sentence Phrase Structure

Proceedings of SemEval-2, 2010

Conference
Word sense induction and discrimination (WSID) identifies the senses of an ambiguous word and assigns instances of this word to one of these senses. We have built a WSID system that exploits syntactic and semantic features based on the results of a natural language parser component. To achieve high robustness and good generalization capabilities, we designed our system to work on a restricted, but grammatically rich set of features. Based on the results of the evaluations, our system provides a promising performance and robustness.
2010

Sabol Vedran, Granitzer Michael, Seifert C.

Classifier Hypothesis Generation Using Visual Analysis Methods

NDT: Networked Digital Technologies, Springer, 2010

Conference
Classifiers can be used to automatically dispatch the abundance of newly created documents to recipients interested in particular topics. Identification of adequate training examples is essential for classification performance, but it may prove to be a challenging task in large document repositories. We propose a classifier hypothesis generation method relying on automated analysis and information visualisation. In our approach visualisations are used to explore the document sets and to inspect the results of machine learning methods, allowing the user to assess the classifier performance and adapt the classifier by gradually refining the training set.
2010

Lirk G., Granitzer Michael, Söhnnichsen A., Kulczycki P.

Wissensmanagement in EBM

EbM - ein Gewinn für die Arzt-Patient-Beziehung?. Forum Medizin 21 der Paracelsus Medizinischen Privatuniversität & 11. EbM-Jahrestagung des Deutschen Netzwerks Evidenzbasierte Medizin, German Medical Science GMS Publishing House, 2010

Conference
2010

Kern Roman, Granitzer Michael, Muhr M.

Analysis of Structural Relationships for Hierarchical Cluster Labeling

Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2010

Conference
Cluster label quality is crucial for browsing topic hierarchies obtained via document clustering. Intuitively, the hierarchical structure should influence the labeling accuracy. However, most labeling algorithms ignore such structural properties and therefore, the impact of hierarchical structures on the labeling accuracy is yet unclear. In our work we integrate hierarchical information, i.e. sibling and parent-child relations, in the cluster labeling process. We adapt standard labeling approaches, namely Maximum Term Frequency, Jensen-Shannon Divergence, χ2 Test, and Information Gain, to make use of those relationships and evaluate their impact on 4 different datasets, namely the Open Directory Project, Wikipedia, TREC Ohsumed and the CLEF-IP European Patent dataset. We show that hierarchical relationships can be exploited to increase labeling accuracy, especially on high-level nodes.
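The idea of scoring label candidates against the parent node can be sketched with a simplified divergence criterion. This is a hypothetical illustration using each term's pointwise KL contribution against a Laplace-smoothed parent distribution; it is a stand-in for, not a reproduction of, the JSD formulation evaluated in the paper.

```python
import math
from collections import Counter

def label_cluster(cluster_tokens, parent_tokens, topn=2):
    """Rank candidate labels by how much each term's cluster probability
    diverges from its probability in the parent node: terms that are
    frequent in the cluster but unremarkable in the parent score highest."""
    pc = Counter(cluster_tokens)
    pp = Counter(parent_tokens)
    nc, np_ = sum(pc.values()), sum(pp.values())

    def score(t):
        p = pc[t] / nc                          # probability in the cluster
        q = (pp[t] + 1) / (np_ + len(pp))       # smoothed parent probability
        return p * math.log(p / q)              # pointwise KL contribution

    return sorted(pc, key=score, reverse=True)[:topn]
```

Incorporating sibling clusters, as the paper does, would add a second distribution to contrast against, penalizing terms that are also frequent in neighboring topics.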
2010

Lex Elisabeth, Granitzer Michael, Juffinger A.

A Comparison of Stylometric and Lexical Features for Web Genre Classification and Emotion Classification in Blogs

IEEE Computer Society: 7th International Workshop on Text-based Information Retrieval in Proceedings of the 21st International Conference on Database and Expert Systems Applications (DEXA 10), IEEE, 2010

Conference
In the blogosphere, the amount of digital content is expanding, which imposes new challenges on search engines. Due to the changing information need, automatic methods are needed to help blog search users filter information by different facets. In our work, we aim to support blog search with genre and facet information. Since we focus on the news genre, our approach is to classify blogs into news versus rest. Also, we assess the emotionality facet in news-related blogs to enable users to identify people's feelings towards specific events. Our approach is to evaluate the performance of text classifiers with lexical and stylometric features to determine the best performing combination for our tasks. Our experiments on a subset of the TREC Blogs08 dataset reveal that classifiers trained on lexical features perform consistently better than classifiers trained on the best stylometric features.
2010

Lex Elisabeth, Granitzer Michael, Juffinger A.

Objectivity Classification in Online Media

21st ACM SIGWEB Conference on Hypertext and Hypermedia (HT2010), ACM, 2010

Conference
In this work, we assess objectivity in online news media. We propose to use topic-independent features and we show in a cross-domain experiment that with standard bag-of-words models, classifiers implicitly learn topics. Our experiments revealed that our methodology can be applied across different topics with consistent classification performance.
2010

Lex Elisabeth, Granitzer Michael, Juffinger A., Muhr M.

Stylometric Features for Emotion Level Classification in News Related Blogs

Proceedings of the 9th ACM RIAO Conference, LE CENTRE DE HAUTES ETUDES INTERNATIONALES D'INFORMATIQUE DOCUMENTAIRE, 2010

Conference
Breaking news and events are often posted in the blogosphere before they are published by any media agency. Therefore, the blogosphere is a valuable resource for news-related blog analysis. However, it is crucial to first sort out news-unrelated content like personal diaries or advertising blogs. Besides, there are different levels of emotionality or involvement which bias the news information to a certain extent. In our work, we evaluate topic-independent stylometric features to classify blogs into news versus rest and to assess the emotionality in these blogs. We apply several text classifiers to determine the best performing combination of features and algorithms. Our experiments revealed that with simple style features, blogs can be classified into news versus rest and their emotionality can be assessed with accuracy values of almost 80%.
2010

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Efficient Cross-Domain Classification of Weblogs

International Journal of Intelligent Computing Research (IJICR), Vol.1, Issue 2, Infonomics Society, 2010

Journal
Text classification is one of the core applications in data mining due to the huge amount of uncategorized textual data available. Training a text classifier results in a classification model that reflects the characteristics of the domain it was learned on. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to accurately classify unlabeled weblogs into commonly agreed upon newspaper categories using labeled data from the news domain. The labeled news and the unlabeled blog corpus are highly dynamic and hourly growing with a topic drift, so the classification needs to be efficient. Our approach is to apply a fast novel centroid-based text classification algorithm, the Class-Feature-Centroid Classifier (CFC), to perform efficient cross-domain classification. Experiments showed that this algorithm achieves an accuracy comparable to k-Nearest Neighbour (k-NN) and Support Vector Machines (SVM), yet at linear time cost for training and classification. We investigate the classifier performance and generalization ability using a special visualization of classifiers. The benefit of our approach is that the linear time complexity enables us to efficiently generate an accurate classifier, reflecting the topic drift, several times per day on a huge dataset.
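A centroid-based text classifier of this general family can be sketched as follows. Note this is a plain Rocchio-style nearest-centroid sketch, not the actual CFC weighting scheme, which uses a specific inter- and intra-class term weighting; it only illustrates the linear-time train-and-classify structure the abstract refers to.

```python
import math
from collections import Counter, defaultdict

def train_centroids(docs, labels):
    """Build one length-normalized term-frequency centroid per class.
    A single pass over the training data: linear time in corpus size."""
    sums = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        sums[y].update(doc)
    centroids = {}
    for y, c in sums.items():
        norm = math.sqrt(sum(v * v for v in c.values()))
        centroids[y] = {t: v / norm for t, v in c.items()}
    return centroids

def classify(doc, centroids):
    """Assign the class whose centroid has the highest cosine with the doc."""
    tf = Counter(doc)
    norm = math.sqrt(sum(v * v for v in tf.values())) or 1.0
    return max(centroids,
               key=lambda y: sum(tf[t] * centroids[y].get(t, 0.0)
                                 for t in tf) / norm)
```

Because both training and classification touch each term only once, the model can be rebuilt several times per day as the corpus drifts, which is the efficiency argument made in the abstract.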
2010

Lex Elisabeth, Granitzer Michael, Juffinger A.

Facet Classification of Blogs: Know-Center at the TREC 2009 Blog Distillation Task

Proceedings of the 18th Text REtrieval Conference, 2010

Conference
In this paper, we outline our experiments carried out at the TREC 2009 Blog Distillation Task. Our system is based on a plain text index extracted from the XML feeds of the TREC Blogs08 dataset. This index was used to retrieve candidate blogs for the given topics. The resulting blogs were classified using a Support Vector Machine that was trained on a manually labelled subset of the TREC Blogs08 dataset. Our experiments included three runs on different features: firstly on nouns, secondly on stylometric properties, and thirdly on punctuation statistics. The facet identification based on our approach was successful, although a significant number of candidate blogs were not retrieved at all.
2010

Klieber Hans-Werner, Granitzer Michael, Gaisbauer M.

Semantically enhanced Software Documentation Processes

Serdica Journal of Computing, 2010

Journal
High-quality software documentation is a substantial issue for understanding software systems. Shorter time-to-market software cycles increase the importance of automatism for keeping the documentation up to date. In this paper, we describe the automatic support of the software documentation process using semantic technologies. We introduce a software documentation ontology as an underlying knowledge base. The defined ontology is populated automatically by analysing source code, software documentation and code execution. Through selected results we demonstrate that the use of such semantic systems can support software documentation processes efficiently.
2010

Granitzer Michael

Enterprise Search

Lexikon der Bibliotheks- und Informationswissenschaft, Umlauf, K., Gradmann, S., Anton Hiersemann Verlag, 2010

Book
2010

Granitzer Michael, Kienreich Wolfgang

Semantische Technologien: Stand der Forschung und Visionen

Internationales Rechtsinformatik Symposion (IRIS 10), OCG, 2010

Conference
2010

Granitzer Michael

Adaptive Term Weighting through Stochastic Optimization

11th International Conference, CICLing 2010, Iasi, Romania, March 22-25, 2010, Gelbukh, A., Springer, 2010

Conference
Term weighting strongly influences the performance of text mining and information retrieval approaches. Usually term weights are determined through statistical estimates based on static weighting schemes. Such static approaches lack the capability to generalize to different domains and different data sets. In this paper, we introduce an on-line learning method for adapting term weights in a supervised manner. Via stochastic optimization we determine a linear transformation of the term space to approximate expected similarity values among documents. We evaluate our approach on 18 standard text data sets and show that the performance improvement of a k-NN classifier ranges between 1% and 12% by using adaptive term weighting as a preprocessing step. Further, we provide empirical evidence that our approach is efficient enough to cope with larger problems.
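The supervised adaptation idea can be illustrated with a toy sketch: learn a diagonal term-weight vector by stochastic gradient descent so that weighted inner products of document pairs approximate given target similarities. This is a hypothetical simplification (the paper learns a linear transformation of the term space, not just a diagonal one), and the learning rate and epoch count are invented for illustration.

```python
def sgd_term_weights(pairs, targets, vocab_size, lr=0.1, epochs=100):
    """Learn a diagonal term-weight vector w so that the weighted inner
    product sum_t w[t]*x[t]*y[t] of each document pair (x, y) approximates
    its target similarity. The squared-error gradient with respect to
    w[t] is 2*(sim - target)*x[t]*y[t]."""
    w = [1.0] * vocab_size
    for _ in range(epochs):
        for (x, y), target in zip(pairs, targets):
            sim = sum(w[t] * x[t] * y[t] for t in range(vocab_size))
            err = sim - target
            for t in range(vocab_size):
                w[t] -= lr * 2.0 * err * x[t] * y[t]
    return w
```

In the tiny example below, term 0 behaves like a stopword shared by unrelated documents, so its learned weight is driven towards zero, while the discriminative term 1 keeps its weight: exactly the effect adaptive weighting is meant to achieve before handing the vectors to a k-NN classifier.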
2010

Granitzer Michael, Lanthaler Markus, Gütl Christian

Semantic Web services: state of the art

Proceedings of the IADIS international conference-Internet technologies and society 2010, IADIS Press, 2010

Conference
Service-oriented architectures (SOA) built on Web services were a first attempt to streamline and automate business processes in order to increase productivity, but the utopian promise of uniform service interface standards, metadata, and universal service registries, in the form of the SOAP, WSDL and UDDI standards, has proven elusive. Furthermore, the RPC-oriented model of those traditional Web services is not Web-friendly. Thus more and more prominent Web service providers have opted to expose their services based on the REST architectural style. Nevertheless, there are still problems in formally describing, finding, and orchestrating RESTful services. While there are already a number of different approaches, none so far has managed to break out of its academic confines. This paper presents an extensive survey comparing the existing state-of-the-art technologies for semantically annotated Web services as a first step towards a proposal designed specifically for RESTful services.