Publications

Here you will find scientific publications written by Know-Center staff.

2009

Neidhart T., Granitzer Michael, Kern Roman, Weichselbraun A., Wohlgenannt G., Scharl A., Juffinger A.

Distributed Web2.0 Crawling for Ontology Evolution

Journal of Digital Information Management, 2009

Journal
2009

Lex Elisabeth, Juffinger A.

Crosslanguage Blog Mining and Trend Visualisation

Proceedings of the 18th World Wide Web Conference, 2009

Conference
People use weblogs to express thoughts, present ideas and share knowledge, which makes weblogs extraordinarily valuable resources for, among other things, trend analysis. Trends are derived from the chronological sequence of blog post counts per topic. Comparison with a reference corpus allows qualitative statements about the identified trends. We propose a crosslanguage blog mining and trend visualisation system to analyse blogs across languages and topics. The trend visualisation facilitates the identification of trends and their comparison with the reference news article corpus. To prove the correctness of our system, we computed the correlation between trends in blogs and news articles for a subset of blogs and topics. The evaluation corroborated our hypothesis of a high correlation coefficient for these subsets, thereby demonstrating the correctness of our system for different languages and topics.
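As a rough illustration of the evaluation step described in this abstract, the following minimal sketch computes a Pearson correlation coefficient between two per-topic count series. The abstract does not say which correlation measure the authors used, so Pearson is an assumption here, and the function name `pearson` and all counts are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Hypothetical daily post counts for one topic (made-up numbers, not from the paper).
blog_counts = [3, 5, 9, 14, 8, 4, 2]
news_counts = [10, 14, 25, 40, 22, 12, 7]
r = pearson(blog_counts, news_counts)
print(f"correlation between blog and news trend: {r:.3f}")
```

A value of `r` close to 1 would indicate that the blog trend rises and falls together with the news trend for that topic.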
2009

Granitzer Michael, Lex Elisabeth, Juffinger A.

Blog Credibility Ranking by Exploiting Verified Content

Proceedings of the 3rd Workshop on Information Credibility on the Web at 18th World Wide Web Conference, 2009

Conference
People use weblogs to express thoughts, present ideas and share knowledge. However, weblogs can also be misused to influence and manipulate readers. Therefore, the credibility of a blog has to be validated before the available information is used for analysis. The credibility of a blog entry is derived from the content, the credibility of the author or of the blog itself, and the external references or trackbacks. In this work we introduce an additional dimension for assessing credibility, namely the quantity structure. For our blog analysis system we therefore derive credibility from two dimensions. Firstly, we compare the quantity structure of a set of blogs with that of a reference corpus; secondly, we analyse the content of each blog separately and examine its similarity with a verified news corpus. From the content similarity values we derive a ranking function. Our evaluation showed that non-credible blogs can be sorted out by quantity structure without deeper analysis. In addition, the content-based ranking function sorts the blogs by credibility with high accuracy. Our blog analysis system is therefore capable of providing credibility levels per blog.
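The content-based dimension described in this abstract can be sketched as follows. This is only an assumption-laden illustration: the abstract does not specify the similarity measure or the ranking function, so the sketch uses bag-of-words cosine similarity and ranks each blog by its best match against a verified article. The names `bag_of_words`, `cosine` and `rank_by_similarity` and all texts are hypothetical.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    """Term-frequency vector from whitespace tokens (a deliberately simple tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity of two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_by_similarity(blogs, verified_news):
    """Sort (blog_id, score) pairs by best similarity to any verified article."""
    news_vecs = [bag_of_words(article) for article in verified_news]
    scores = {}
    for blog_id, text in blogs.items():
        vec = bag_of_words(text)
        scores[blog_id] = max((cosine(vec, nv) for nv in news_vecs), default=0.0)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up example data, not taken from the paper.
verified_news = [
    "central bank raises interest rates",
    "team wins championship final",
]
blogs = {
    "a": "the central bank raises rates again",
    "b": "miracle pill cures everything overnight",
}
ranking = rank_by_similarity(blogs, verified_news)
print(ranking)
```

Blogs whose content overlaps with verified articles move to the top of the ranking; blogs with no overlap receive a score of zero.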
2009

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Cross-Domain Classification: Trade-Off between Complexity and Accuracy

Proceedings of the 4th International Conference for Internet Technology and Secured Transactions (ICITST) 2009, 2009

Text classification is one of the core applications in data mining due to the huge amount of uncategorized digital data available. Training a text classifier generates a model that reflects the characteristics of the domain. However, if no training data is available, labeled data from a related but different domain might be exploited to perform cross-domain classification. In our work, we aim to accurately classify unlabeled blogs into commonly agreed newspaper categories using labeled data from the news domain. The labeled news corpus and the unlabeled blog corpus are highly dynamic and grow hourly with a topic drift, so a trade-off between accuracy and performance is required. Our approach is to apply a fast, novel centroid-based algorithm, the Class-Feature-Centroid Classifier (CFC), to perform efficient cross-domain classification. Experiments showed that this algorithm achieves an accuracy comparable to that of k-NN and slightly better than that of Support Vector Machines (SVM), yet at linear time cost for training and classification. The benefit of this approach is that the linear time complexity enables us to efficiently generate an accurate classifier, reflecting the topic drift, several times per day on a huge dataset.
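To illustrate the general idea of centroid-based cross-domain classification, here is a minimal sketch of a plain nearest-centroid classifier trained on labeled news and applied to a blog post. It is not the CFC algorithm from the paper: CFC uses its own term weighting, which the abstract does not describe, so raw term frequencies are swapped in here. The names `train_centroids` and `classify` and all example texts are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

def train_centroids(labeled_docs):
    """Build one unit-length term-frequency centroid per class in a single pass."""
    sums = defaultdict(Counter)
    for label, text in labeled_docs:
        sums[label].update(text.lower().split())
    centroids = {}
    for label, vec in sums.items():
        norm = sqrt(sum(v * v for v in vec.values()))
        centroids[label] = {term: v / norm for term, v in vec.items()}
    return centroids

def classify(text, centroids):
    """Return the label whose centroid has the largest dot product with the document."""
    doc = Counter(text.lower().split())

    def score(centroid):
        # The document norm is the same for every class, so ranking by this
        # dot product with unit-length centroids equals ranking by cosine.
        return sum(count * centroid.get(term, 0.0) for term, count in doc.items())

    return max(centroids, key=lambda label: score(centroids[label]))

# Made-up labeled news (source domain) and an unlabeled blog post (target domain).
news = [
    ("sports", "team wins match goal"),
    ("politics", "parliament votes election law"),
]
centroids = train_centroids(news)
print(classify("the team scored a late goal", centroids))
```

Training touches each training token once and classification touches each document token once per class, which reflects why centroid-based methods are cheap enough to retrain frequently as topics drift.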
2009

Willfort R., Lex Elisabeth, Granitzer Michael, Juffinger A.

Spectral Web Content Trend Analysis

Proc. of IADIS International Conference WWW/Internet, 2009

Conference
2009

Lex Elisabeth, Granitzer Michael, Juffinger A., Seifert C.

Automated Blog Classification: A Cross Domain Approach

Proc. of IADIS International Conference WWW/Internet, 2009

Conference
2009

Kern Roman, Juffinger A., Granitzer Michael

Application of Axiomatic Approaches to Crosslanguage Retrieval

Working Notes for the CLEF 2009 Workshop, 2009

Conference
2009

Lex Elisabeth, Granitzer Michael, Juffinger A.

Know-Center at TREC 2009 Blog Distillation Task: A Notebook Paper

Notebook of TREC 2009, 2009

Conference