Publications

Here you will find scientific publications written by Know-Center staff.

2018

Rexha Andi, Kröll Mark, Ziak Hermann, Kern Roman

Authorship Identification of Documents with High Content Similarity

Scientometrics, Wolfgang Glänzel, Springer Link, 2018

Journal
The goal of our work is inspired by the task of associating segments of text to their real authors. In this work, we focus on analyzing the way humans judge different writing styles. This analysis can help to better understand this process and thus to simulate or mimic such behavior accordingly. Unlike the majority of the work done in this field (e.g., authorship attribution, plagiarism detection), which uses content features, we focus only on the stylometric, i.e., content-agnostic, characteristics of authors. Therefore, we conducted two pilot studies to determine whether humans can identify authorship among documents with high content similarity. The first was a quantitative experiment involving crowd-sourcing, while the second was a qualitative one executed by the authors of this paper. Both studies confirmed that this task is quite challenging. To gain a better understanding of how humans tackle such a problem, we conducted an exploratory data analysis on the results of the studies. In the first experiment, we compared the decisions against content features and stylometric features. In the second, the evaluators described the process and the features on which their judgment was based. The findings of our detailed analysis could (i) help to improve algorithms such as automatic authorship attribution as well as plagiarism detection, (ii) assist forensic experts or linguists in creating profiles of writers, (iii) support intelligence applications in analyzing aggressive and threatening messages, and (iv) help editors enforce conformity with, for instance, a journal-specific writing style.
2017

Dragoni Mauro, Federici Marco, Rexha Andi

Extracting Aspects From User-generated Content For Supporting Opinion Mining Systems

Journal of Intelligent Information Systems, Kerschberg; Z. Ras, Springer, 2017

Journal
One of the most important opinion mining research directions falls in the extraction of polarities referring to specific entities (aspects) contained in the analyzed texts. The detection of such aspects may be very critical especially when documents come from unknown domains. Indeed, while in some contexts it is possible to train domain-specific models for improving the effectiveness of aspects extraction algorithms, in others the most suitable solution is to apply unsupervised techniques by making such algorithms domain-independent. Moreover, an emerging need is to exploit the results of aspect-based analysis for triggering actions based on these data. This led to the necessity of providing solutions supporting both an effective analysis of user-generated content and an efficient and intuitive way of visualizing collected data. In this work, we implemented an opinion monitoring service implementing (i) a set of unsupervised strategies for aspect-based opinion mining together with (ii) a monitoring tool supporting users in visualizing analyzed data. The aspect extraction strategies are based on the use of semantic resources for performing the extraction of aspects from texts. The effectiveness of the platform has been tested on benchmarks provided by the SemEval campaign and has been compared with the results obtained by domain-adapted techniques.

Rexha Andi, Dragoni Mauro, Federici Marco

An unsupervised aspect extraction strategy for monitoring real-time reviews stream

Information Processing and Management

Journal
One of the most important opinion mining research directions falls in the extraction of polarities referring to specific entities (aspects) contained in the analyzed texts. The detection of such aspects may be very critical especially when documents come from unknown domains. Indeed, while in some contexts it is possible to train domain-specific models for improving the effectiveness of aspects extraction algorithms, in others the most suitable solution is to apply unsupervised techniques by making such algorithms domain-independent. Moreover, an emerging need is to exploit the results of aspect-based analysis for triggering actions based on these data. This led to the necessity of providing solutions supporting both an effective analysis of user-generated content and an efficient and intuitive way of visualizing collected data. In this work, we implemented an opinion monitoring service implementing (i) a set of unsupervised strategies for aspect-based opinion mining together with (ii) a monitoring tool supporting users in visualizing analyzed data. The aspect extraction strategies are based on the use of semantic resources for performing the extraction of aspects from texts. The effectiveness of the platform has been tested on benchmarks provided by the SemEval campaign and has been compared with the results obtained by domain-adapted techniques.