Publikationen

Hier finden Sie von Know-Center MitarbeiterInnen verfasste wissenschaftliche Publikationen

2018

Lovric Mario, Molero Perez Jose Manuel, Kern Roman

PySpark and RDKit: moving towards Big Data in QSAR

Molecular Informatics, Wiley, 2018

Journal
We present an implementation of the cheminformatics toolkit RDKit in a distributed computing environment, Apache Hadoop. Together with the Apache Spark analytics engine, wrapped in PySpark, resources from commodity scalable hardware can be used for cheminformatic calculations and query operations with basic knowledge in Python coding and understanding of the RDD abstraction. A comparison of the computing acceleration in the Hadoop cluster is presented in two computation tasks of querying substructures and calculating molecular descriptors, as well as the source code for the PySpark-RDKit implementation
Kontakt Karriere

Hiermit erkläre ich ausdrücklich meine Einwilligung zum Einsatz und zur Speicherung von Cookies. Weiter Informationen finden sich unter Datenschutzerklärung

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close