The exascale-ready tool: LiGen™
Since 2010, Dompé SpA has invested in proprietary software for computer-aided drug design (CADD) through its dedicated Drug Discovery Platform. The most relevant tool is the de novo structure-based virtual screening software LiGen™ (Ligand Generator), co-designed with the Italian supercomputing center CINECA. The distinguishing feature of LiGen™ is that it was designed and developed to run on High Performance Computing (HPC) architectures. To maintain its performance lead beyond 2020, Dompé decided to embed in LiGen™ the emerging technologies and new programming paradigms that are driving the transition to the exascale computing era. Within the ANTAREX framework, an exascale-ready implementation of LiGen™ was obtained: a redesign of part of the code, co-developed with the Politecnico di Milano, yielded a 100x performance increase, and a new optimized implementation by CINECA yielded a 100x increase in the scale of the simulation.
Tangible Chemical Space
Dompé has generated a huge virtual chemical space of hundreds of billions of compounds. The library was built by combining millions of commercially available reagents through a set of robust synthetic reactions, yielding a tangible chemical space, meaning that every compound in it is achievable in a single reaction step. The reactions are encoded in a smart language that records detailed information about the reagent substructures directly involved in the reaction and their chemical environment, enabling an accurate annotation of each reaction in terms of synthetic feasibility. Dompé has already generated 500 billion compounds and, more importantly, has developed a technology able to produce trillions of compounds before the end of 2020. These numbers represent an improvement of about 1,000-fold over the ultra-large virtual docking library launched by NIH-funded (National Institute of Mental Health) researchers (link: https://www.sciencedaily.com/releases/2019/02/190206131924.htm). In the context of HPC-accelerated drug design, access to such a large virtual chemical space is a great advantage in virtual screening, because it increases the chance of finding potentially therapeutic molecules, especially ones that are easily synthesizable. A further advantage of starting from billions of virtual compounds is the possibility of deriving equally large but highly focused chemical spaces by applying criteria defined by specific project needs, such as precise physicochemical and ADMET properties, thus speeding up the subsequent steps of drug development.
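The one-step combinatorial construction described above can be illustrated with a minimal Python sketch. It is not Dompé's reaction language or pipeline: the `Reagent` and `Reaction` classes, the functional-group tags, and the toy reagent list are all hypothetical, and a real system would match substructures and chemical environment rather than simple tags.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Reagent:
    name: str
    groups: frozenset  # hypothetical functional-group tags


@dataclass(frozen=True)
class Reaction:
    name: str
    group_a: str  # tag required on the first reagent
    group_b: str  # tag required on the second reagent


def enumerate_products(reagents, reactions):
    """Yield (reaction, reagent_a, reagent_b) for every compatible pair.

    Each tuple stands for one virtual compound reachable in one step.
    """
    for rxn in reactions:
        side_a = [r for r in reagents if rxn.group_a in r.groups]
        side_b = [r for r in reagents if rxn.group_b in r.groups]
        for a, b in product(side_a, side_b):
            if a != b:  # a reagent is not paired with itself
                yield (rxn.name, a.name, b.name)


def library_size(n_side_a, n_side_b):
    """Upper bound on products of one two-component reaction."""
    return n_side_a * n_side_b


# Toy data, purely illustrative.
reagents = [
    Reagent("aniline", frozenset({"amine"})),
    Reagent("benzylamine", frozenset({"amine"})),
    Reagent("benzoic acid", frozenset({"carboxylic_acid"})),
    Reagent("acetic acid", frozenset({"carboxylic_acid"})),
    Reagent("bromobenzene", frozenset({"aryl_halide"})),
    Reagent("phenylboronic acid", frozenset({"boronic_acid"})),
]
reactions = [
    Reaction("amide_coupling", "amine", "carboxylic_acid"),
    Reaction("suzuki_coupling", "aryl_halide", "boronic_acid"),
]

products = list(enumerate_products(reagents, reactions))
for entry in products:
    print(entry)
```

The count grows as the product of the reagent pools on each side of a reaction, which is why millions of reagents can yield hundreds of billions of one-step products. With made-up pool sizes, `library_size(1_000_000, 500_000)` gives 500 billion pairs for a single reaction; these are not figures from the source.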
Dompé Ultra High Performance Virtual Screening Platform
The huge tangible virtual chemical space developed by Dompé can be exploited in exascale virtual screening applications, which integrate seamlessly with the ANTAREX-enhanced LiGen™ tool. On this basis, Dompé has developed an Ultra High Performance Virtual Screening Platform (EXSCALATE) that combines LiGen™, an exascale-ready software able to screen billions of compounds in a very short time, with a library of trillions of compounds. The platform runs on the CINECA (and PRACE) supercomputing infrastructure. Dompé will make the exascale virtual screening platform available to the EU (and national) healthcare systems, providing the fastest response to viral infections and multidrug-resistant bacteria. Exascale computing is also needed to rethink the current drug discovery and development process for diseases where the one target-one drug model cannot account for both the complexity of the disease and the characteristics of the individual patient. The key to successful structure-based drug design with EXSCALATE is the evaluation of a chemical space large enough to enable the identification of chemical structures with the best complementary pattern of interactions with the biological target under investigation, along with suitable physicochemical characteristics, novelty, and synthetic feasibility. Thanks to this HPC architecture, in December 2018 Dompé was able to run the first reported exascale-ready structure-based virtual screening experiment: 1.2 billion molecules against 1 biological target, using more than 900K hardware threads (> 220K physical cores) on the MARCONI Tier-0 system at CINECA. The simulation required only 3 hours. Soon, EXSCALATE will be able to count on the new Italian supercomputer, LEONARDO, which will be hosted at CINECA. This machine is expected to reach a peak performance of 270 petaflops, which will make the process of screening and identifying new drugs even faster.
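To put the December 2018 run in perspective, the reported figures can be turned into throughput numbers with some back-of-the-envelope arithmetic. The sketch below uses only the values quoted above (1.2 billion molecules, 3 hours, and 220K physical cores taken as a lower bound). The extrapolation to a larger library assumes perfectly linear scaling and a uniform cost per molecule; that is a simplification, not a measured result.

```python
MOLECULES = 1_200_000_000   # molecules screened in the reported run
PHYSICAL_CORES = 220_000    # source says "> 220K", so this is a lower bound
HOURS = 3                   # reported wall-clock time

seconds = HOURS * 3600

# Aggregate throughput across the whole machine (molecules per second).
aggregate_rate = MOLECULES / seconds

# Average throughput of a single physical core (molecules per second).
per_core_rate = aggregate_rate / PHYSICAL_CORES

# Total compute budget consumed by the run.
core_hours = PHYSICAL_CORES * HOURS


def hours_to_screen(n_molecules, rate_per_second=aggregate_rate):
    """Wall-clock hours to screen n_molecules at a fixed aggregate rate.

    Assumes linear scaling, which is an idealization.
    """
    return n_molecules / rate_per_second / 3600


print(f"aggregate rate: {aggregate_rate:,.0f} molecules/s")
print(f"per-core rate:  {per_core_rate:.3f} molecules/s")
print(f"core-hours:     {core_hours:,}")
print(f"500 billion molecules at the same rate: "
      f"{hours_to_screen(500_000_000_000):,.0f} hours")
```

At roughly 111,000 molecules per second, screening a 500-billion-compound library against one target on the same machine would take on the order of 1,250 hours under these assumptions. This illustrates why larger systems such as LEONARDO matter for libraries of this size.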