Bringing data science and AI/ML tools to infectious disease research
H3D Foundation and Ersilia present
Session 4: Generative models
Breakout Session 4
Gemma Turon, @TuronGemma, firstname.lastname@example.org
Miquel Duran-Frigola, @mduranfrigola, email@example.com
Ersilia Open Source Initiative, @ersiliaio, https://ersilia.io
30th September 2022
I have a list of selected hits, but I would like to improve the molecules with weaker activity and diversify the collection.
Similarity search can be treated as a type of generative model: it looks for molecules similar to a query within an already generated virtual library.
In this case, the generative step has already been done, and we only need to filter the library for the molecules closest to our query.
- Much faster
- Potentially less diversity
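As a rough illustration of what such a search does, the sketch below ranks a tiny made-up library by Tanimoto similarity to a query. The fingerprints (sets of "on" bit indices), the molecule names and the choice of Tanimoto similarity are all assumptions for illustration; the similarity models used in this session may rely on a different fingerprint and metric.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bits."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def top_k_similar(
    query: Set[int], library: Dict[str, Set[int]], k: int
) -> List[Tuple[str, float]]:
    """Return the k library entries most similar to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: each fingerprint is a set of "on" bit indices.
query_fp = {1, 4, 7, 9}
toy_library = {
    "mol_A": {1, 4, 7, 9, 12},  # shares 4 of 5 bits with the query
    "mol_B": {1, 4},            # shares 2 of 4 bits with the query
    "mol_C": {20, 21, 22},      # shares nothing with the query
}

for name, score in top_k_similar(query_fp, toy_library, k=2):
    print(f"{name}\t{score:.2f}")
```

Because the library is fixed in advance, the search only scores and sorts existing molecules, which is why it is fast but cannot propose anything outside the library.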
We will look for alternatives to one of the molecules from the MMV Malaria Box we used in session 2.
Select a molecule of interest from the MMV Malaria Box Dataset (session2/data)
Predicted activity against malaria:
Score = 4
Similarity Models in the EMH
We will work with two similarity models:
166.4 billion possible molecules of up to 17 atoms
Interactive browsing at: http://faerun.gdb.tools/
10 million possible molecules curated from GDBChEMBL
Download and browse: http://gdb.unibe.ch
Both models return the 100 molecules closest to the query
Model Fetch and Predict
# Running as a Python package
from ersilia import ErsiliaModel

# Query molecule (SMILES) selected in the previous step
smiles = "Cc1ccc(Nc2nc(NCCO)c3ccccc3n2)cc1C"

# First similarity model
model = ErsiliaModel("eos4b8j")
model.serve()
output_eos4b8j = model.predict(input=smiles, output="pandas")
model.close()

# Second similarity model
model = ErsiliaModel("eos7jlv")
model.serve()
output_eos7jlv = model.predict(input=smiles, output="pandas")
model.close()
Find the 100 compounds from each database that are most similar to your selected molecule:
- Are the hits obtained from each database different?
- Are hits from GDBMedChem synthetically more accessible than hits from GDBChEMBL?
- Do any molecules have a higher predicted antimalarial potential than the original hit?
- Which molecules would you select for further screening?
- Are you taking any ADMET considerations into account for the selection, after what we reviewed in session 3?
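The first and third questions can be approached with a few lines of Python once the hits are available as plain lists. The sketch below assumes that the hits from each model have been extracted into lists of SMILES strings and that a predicted activity score is available for each molecule, with higher meaning more active (an assumption). The helper names are hypothetical, and all SMILES and candidate scores are made-up placeholders, not model output.

```python
from typing import Dict, List, Set


def compare_hits(hits_a: List[str], hits_b: List[str]) -> Dict[str, Set[str]]:
    """Split two hit lists into shared and database-specific molecules.

    Comparison is by exact string match, so the same molecule written as
    two different SMILES strings would be counted as two molecules.
    """
    set_a, set_b = set(hits_a), set(hits_b)
    return {
        "shared": set_a & set_b,
        "only_a": set_a - set_b,
        "only_b": set_b - set_a,
    }


def better_than_reference(scores: Dict[str, float], reference_score: float) -> List[str]:
    """Return molecules scoring above the reference, best first."""
    better = [smiles for smiles, value in scores.items() if value > reference_score]
    return sorted(better, key=lambda smiles: scores[smiles], reverse=True)


# Placeholder hit lists standing in for the two model outputs.
hits_model_1 = ["CCO", "CCN", "CCC"]
hits_model_2 = ["CCN", "CCCl"]

overlap = compare_hits(hits_model_1, hits_model_2)
print("Shared hits:", sorted(overlap["shared"]))
print("Only in first list:", sorted(overlap["only_a"]))
print("Only in second list:", sorted(overlap["only_b"]))

# Placeholder predicted scores; the reference is the score of the original hit.
candidate_scores = {"CCO": 3.2, "CCN": 4.5, "CCC": 5.1}
reference_score = 4.0
print("Better than original:", better_than_reference(candidate_scores, reference_score))
```

Canonicalising the SMILES strings with a cheminformatics toolkit before comparing them would make the overlap count more reliable.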
By Gemma Turon