A three-year collaboration between Pfizer and the Research Center for Molecular Medicine of the Austrian Academy of Sciences (CeMM) has resulted in a new AI-driven drug discovery method that could make it faster and easier to identify small molecules with therapeutic potential.
In an article published April 25 in Science, CeMM researchers described how they created and scaled an AI and machine learning platform that measures how hundreds of small molecules bind to thousands of different human proteins, generating a catalog that can serve as a starting point for new drug development. The models and all associated data are freely available to other researchers through a web application, according to a press release from CeMM.
“We were amazed to see how artificial intelligence and machine learning can elevate our understanding of small-molecule behavior in human cells,” study lead Georg Winter, Ph.D., said in the release. “We hope that our catalog of small molecule-protein interactions and the associated artificial intelligence models can now provide a shortcut in drug discovery approaches.”
More than 90% of all marketed drugs are small molecules, or ligands, that work by binding to proteins. But researchers have identified ligands for only about 20% of human proteins, creating a blind spot that holds back not only drug development but also science’s understanding of medicine more broadly. Researchers have attempted to shed light on this area by creating new datasets, such as one published April 20 in Nature by scientists at Paris’ Institut Pasteur, and coming up with models like Pharos, a target identification tool created by researchers with the National Institutes of Health-sponsored Illuminating the Druggable Genome consortium.
CeMM and Pfizer’s approach relies on what’s known as “chemical proteomics,” in which small-molecule compounds called chemical probes are tested to see how, and how well, they bind to proteins of interest. The researchers started by testing how a library of 407 small-molecule fragments interacts with human proteins, which yielded nearly 47,700 different protein-ligand interactions involving more than 2,600 different proteins. Nearly 90% of those proteins previously had no known ligand, the researchers noted in the paper.
The team demonstrated the approach’s translational potential by using the results to develop synthetic ligands against several different targets. They also used the data to build a machine learning framework for models that could predict how many different proteins a small molecule would bind to and how difficult those proteins were to access. In addition, the models can predict whether a ligand interacts with whole subsets of protein types, such as RNA-binding or transporter proteins, or with proteins in certain locations in a cell.
Pfizer, which helped fund the research, was an early adopter of AI for drug development. The Big Pharma company has used the technology to monitor vaccine and medicine safety since 2014 and even used it to develop Paxlovid, its blockbuster antiviral for COVID-19.