Machine learning guiding early-stage drug discovery process

Machine learning systems are helping researchers in Wisconsin discover promising drug candidates through more efficient data analysis.

Tony Gitter is an associate professor in the Department of Biostatistics and Medical Informatics at UW-Madison, and an investigator at the Morgridge Institute for Research. Speaking yesterday at the Wisconsin Biohealth Summit in Madison, he explained how his research team worked with other scientists at the university to identify new potential antibiotics.

“There’s a lot of interest and need to develop novel types of antibiotics because there’s increasing bacterial resistance to some of the classic antibiotics that we have on the market,” he said.

Scientists working under UW-Madison Professor James Keck in the Biomolecular Chemistry Department started with a focus on a bacterium that causes pneumonia, and determined that breaking apart two structural proteins would kill the bacteria. From there, they began looking for chemicals that could do so through the traditional screening process. After months of research, scientists and graduate students working at the Keck Lab had tested over 427,000 chemicals.

“Their reward is finding that 99.9 percent of what they tested completely failed. There’s a tiny, tiny sliver of chemicals that might be promising, but most of those are also not very good,” Gitter said. “But what this generates is a lot of data, so now we have an area that’s really ripe for machine learning to step in.”

Gitter’s research group trained machine learning models on the data in hopes of finding out what makes that small number of candidate chemicals different from all the others that failed.

After scoring a billion more commercially available chemicals, the system came up with a list of just 68 “that looked very appealing,” he said. After acquiring and testing those candidates, the researchers found that about half of them showed promise for killing the bacteria.
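For readers curious about what that selection step might look like in code, here is a minimal Python sketch. It is not the research team's method: the `score_chemical` function is a hypothetical stand-in for a trained model, the chemical IDs are made up, and the toy library holds 10,000 entries rather than a billion. It only illustrates the general idea of scoring a large library and keeping a short list of the highest-scoring candidates.

```python
import heapq
import random


def score_chemical(chemical_id: int) -> float:
    """Hypothetical stand-in for a trained model's score (higher = more promising)."""
    # A random number seeded by the ID keeps this toy example repeatable.
    return random.Random(chemical_id).random()


def top_candidates(chemical_ids, k):
    """Stream over the library and keep only the k highest-scoring chemicals."""
    scored = ((score_chemical(cid), cid) for cid in chemical_ids)
    # heapq.nlargest holds at most k items at a time, so the full
    # library never needs to be sorted or stored in memory.
    return heapq.nlargest(k, scored)


# Toy library of 10,000 made-up IDs; the short list size of 68 comes from the article.
shortlist = top_candidates(range(10_000), k=68)
print(len(shortlist))
```

Streaming the scores through `heapq.nlargest` is one reasonable design choice when a library is too large to sort in one go. A real screening pipeline would differ in many details.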

“So we go from 99.9 percent complete failures, to an almost 50 percent hit rate, because the machine learning system is guiding our decisions about what to test,” he said. “We can have a much more customized view of which chemicals might actually work.”
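The size of that jump can be checked with some quick arithmetic, sketched below in Python. The numbers are rounded readings of Gitter's remarks, not reported counts: "99.9 percent complete failures" is treated as a 0.1 percent hit rate, and "about half" of the 68 candidates is treated as 34.

```python
# Rough arithmetic on the figures quoted in the article.
# Assumption: "99.9 percent failed" is read as a 0.1 percent hit rate,
# and "about half" of the 68 candidates is read as 34 hits.

initial_hit_rate = 0.001            # 100 percent minus 99.9 percent

ml_candidates = 68                  # chemicals the system short-listed
ml_hits = ml_candidates // 2        # assumed: "about half" -> 34
ml_hit_rate = ml_hits / ml_candidates

# How many times higher the hit rate is with machine learning guiding the choice.
fold_improvement = ml_hit_rate / initial_hit_rate

print(f"Hit rate among ML-selected candidates: {ml_hit_rate:.0%}")
print(f"Approximate fold improvement: {fold_improvement:.0f}x")
```

Under those assumptions, the hit rate among the machine-selected chemicals works out to roughly 500 times that of the original screen.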

While artificial intelligence applications like this are having a large impact on the early stages of drug discovery, Gitter said future developments might improve late-stage efforts as well.

“Machine learning is not yet reducing our animal testing needs, it’s not yet reducing the number of failed clinical trials that we have,” he said. “Those are going to be some grand challenges that we might think about going forward.”

–By Alex Moe