Generative AI in drug discovery

BioStrand

02.20.2024

Generative AI is emerging as a strategic force in drug discovery, opening new possibilities across molecule generation, antibody design, de novo drug and vaccine development, and drug repurposing. As life sciences organizations work to accelerate innovation and reduce development costs, generative models offer a way to design more precise, effective, and personalized therapies. This blog explores how these technologies are being applied across the R&D pipeline, the deep learning techniques powering them, and the key challenges, such as data quality, bias, and explainability, that must be addressed to fully realize their impact.

Generative AI in biopharma

Following a breakout year of rapid growth, generative AI has been widely, and justifiably, described as an undisputed game-changer for almost every industry. A recent McKinsey Global Survey lists the healthcare, pharma, and medical products sectors among the top regular users of generative AI. The report also highlights that organizations that have successfully maximized the value derived from their traditional AI capabilities tend to be more ardent adopters of generative AI tools.

The AI revolution in the life sciences industry continues at an accelerated pace, reflected partly in the increasing number of partnerships, mergers, and acquisitions centered around the transformative potential of AI. For the life sciences industry, therefore, generative AI represents the logical next step beyond conventional predictive AI methods, opening new horizons in computational drug discovery.

Here, then, is a quick overview of generative AI and its potential and challenges vis-a-vis in silico drug discovery and development.

What is generative AI?

Where traditional AI systems make predictions based on large volumes of data, generative AI refers to a class of AI models that are capable of generating entirely new output based on a variety of inputs including text, images, audio, video, 3D models, and more.

Based solely on the input-output modality, generative AI models can be categorized as text-to-text (ChatGPT-4, Bard), text-to-speech (Vertex AI), text-to-video (Emu Video), text-to-audio (Voicebox), and text-to-image (Adobe Firefly); image-to-text (Pix2Struct), image-to-image (SinCode AI), and image-to-video (LeiaPix); video-to-video (Runway AI); and more.

Currently, the most prominent types of generative AI models include Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Recurrent Neural Networks (RNNs), diffusion models, flow-based models, autoregressive models, transformer-based models, and style transfer models.

What is the role of generative AI in drug discovery?

It is estimated that generative AI technologies could yield as much as $110 billion a year in economic value for the life sciences industry. These technologies can play a transformative role across the drug discovery pipeline.

Generative AI can boost the precision, productivity, and efficiency of target identification and help accelerate the drug discovery process. These technologies will provide drug discovery teams with the capabilities to generate or design novel molecules with the desired properties and curate a set of drug candidates with the highest probability of success. This in turn would free up valuable R&D resources to focus on orphan, rare, and untreatable diseases.

These technologies will enable life sciences R&D to cope with the explosion in digital data, in diverse formats such as unstructured text, images, patient records, PDFs, and emails, and ingest and process multimodal data at scale. The ability to extract patterns from vast volumes of patient data can empower more personalized treatments and improved patient outcomes.

AI systems played an instrumental role in accelerating the development of an effective mRNA vaccine for COVID-19, where they were put in place to speed up the research process. Generative AI technologies are now being leveraged to address some of the challenges associated with designing RNA therapeutics and to design mRNA medicines with optimal safety and performance.

As with traditional AI systems, generative AI will help complement experimental drug discovery processes to further enhance the speed and accuracy of drug discovery and development while reducing the time and costs involved.

How do different generative models compare for molecule design?

Generative models like VAEs (Variational Autoencoders) and GANs (Generative Adversarial Networks) are increasingly applied to de novo drug design. VAEs are particularly effective for exploring latent chemical space, offering structured representations that capture chemical relationships. GANs, on the other hand, excel at generating structurally novel molecules, often producing higher diversity in candidate structures.
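
As a rough illustration of what exploring latent chemical space can look like, the sketch below linearly interpolates between the latent vectors of two molecules. It assumes a trained VAE already exists; the two-dimensional vectors and the `decode` step mentioned in the comments are hypothetical placeholders, not part of any specific library.

```python
def interpolate_latent(z_start, z_end, steps):
    """Return `steps` evenly spaced points on the line from z_start to z_end."""
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")
    if steps < 2:
        raise ValueError("need at least two steps (the two endpoints)")
    return [
        [a + (b - a) * i / (steps - 1) for a, b in zip(z_start, z_end)]
        for i in range(steps)
    ]


# Hypothetical latent codes for two known molecules.
# Real VAEs typically use many more dimensions than two.
z_known_a = [0.0, 2.0]
z_known_b = [1.0, 0.0]

path = interpolate_latent(z_known_a, z_known_b, steps=3)
for z in path:
    # In a real pipeline each point would be passed to the VAE decoder,
    # e.g. a hypothetical decode(z), to obtain a candidate structure.
    print(z)
```

The intent is that intermediate points decode to structures sharing features of both endpoints, although decoded outputs would still need validity and property checks.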

Combining both models in a generative pipeline helps balance molecular novelty with drug-like properties.

Model comparison

Model	Strengths	Weaknesses	Use Case
VAE	Explores latent space; captures structure–property relationships	Lower novelty	Scaffold hopping
GAN	High novelty; structurally diverse outputs	Training instability	De novo design
Combined Use	Balance between control and diversity	May increase complexity	Balanced candidate profiles

Why deep learning matters in generative drug discovery

Behind many of the advances in generative AI lies deep learning. It’s what allows these models to go beyond pattern recognition—to actually learn chemical behavior, understand biological targets, and propose entirely new drug candidates that make sense in context.

Deep learning models don’t just process data; they learn from it across multiple formats—molecular structures, protein sequences, even scientific text—and help connect the dots. That’s what makes them so powerful in applications like molecule generation, antibody design, and precision medicine.

By pairing deep learning with other tools—like AlphaFold2 or biomedical knowledge graphs—researchers can sharpen predictions, improve interpretability, and ultimately design better drug candidates, faster.

How is generative AI used for compound screening in drug discovery?

Pharma and biotech companies are increasingly turning to generative AI for in silico screening of novel compounds. These models are trained on molecular datasets (e.g., SMILES strings, molecular graphs, or 3D conformers), and their outputs are validated using drug-likeness metrics such as QED scores, along with docking simulations and ADMET predictions.

To build a generative AI model for molecules, most researchers:

Use a curated SMILES-based dataset
Train a VAE or GAN on molecular representations
Validate outputs using metrics such as QED, synthesizability, and binding affinity predictions
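
The validation step above can be sketched as a simple filter. This is a minimal illustration, not a production workflow: the `Candidate` fields, the thresholds, and all score values are made-up placeholders, and the scores are assumed to have been computed upstream by separate QED, synthesizability, and binding affinity predictors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    smiles: str
    qed: float            # drug-likeness in [0, 1], higher is better
    synth_score: float    # hypothetical synthesizability in [0, 1], higher is easier
    binding_score: float  # hypothetical predicted affinity in [0, 1], higher is stronger


def filter_candidates(candidates, training_smiles, min_qed=0.5, min_synth=0.5):
    """Drop duplicates, training-set copies, and low scorers; rank the rest by binding."""
    seen = set(training_smiles)
    kept = []
    for cand in candidates:
        if cand.smiles in seen:
            continue  # already in the training set or generated earlier
        seen.add(cand.smiles)
        if cand.qed >= min_qed and cand.synth_score >= min_synth:
            kept.append(cand)
    return sorted(kept, key=lambda c: c.binding_score, reverse=True)


# Illustrative inputs only: the SMILES are common molecules, the scores are invented.
training_set = ["CCO"]
generated = [
    Candidate("CCO", 0.9, 0.9, 0.9),                    # copy of a training molecule
    Candidate("c1ccccc1", 0.4, 0.9, 0.8),               # fails the QED threshold
    Candidate("CC(=O)Oc1ccccc1C(=O)O", 0.6, 0.8, 0.3),
    Candidate("CC(=O)Nc1ccc(O)cc1", 0.6, 0.9, 0.7),
    Candidate("CC(=O)Nc1ccc(O)cc1", 0.6, 0.9, 0.7),     # duplicate generation
]

shortlist = filter_candidates(generated, training_set)
for cand in shortlist:
    print(cand.smiles, cand.binding_score)
```

Note that this sketch compares SMILES as raw strings. Because one molecule can be written as several different SMILES strings, a real pipeline would canonicalize them with a cheminformatics toolkit before deduplicating.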

These workflows can be combined with retrieval-augmented generation (RAG) pipelines to further refine candidate selection using up-to-date biomedical literature.
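
To show the shape of the retrieval step in such a pipeline, the sketch below ranks a few text snippets against a query and assembles them into a prompt. It is a toy under stated assumptions: bag-of-words cosine similarity stands in for the learned embeddings a real RAG system would normally use, the snippets are invented placeholders rather than real literature, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with non-zero similarity, most similar first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Prepend retrieved snippets to the question, ready for a generative model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Placeholder snippets, not real publications.
snippets = [
    "placeholder note on kinase inhibitor selectivity and toxicity",
    "placeholder note on antibody humanization",
    "placeholder note on kinase structure",
]
query = "kinase inhibitor toxicity"

print(build_prompt(query, snippets))
```

The assembled prompt would then be passed to a generative model, so that candidate assessment can draw on the retrieved context rather than on training data alone.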

What are the key generative AI applications in drug discovery?

Overall, generative AI offers a transformative approach to drug discovery, significantly accelerating the identification and optimization of promising drug candidates while reducing costs and experimental uncertainty.

Molecule generation

Generative AI models represent a more efficient approach to navigating the vast chemical space and creating novel molecular structures with desired properties. Currently, a range of techniques, such as VAEs, GANs, RNNs, genetic algorithms, and reinforcement learning, are being used to generate molecules with desirable ADMET properties. One approach synergistically combines generative AI, predictive modeling, and reinforcement learning to generate valid molecules with desired properties. With their ability to simultaneously optimize multiple properties of a molecule, generative AI systems can help identify candidates with the most balanced profile in terms of efficacy, safety, and other pharmacological parameters.
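
One common way to make "most balanced profile" concrete is Pareto dominance: a candidate is kept if no other candidate is at least as good on every property and strictly better on at least one. The sketch below illustrates that general idea only; it is not the specific method used in the work referenced above, and the candidate names and scores are invented, with every score assumed to be "higher is better".

```python
def dominates(a, b):
    """True if profile a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(profiles):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in profiles.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in profiles.items()
            if other_name != name
        )
    ]


# Hypothetical (efficacy, safety, pharmacokinetics) scores, all higher-is-better.
profiles = {
    "cand_A": (0.9, 0.4, 0.7),
    "cand_B": (0.6, 0.8, 0.6),
    "cand_C": (0.5, 0.3, 0.5),  # worse than cand_A on every axis
    "cand_D": (0.7, 0.7, 0.9),
}

front = pareto_front(profiles)
print(front)
```

Candidates on the front represent different trade-offs rather than a single winner, so choosing among them still depends on project priorities.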

Antibody design & development

The continuing evolution of artificial intelligence (AI), machine learning (ML), and deep learning (DL) techniques has helped significantly advance computational antibody discovery as a complement to traditional lab-based processes. The advent of protein language models (PLMs), generative AI models trained on protein sequences, has the potential to unlock further innovations in in silico antibody design and development. Generative antibody design can significantly enhance the speed, quality, and efficiency of antibody design, help create more targeted and potent treatment modalities, and generate novel target-specific antibodies beyond the scope of conventional design techniques. Recent developments in this field have demonstrated the ability of zero-shot generative AI, in which models produce designs for a target without having been trained on antibodies known to bind it, to generate novel antibody designs that were tested and functionally validated in the wet lab without the need for any further optimization.

De novo drug design

The power of generative AI models is also being harnessed to create entirely new drug candidates by predicting molecular structures that interact favorably with biological targets. The increasing popularity of generative techniques has given rise to generative chemistry, which has been applied successfully across atom-based, fragment-based, and reaction-based methods for generating novel structures. Generative models have helped extend the capabilities of rule-based de novo molecule generation, with recent research highlighting the potential of “rule-free” generative deep learning for de novo molecular design. The continuing evolution of generative AI towards multimodality will help further advance de novo design using complementary insights derived from diverse data modalities.

Drug repurposing

Generative AI can expedite the discovery of new uses for approved drugs, thereby circumventing the development time and costs associated with traditional drug discovery. One study demonstrated the power of generative AI technologies like ChatGPT models to accelerate the review of existing scientific knowledge in an extensive Internet-based search space to prioritize drug repurposing candidates. New research also demonstrates how generative AI can rapidly model clinical trials to identify new uses for existing drugs and therapeutics. These technologies are already being applied successfully to the critical task of repurposing existing medicines for the treatment of rare diseases.

Precision drug discovery

By analyzing large-scale multimodal datasets, including multiomics data, genome-wide association studies (GWAS), disease-specific repositories, biobank-scale studies, patient data, genetic evidence, clinical data, imaging data, etc., generative AI models can help design drug candidates with the highest likelihood of efficacy and minimal side effects for specific patient populations.

What are the generative AI challenges in drug discovery?

Despite their immense potential, there are still several challenges that need to be addressed before generative AI technologies can be successfully integrated into drug discovery workflows.

Limited and noisy training data: Generative models require large, high-quality, diverse datasets for training. In drug discovery, experimental data is often sparse and noisy, with errors and outliers. The availability of large volumes of high-quality data, especially for rare diseases or novel drug targets, remains a challenge.
Bias, generalizability, and ethical risks: Generative models trained on biased or limited datasets may produce biased or unrealistic outputs. It is therefore crucial to ensure that these models are trained on unbiased, diverse datasets and that they generalize across the vast chemical space and range of biological targets. These technologies also raise significant ethical and regulatory considerations, including concerns about patient safety, data privacy, and intellectual property rights.
Black-box models and lack of explainability: Finally, and most importantly, generative models are inherently black boxes, which raises questions about the interpretability and explainability of their outputs.

These challenges notwithstanding, generative AI has the potential to usher in the next generation of AI-driven drug discovery.
