Artificial intelligence (AI) has evolved from a futuristic vision to a mainstream technology, highlighted by the introduction of tools like OpenAI's ChatGPT and Google's Gemini. In recent years, AI has become increasingly integrated into the field of genomics. This integration has enabled new scientific discoveries while simultaneously raising important ethical questions [1]. Interviews with two researchers working at this intersection offer perspectives on both the potential and the challenges of combining AI with genomics.
Leveraging AI for Insights on Disease
Jian Hu, Ph.D., Assistant Professor of Human Genetics at Emory School of Medicine, leads a lab that specializes in integrating AI and genomics. This group develops new statistical and machine-learning methods for analyzing various types of biological data, including transcriptomics, proteomics, and metabolomics, as well as cell morphology from staining data. A key focus of Hu’s lab is applying these methods to understand disease mechanisms and improve diagnosis for cancer and Alzheimer's disease. “By identifying these [disease] mechanisms, we can find some potential targets for precision medicine,” Hu stated. “We want to develop new medicine and give guidance on the clinical treatment of these diseases.”
To further understand these diseases, the lab uses a multidisciplinary approach that often incorporates AI. According to Hu, AI has three main components: datasets, hardware, and algorithms. His lab focuses on the last of these, developing models that convert existing data into knowledge rather than generating new data. While Hu acknowledged the advantages of AI models, particularly in handling complex, high-dimensional data, he emphasized that his lab is open to employing a variety of methods. He noted that the lab's research is primarily driven by biological questions and that it chooses the most appropriate tools for the task at hand. These tools include convolutional and recurrent neural networks for analyzing complex biological data, such as tumor regions in tissue sections.
Hu further highlighted the capabilities of AI in medical diagnostics, explaining that its power lies in its ability to mimic and even surpass expert performance. For example, a pathologist may be able to tell whether a tumor tissue is aggressive, but may find it difficult to articulate the rationale behind that judgment and may be overwhelmed by very large datasets. In contrast, AI models are proficient at analyzing these medical images at scale while also integrating diverse data types such as RNA and protein measurements.
Discussing the lab's research, Hu described its work analyzing spatial transcriptomics data from cancer tissues. This work led to the development of TESLA (Tumor Edge Structure and Lymphocyte multi-level Annotation), a machine learning framework for high-resolution tissue annotation in histology images. The tool enables pixel-level joint tissue segmentation and annotation, detects tumor-immune microenvironments, and differentiates various lymphoid structures based on gene markers. In an evaluation on seven cancer datasets, TESLA demonstrated notable performance in generating super-resolution gene expression images [2].
Another tool, currently under development in Hu’s lab, links changes in cell morphology observed in tumor tissues to underlying molecular or proteomic changes. Hu explained that this AI-driven approach integrates multimodal data, combining cellular imagery with molecular biology, to provide a comprehensive view of the tumor microenvironment.
These tools demonstrate how AI can be used to integrate various data types and provide improved biological insights. Additional details about Hu’s work in AI can be found on his lab's website.
Ethical and Regulatory Considerations
Harry Farmer, Senior Researcher at the Ada Lovelace Institute, believes that AI and genomics will be among the most impactful technologies of this century. He noted that both fields have been instrumental in addressing important scientific challenges, such as the use of next-generation sequencing to discover COVID-19 variants and the application of AI and machine learning in determining protein structures. However, Farmer noted that both fields have also generated controversy over the risks they pose, including ethical concerns in genetic engineering and AI's potential societal harms. While their convergence is promising in the medical field, it poses significant regulatory challenges.
Farmer works on a collaboration between the Ada Lovelace Institute and the Nuffield Council on Bioethics that explores the ethical issues at the intersection of AI and genomics. In particular, the group focuses on AI-driven advances in genomics that enhance predictions about patient health and drug responses, a set of capabilities known as AI-powered genomic health prediction (AIGHP). The initiative examines the societal and healthcare implications of these capabilities in the UK using diverse methodologies such as public engagement and futures thinking.
The initial findings have shown significant growth in AI-powered genomics, which has been driven by machine learning and deep learning, with substantial private investment in areas like data collection, drug discovery, and precision medicine. The research also highlighted existing and emerging trends in protein research, drug development, and the prediction of phenotypic traits from genomic data.
Farmer explained that this particular blend of growing themes and capabilities in AI-powered genomics signals that two broad healthcare strategies are becoming increasingly practical over the next five to ten years:
- AI-powered genomic health personalization: the capacity to identify and understand how the treatment of the same medical issue might differ from one individual to another based on genetic differences, and to tailor treatments to these unique genetic profiles.
- AI-powered genomic health prediction: using genetic information to predict the likelihood of individuals developing certain health conditions, experiencing effective or adverse responses to certain treatments or drugs, or being impacted by lifestyle behaviors.
In the most recent phase of the project, the focus shifted to the challenges these capabilities pose and to how AIGHP could transform healthcare in the UK. Using various methods to identify potential risks associated with AIGHP, the group found a need to evaluate, and possibly update, the UK's legal, regulatory, and governance frameworks to manage these risks effectively. Areas currently being explored include data protection, medical device regulation, and the use of genomic data by the health insurance industry, as well as how AIGHP could be integrated into the healthcare system in ways that minimize risks and involve relevant stakeholders. The group’s full discussions and results will be published this spring.
Looking Ahead
The work of these two researchers underscores the vast potential and significant challenges presented by the integration of AI and genomics. On the research side, labs like Hu’s are demonstrating that applying AI can advance our understanding of disease, through cutting-edge tools like TESLA and ongoing projects that explore the tumor microenvironment. Meanwhile, Farmer's work highlights the fine line researchers and policymakers will have to walk to ensure these technologies are employed safely. The future of healthcare and personalized medicine depends on our ability to develop these tools while ensuring that advancements in AI and genomics are used responsibly and equitably.
References
1. Sen, S. K., Green, E. D., Hutter, C. M., Craven, M., Ideker, T., & Di Francesco, V. (2024). Opportunities for basic, clinical, and bioethics research at the intersection of machine learning and genomics. Cell Genomics, 4(1), 100466. https://doi.org/10.1016/j.xgen.2023.100466
2. Hu, J., Coleman, K., Zhang, D., Lee, E. B., Kadara, H., Wang, L., & Li, M. (2023). Deciphering tumor ecosystems at super resolution from spatial transcriptomics with TESLA. Cell Systems, 14(5), 404–417.e4. https://doi.org/10.1016/j.cels.2023.03.008