R&D: Exploring Potential Biosafety Implications in DNA Information Storage
Study provides valuable insights into safe advancement of DNA storage technology and emphasizes need for comprehensive biosafety evaluation in synthetic biology applications.
This is a Press Release edited by StorageNewsletter.com on May 1, 2025 at 2:00 pm
Biosafety and Health has published an article written by Shangzhe Li, Key Laboratory of Special Pathogens and Biosafety, Center for Biosafety Mega-Science, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China, and University of Chinese Academy of Sciences, Beijing 101408, China; Yue Shi, Jing Yang, Haizhou Liu, and Lijia Jia, Key Laboratory of Special Pathogens and Biosafety, Center for Biosafety Mega-Science, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China; and Di Liu, Key Laboratory of Special Pathogens and Biosafety, Center for Biosafety Mega-Science, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China, and University of Chinese Academy of Sciences, Beijing 101408, China.
Abstract: “Deoxyribonucleic acid (DNA) information storage has emerged as a promising solution to address the challenges of traditional silicon-based data storage systems. However, the biosafety implications of artificially synthesized DNA sequences in this technology remain understudied. This research evaluates the biosafety risks associated with five representative DNA storage encoding methods [Church, Goldman, DNA Fountain, Grass, and movable-type (MT) encoding] by analyzing their sequence similarities to natural biological DNA. Through Kraken2 taxonomic classification and Basic Local Alignment Search Tool for nucleotides (BLASTn) alignment analysis, we found that while most artificially designed DNA sequences showed significant differences from known biological sequences, specific encoding methods produced sequences similar to natural genomes. The MT encoding method showed the highest annotation rate (4.59 %) in Kraken2 analysis, while Goldman and Fountain methods demonstrated significant local sequence alignments in BLASTn analysis. Sequence length positively correlated with annotation rates, suggesting longer sequences pose potentially higher biosafety risks. Furthermore, aligned sequences often exhibited characteristics of tandem repeats, particularly in non-coding regions. These findings highlight the importance of incorporating biosafety considerations in DNA storage encoding method development and suggest that randomization strategies may help mitigate potential risks. Our study provides valuable insights into the safe advancement of DNA storage technology and emphasizes the need for comprehensive biosafety evaluation in synthetic biology applications.”
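The abstract notes that aligned sequences often looked like tandem repeats. As an illustration only, and not a reproduction of the study's Kraken2 or BLASTn analysis, the following minimal Python sketch shows how a short tandem repeat could be flagged in a candidate storage sequence before synthesis. The function name `longest_tandem_repeat`, the `max_unit` parameter, and the example sequence are all hypothetical and do not come from the paper.

```python
def longest_tandem_repeat(seq, max_unit=6):
    """Return (unit, copies, start) for the tandem repeat covering the most bases.

    Hypothetical helper for illustration: brute-force scan over repeat units
    of length 1..max_unit. Returns ("", 0, -1) if no unit occurs at least
    twice in a row.
    """
    best = ("", 0, -1)
    best_span = 0
    n = len(seq)
    for unit_len in range(1, max_unit + 1):
        for start in range(n - unit_len + 1):
            unit = seq[start:start + unit_len]
            copies = 1
            pos = start + unit_len
            # Count how many times the unit repeats back-to-back.
            while seq[pos:pos + unit_len] == unit:
                copies += 1
                pos += unit_len
            span = copies * unit_len
            # Keep the first repeat found with the largest covered span.
            if copies >= 2 and span > best_span:
                best, best_span = (unit, copies, start), span
    return best


if __name__ == "__main__":
    # Made-up example sequence, not data from the study.
    print(longest_tandem_repeat("ACGTATATATATGC"))
```

A screen of this kind could only serve as a cheap first filter; similarity to natural genomes would still need to be checked against reference databases with tools such as those named in the abstract.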