R&D: Capacity of Secondary Structure Avoidance Codes for DNA Sequences

The authors prove that the problem of constructing secondary structure avoidance (SSA) sequences for any given secondary structure stem length m can be characterized by a constrained system, so the capacity of SSA sequences can be calculated by the classic spectral radius approach in constrained coding theory.

IEEE Transactions on Molecular, Biological, and Multi-Scale Communications has published an article written by Chen Wang and Hui Chu, Key Laboratory of Cryptologic Technology and Information Security, China, and School of Cyber Science and Technology, Shandong University, Qingdao, Shandong, China; Gennian Ge, School of Mathematical Sciences, Capital Normal University, Beijing, China; and Yiwei Zhang, Key Laboratory of Cryptologic Technology and Information Security, China, and School of Cyber Science and Technology, Shandong University, Qingdao, Shandong, China.

Abstract: “In DNA sequences, we have the celebrated Watson-Crick complement T̅=A, A̅=T, C̅=G, and G̅=C. The phenomenon of secondary structure refers to the tendency of a single-stranded DNA sequence to fold back upon itself, which is usually caused by the existence of two non-overlapping reverse complement substrings. The property of secondary structure avoidance (SSA) forbids a sequence to contain such reverse complement substrings, and it is a key criterion in the design of single-stranded DNA sequences for both DNA storage and DNA computing. In this paper, we prove that the problem of constructing SSA sequences for any given secondary structure stem length m can be characterized by a constrained system, and thus the capacity of SSA sequences can be calculated by the classic spectral radius approach in constrained coding theory. We analyze how to choose the generating set, which is a subset of vertices in a de Bruijn graph, for the constrained system, which leads to some explicit constructions of SSA codes. In particular, our constructions have optimal rates 1.1679 bits/nt and 1.5515 bits/nt when m=2 and m=3, respectively. In addition, we combine the SSA constraint together with the homopolymer run-length-limit constraint and analyze the capacity of sequences satisfying both constraints.”
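
The two ideas in the abstract, the SSA property and a capacity obtained from a spectral radius, can be illustrated with a short Python sketch. This is not the authors' construction. The function names (`is_ssa`, `spectral_radius`, `best_rate_m2`) are hypothetical, the SSA property is assumed to mean that no two non-overlapping length-m substrings are reverse complements of each other, and the m=2 search assumes a candidate generating set is formed by dropping the four self-reverse-complementary 2-mers and keeping exactly one 2-mer from each remaining reverse-complement pair.

```python
from itertools import product
from math import log2

BASES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(word):
    """Return the Watson-Crick reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(word))


def is_ssa(seq, m):
    """True if no two non-overlapping length-m substrings of seq are
    reverse complements of each other (the SSA definition assumed here)."""
    first_seen = {}  # m-mer -> start index of its earliest occurrence
    for j in range(len(seq) - m + 1):
        word = seq[j:j + m]
        first_seen.setdefault(word, j)
        i = first_seen.get(reverse_complement(word))
        # The earliest occurrence is the one most likely to be disjoint.
        if i is not None and i + m <= j:
            return False
    return True


def spectral_radius(matrix, iterations=500):
    """Estimate the spectral radius of a non-negative square matrix.

    Power iteration runs on matrix + I, which shifts the radius by 1 and
    removes periodicity, starting from a positive vector.
    """
    n = len(matrix)
    shifted = [[matrix[r][c] + (1 if r == c else 0) for c in range(n)]
               for r in range(n)]
    vec = [1.0 / n] * n
    growth = 1.0
    for _ in range(iterations):
        nxt = [sum(shifted[r][c] * vec[c] for c in range(n)) for r in range(n)]
        growth = sum(nxt)  # vec sums to 1, so this is the growth factor
        vec = [x / growth for x in nxt]
    return growth - 1.0


def best_rate_m2():
    """Search the assumed candidate generating sets for m=2.

    Returns (rate in bits/nt, chosen set of 2-mers).
    """
    words = ["".join(p) for p in product(BASES, repeat=2)]
    # Reverse-complement pairs, skipping self-reverse-complementary 2-mers.
    pairs = sorted({tuple(sorted((w, reverse_complement(w))))
                    for w in words if w != reverse_complement(w)})
    best = (0.0, frozenset())
    for choice in product(*pairs):
        allowed = frozenset(choice)
        # Letter-level transition matrix: a -> b is allowed iff "ab" is kept.
        matrix = [[1 if a + b in allowed else 0 for b in BASES] for a in BASES]
        radius = spectral_radius(matrix)
        rate = log2(radius) if radius > 0 else 0.0
        best = max(best, (rate, allowed), key=lambda item: item[0])
    return best


rate, generating_set = best_rate_m2()
print(f"best rate for m=2: {rate:.4f} bits/nt using {sorted(generating_set)}")
```

Under these assumptions the search should report a rate close to the 1.1679 bits/nt quoted in the abstract for m=2. The sketch covers only m=2; the paper's general analysis for larger stem lengths and for the combined homopolymer run-length constraint is not reproduced here.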
