R&D: Efficient Explicit and Pseudo-Random Constructions of Constrained Codes for DNA Storage
Further generalize idea and establish systematic framework of pseudo-random construction for wide range of codes
This is a Press Release edited by StorageNewsletter.com on January 10, 2025 at 2:00 pmIEEE Transactions on Communications has published an article written by Xuan He; Yajuan Liu; Tong Wang; and Xiaohu Tang, Information Coding and Transmission Key Lab of Sichuan Province, Southwest Jiaotong University, China.
Abstract: “GC-content and homopolymers are two constraints widely considered in DNA storage. Extensive experiments show that DNA strings with too high/low GC-content and/or long homopolymers are more likely to suffer from errors. We say a length-n DNA string satisfies the (ϵ, ℓ)-constraints if no homopolymer exceeds ℓ and its GC-content falls between (0.5 – ϵ)n and (0.5 + ϵ)n, where ϵ is a positive real number and ℓ is a positive integer; say it satisfies the (δ, ℓ)-prefix-constraints if no homopolymer exceeds ℓ and the GC-content of its length-i prefix falls between i/2-δ and i/2+δ for any i ∈ {1, 2, …, n}, where δ is a positive integer. In this paper, we realize the first enumerative coding for the capacity-achieving (ϵ, ℓ)-constrained codes and (δ, ℓ)-prefix-constrained codes with polynomial encoding/decoding complexity. Meanwhile, we propose an efficient pseudo-random construction of capacity-approaching (ϵ, ℓ)-constrained codes and (ϵ, ℓ)-constrained error correction codes. We further generalize the idea and establish a systematic framework of pseudo-random construction for a wide range of codes.“