R&D: ADRS-Cnet, Adaptive Models of Dimensionality Reduction Methods for DNA Storage Clustering Algorithms
Paper proposes model based on multilayer perceptron (MLP) that classifies DNA sequence features and intelligently selects optimal dimensionality reduction method, thereby enhancing subsequent clustering performance.
This is a Press Release edited by StorageNewsletter.com on January 23, 2025 at 2:33 pm
ArXiv has published an article written by Bowen Liu, University of Southampton, Mathematical Sciences, Southampton, United Kingdom, and Jiankun Li, University of Southampton, Mathematical Sciences, Shanghai, China.
Abstract: “DNA storage technology, with its high density, long-term preservation capability, low maintenance requirements, and compact physical size, is emerging as a promising option for large-scale data storage. However, extracting features from DNA sequences of varying lengths can lead to the problem of dimensionality, which needs to be addressed. Techniques such as PCA, UMAP, and t-SNE are commonly used to project high-dimensional data into a lower-dimensional space, but their effectiveness varies across different datasets. To address this challenge, this paper proposes a model based on a multilayer perceptron (MLP) that classifies DNA sequence features and intelligently selects the optimal dimensionality reduction method, thereby enhancing subsequent clustering performance. Experimental results, tested on open-source datasets and compared with multiple benchmark methods, demonstrate that our model not only excels in classification performance but also significantly improves clustering accuracy, indicating that this approach effectively mitigates the challenges posed by high-dimensional features in clustering models.”
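To make the idea in the abstract concrete, here is a minimal sketch of the general pattern it describes: turn variable-length DNA sequences into fixed-length feature vectors, then let a small MLP pick one of the named reduction methods. This is not the authors' implementation. The k-mer frequency features, the function names `kmer_features` and `mlp_select`, the layer sizes, and the random weights are all assumptions made for illustration; the abstract does not specify the feature extractor, the network architecture, or how the training labels are produced.

```python
import itertools
import random

METHODS = ["PCA", "UMAP", "t-SNE"]  # candidate reducers named in the abstract


def kmer_features(seq, k=2):
    """Map a DNA sequence of any length to a fixed 4**k vector of k-mer frequencies.

    Hypothetical feature choice: the abstract does not say which features are used.
    """
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with non-ACGT symbols are skipped
        if j is not None:
            counts[j] += 1.0
    total = sum(counts) or 1.0  # avoid division by zero for very short sequences
    return [c / total for c in counts]


def mlp_select(x, w1, b1, w2, b2):
    """One-hidden-layer MLP forward pass (ReLU); returns the top-scoring method name."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return METHODS[scores.index(max(scores))]


# Untrained, randomly initialised weights (assumption): the choice printed here
# is arbitrary. A real selector would first be trained on labelled examples.
rng = random.Random(0)
features = kmer_features("ACGTTGCAAC", k=2)
n_in, n_hidden = len(features), 8
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(len(METHODS))]
b2 = [0.0] * len(METHODS)
choice = mlp_select(features, w1, b1, w2, b2)
print(len(features), choice)
```

The feature length grows as 4**k regardless of sequence length, which is one way the dimensionality problem mentioned in the abstract can arise. The selected method would then be applied to the features before clustering.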