Galaxy-compatible Tool for Rapid Aptamer Clustering and HT-SELEX Data Analysis

Nikita Aleksandrovich Skrylnik


Aim: Applying deep sequencing to the analysis of DNA aptamer selection results requires specialized bioinformatic software. Analysis steps include searching for homologous sequences, clustering them, and comparing cluster enrichment across samples. These procedures allow deeper characterization of the selected sequences in terms of target affinity, non-specific amplification, and off-target binding, thus highlighting the most promising variants or motifs.

Materials and Methods: Sequencing results from systematic evolution of ligands by exponential enrichment (SELEX) for 40-nucleotide aptamers against the extracellular CD47 protein were used as datasets for comparative clustering. A modified fast clustering script was developed based on FASTAptamer-Cluster and adapted as a Galaxy tool. The algorithm was modified to terminate the edit distance calculation once the threshold value is reached; a distance exceeding the threshold is then assigned to the non-matching pair of sequences.

Results and Discussion: We have developed a set of Galaxy-compatible applications for rapid clustering of sequencing results and subsequent comparative analysis of the clusters. Our clustering algorithm is specifically optimized for finding the highly homologous sequences that usually form aptamer clusters, and it provides an average 8.4-fold increase in speed.

Conclusion: Our modified clustering algorithm substantially surpasses existing alternatives in speed, thus simplifying the analysis of large data sets, while its Galaxy version allows easy integration into standard workflows for preprocessing and analysis of deep sequencing results.
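The early-termination idea described in Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' script: the function names `bounded_edit_distance` and `greedy_cluster`, the use of Levenshtein distance, and the seed-by-abundance clustering loop (in the spirit of FASTAptamer-Cluster) are assumptions made for illustration only.

```python
from typing import Dict, List


def bounded_edit_distance(a: str, b: str, threshold: int) -> int:
    """Return the Levenshtein distance between a and b, or threshold + 1
    as soon as the distance is known to exceed the threshold."""
    # A length difference larger than the threshold already rules out a match.
    if abs(len(a) - len(b)) > threshold:
        return threshold + 1
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        # Row minima never decrease, so once every cell exceeds the
        # threshold the final distance must exceed it too: stop early.
        if min(current) > threshold:
            return threshold + 1
        previous = current
    return previous[-1] if previous[-1] <= threshold else threshold + 1


def greedy_cluster(counts: Dict[str, int], threshold: int) -> List[List[str]]:
    """Hypothetical greedy clustering: the most abundant unclustered
    sequence seeds a cluster and collects all sequences within threshold."""
    remaining = sorted(counts, key=lambda s: (-counts[s], s))
    clusters = []
    while remaining:
        seed, members, rest = remaining[0], [remaining[0]], []
        for seq in remaining[1:]:
            if bounded_edit_distance(seed, seq, threshold) <= threshold:
                members.append(seq)
            else:
                rest.append(seq)
        clusters.append(members)
        remaining = rest
    return clusters


if __name__ == "__main__":
    # Toy read counts, invented for this example.
    reads = {"ACGTACGT": 100, "ACGTACGA": 10, "TTTTCCCC": 50, "TTTTCCCG": 5}
    for cluster in greedy_cluster(reads, threshold=1):
        print(cluster)
```

Because most sequence pairs in a selection pool are far apart, cutting the dynamic-programming table off after a few rows avoids most of the work for non-matching pairs, which is the plausible source of the reported speed-up for highly homologous clusters.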
