clustering and assembling short reads for bioinformatics

 Rainbow is an ultra-fast and memory-efficient tool for clustering and
 assembling short reads, especially those produced by RAD-seq.
 First, Rainbow clusters reads using a spaced seed method. Then, Rainbow
 implements a heterozygote-calling-like strategy to divide potential
 groups into haplotypes in a top-down manner. Along a guided tree, it
 iteratively merges sibling leaves in a bottom-up manner if they are
 similar enough. Here, similarity is defined by comparing the second
 reads of a RAD segment. This approach tries to collapse heterozygotes
 while discriminating repetitive sequences. Finally, Rainbow uses a greedy
 algorithm to locally assemble merged reads into contigs. Rainbow outputs
 not only the optimal but also suboptimal assembly results. Based on
 simulation and a real guppy RAD-seq dataset, Rainbow has been shown to be
 more competent than other tools in dealing with RAD-seq data.
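
 The spaced-seed clustering step can be illustrated with a minimal Python
 sketch. This is not Rainbow's actual implementation: the function names,
 the mask and the example reads are all hypothetical, and the sketch only
 assumes that reads agreeing at every "care" position of a seed mask are
 placed in the same cluster, so a mismatch at a "don't care" position is
 tolerated.

```python
from collections import defaultdict


def spaced_seed_key(read, mask):
    """Return the bases of `read` at positions where `mask` is '1'.

    zip() stops at the shorter input, so only the first len(mask)
    bases of the read contribute to the key.
    """
    return "".join(base for base, m in zip(read, mask) if m == "1")


def cluster_reads(reads, mask):
    """Group reads that share the same spaced-seed key (illustrative only)."""
    clusters = defaultdict(list)
    for read in reads:
        if len(read) < len(mask):
            continue  # too short to be seeded; simply skipped in this sketch
        clusters[spaced_seed_key(read, mask)].append(read)
    # dicts keep insertion order, so clusters appear in first-seen order
    return list(clusters.values())


# Hypothetical example data: position 3 (0-based) is a "don't care" position.
reads = ["ACGTACGT", "ACGAACGT", "TTGTACGA", "ACGTACGT"]
mask = "11101111"
clusters = cluster_reads(reads, mask)
```

 With this mask, the first, second and fourth reads differ at most at the
 ignored position and fall into one cluster, while the third read forms a
 cluster of its own.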