Binary package “sumaclust” in ubuntu disco

fast and exact clustering of genomic sequences

 With the development of next-generation sequencing, efficient tools are
 needed to handle millions of sequences in reasonable amounts of time.
 Sumaclust, a program developed by the LECA, aims to cluster sequences in
 a way that is both fast and exact. It was designed for the type of data
 generated by DNA metabarcoding, i.e. short, entirely sequenced markers.
 Sumaclust
 clusters sequences using the same clustering algorithm as UCLUST and
 CD-HIT. This algorithm is mainly useful for detecting the 'erroneous'
 sequences that derive from 'true' sequences during amplification and
 sequencing protocols.
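
 The greedy, centre-based clustering commonly associated with UCLUST and
 CD-HIT can be illustrated with a minimal sketch. This is not Sumaclust's
 implementation: the function names, the abundance-first ordering, the
 similarity measure (longest common subsequence length divided by the
 longer sequence's length) and the 0.9 threshold are all assumptions
 chosen for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (exact DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed similarity: LCS length normalised by the longer sequence."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def greedy_cluster(seq_counts, threshold=0.9):
    """Greedy centre-based clustering (illustrative, hypothetical helper).

    seq_counts maps each distinct sequence to its read count. Sequences
    are visited from most to least abundant; each one joins the first
    existing centre it matches at or above the threshold, otherwise it
    becomes a new centre.
    """
    ordered = sorted(seq_counts.items(), key=lambda kv: -kv[1])
    clusters = {}  # centre sequence -> list of member sequences
    for seq, _count in ordered:
        for centre in clusters:
            if similarity(seq, centre) >= threshold:
                clusters[centre].append(seq)
                break
        else:
            clusters[seq] = [seq]
    return clusters


# Made-up example data: two abundant sequences and one rare variant of each.
reads = {
    "ACGTACGTAC": 100,
    "ACGTACGTAA": 2,
    "TTTTGGGGCC": 50,
    "TTTTGGGGCA": 1,
}
result = greedy_cluster(reads, threshold=0.9)
```

 In this toy input, each rare variant ends up attached to the abundant
 sequence it most plausibly derives from, which mirrors the 'erroneous'
 versus 'true' sequence distinction described above.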