sci-biology/cd-hit (gentoo)

Package Information

Description:
CD-HIT is a widely used program for clustering and comparing large sets of protein or nucleotide sequences. It is fast and can handle extremely large databases. CD-HIT significantly reduces the computational and manual effort in many sequence analysis tasks and aids in understanding the data structure and correcting bias within a dataset. The CD-HIT package includes CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, and over a dozen scripts. CD-HIT (CD-HIT-EST) groups similar proteins (DNAs) into clusters that meet a user-defined similarity threshold. CD-HIT-2D (CD-HIT-EST-2D) compares two datasets and identifies the sequences in db2 that are similar to db1 above a threshold. CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads. Usage of the other programs and scripts is described in the CD-HIT user's guide.
Homepage:
http://weizhong-lab.ucsd.edu/cd-hit/
License:
GPL-2
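
The threshold-based clustering and two-dataset comparison mentioned in the description can be illustrated with a small sketch. This is not CD-HIT's implementation: the `identity` function below is a simple stand-in built on Python's `difflib`, and the names `greedy_cluster` and `similar_to_db1` are hypothetical, chosen only for this example. It assumes a greedy, longest-first pass in which each sequence joins the first representative it matches at or above the threshold.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Share of the shorter sequence covered by matching blocks.

    Illustrative stand-in only; CD-HIT computes identity differently.
    """
    if not a or not b:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Map each representative name to the names of its cluster members."""
    clusters: dict[str, list[str]] = {}
    # Process longest sequences first so every representative is at
    # least as long as the members assigned to it.
    for name in sorted(seqs, key=lambda n: len(seqs[n]), reverse=True):
        for rep in clusters:
            if identity(seqs[rep], seqs[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


def similar_to_db1(db1: dict[str, str], db2: dict[str, str], threshold: float) -> list[str]:
    """Names of db2 sequences similar to at least one db1 sequence."""
    return [
        name
        for name, seq in db2.items()
        if any(identity(ref, seq) >= threshold for ref in db1.values())
    ]


if __name__ == "__main__":
    seqs = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFS",
        "c": "GGGGGGGGGGPPPPPPPPPP",
    }
    print(greedy_cluster(seqs, 0.9))
    # {'a': ['a', 'b'], 'c': ['c']}
    print(similar_to_db1({"a": seqs["a"]}, {"b": seqs["b"], "c": seqs["c"]}, 0.9))
    # ['b']
```

Raising the threshold yields more, smaller clusters; lowering it merges more sequences under each representative.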

Versions

Version    EAPI  Keywords      Slot
4.6.6-r1   8     ~amd64 ~x86   0

Metadata

Raw Metadata XML
<pkgmetadata>
	<maintainer type="project">
		<email>sci-biology@gentoo.org</email>
		<name>Gentoo Biology Project</name>
	</maintainer>
	<longdescription>
CD-HIT is a widely used program for clustering and comparing large sets
of protein or nucleotide sequences. It is fast and can handle extremely
large databases. CD-HIT significantly reduces the computational and
manual effort in many sequence analysis tasks and aids in understanding
the data structure and correcting bias within a dataset.
The CD-HIT package includes CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D,
CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, and over a dozen scripts. CD-HIT
(CD-HIT-EST) groups similar proteins (DNAs) into clusters that meet a
user-defined similarity threshold. CD-HIT-2D (CD-HIT-EST-2D) compares two
datasets and identifies the sequences in db2 that are similar to db1 above
a threshold. CD-HIT-454 identifies natural and artificial duplicates from
pyrosequencing reads. Usage of the other programs and scripts is
described in the CD-HIT user's guide.
	</longdescription>
	<upstream>
		<remote-id type="google-code">cdhit</remote-id>
		<remote-id type="github">weizhongli/cdhit</remote-id>
	</upstream>
</pkgmetadata>

USE Flags

Flag Description 4.6.6-r1
openmp Build with OpenMP support (parallel computing); requires >=sys-devel/gcc-4.2 built with USE="openmp"

Files

Manifest

Unmatched Entries
Type  File                 Size
DIST  cd-hit-4.6.6.tar.gz  1152570 bytes