Protein Clusters FAQ
Q1 What method is used to create protein clusters
Protein similarity is determined from the aggregated BLAST hits obtained with blastp at an E-value threshold of 10E-3. Prokaryotic and viral protein clusters are generated with a hierarchical method that uses a modified blastp score (adjusted for length) as the distance between two proteins. Eukaryotic protein clusters are cliques defined by symmetrical best hits.
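The eukaryotic case can be illustrated with a small sketch. It is not the production pipeline: the hit list, the "genome|protein" ID convention, and the function names below are all hypothetical, and raw scores stand in for whatever scoring is actually used. It only shows what "cliques defined by symmetrical best hits" means: two proteins are linked when each is the other's best hit in the opposite genome, and a cluster is a group in which every pair is linked.

```python
# Hypothetical toy blastp hits: (query, subject, score).
# IDs follow an assumed "genome|protein" convention for this sketch only.
HITS = [
    ("gA|p1", "gB|p1", 200.0),
    ("gB|p1", "gA|p1", 198.0),
    ("gA|p1", "gC|p1", 180.0),
    ("gC|p1", "gA|p1", 181.0),
    ("gB|p1", "gC|p1", 175.0),
    ("gC|p1", "gB|p1", 176.0),
    ("gA|p2", "gB|p1", 90.0),  # one-directional best hit only
    ("gB|p1", "gA|p2", 88.0),
]


def genome_of(protein_id):
    """Return the genome part of an assumed 'genome|protein' ID."""
    return protein_id.split("|", 1)[0]


def best_hits(hits):
    """Map (query, subject genome) to the best-scoring subject protein."""
    best = {}
    for query, subject, score in hits:
        if genome_of(query) == genome_of(subject):
            continue  # only hits between different genomes are considered
        key = (query, genome_of(subject))
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: subject for key, (subject, _) in best.items()}


def symmetrical_best_hits(hits):
    """Return protein pairs that are each other's best hit."""
    best = best_hits(hits)
    pairs = set()
    for (query, _), subject in best.items():
        if best.get((subject, genome_of(query))) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def maximal_cliques(pairs):
    """Enumerate maximal cliques with a basic Bron-Kerbosch recursion."""
    neighbors = {}
    for pair in pairs:
        a, b = tuple(pair)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(clique))
            return
        for node in list(candidates):
            expand(clique | {node},
                   candidates & neighbors[node],
                   excluded & neighbors[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(neighbors), set())
    return cliques


if __name__ == "__main__":
    for clique in maximal_cliques(symmetrical_best_hits(HITS)):
        print(sorted(clique))
```

In this toy data gA|p2 hits gB|p1, but gB|p1 prefers gA|p1, so gA|p2 has no symmetrical best hit and stays out of the cluster.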
Q2 How do I find all clusters that contain proteins from Salmonella plasmids?
Search for ‘Salmonella[orgn]’ and use the Limits page to restrict the results by Nucleotide Source ‘Plasmid’.
Q3 How do I download all protein clusters?
Use FTP to download the cluster release (ftp://ftp.ncbi.nih.gov/genomes/CLUSTERS).
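For scripted downloads, a minimal sketch using Python's standard ftplib might look like the following. It assumes the server accepts anonymous logins and still serves the path given above; the helper names (split_ftp_url, list_release, download_file) are hypothetical, and the file names in the release directory are not assumed here, so the script only lists them.

```python
from ftplib import FTP
from urllib.parse import urlparse

# Release location as given in this FAQ.
RELEASE_URL = "ftp://ftp.ncbi.nih.gov/genomes/CLUSTERS"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    return parts.hostname, parts.path


def list_release(url=RELEASE_URL):
    """Return the names found in the release directory."""
    host, path = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()      # anonymous login (assumed to be allowed)
        ftp.cwd(path)
        return ftp.nlst()


def download_file(name, url=RELEASE_URL):
    """Download one file from the release directory into the working directory."""
    host, path = split_ftp_url(url)
    with FTP(host) as ftp, open(name, "wb") as out:
        ftp.login()
        ftp.cwd(path)
        ftp.retrbinary("RETR " + name, out.write)


if __name__ == "__main__":
    # List the directory first, then pass a listed name to download_file().
    for entry in list_release():
        print(entry)
```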
Q4 How frequently are protein clusters updated?
Starting in July 2013, protein cluster releases will be coordinated with RefSeq releases.
Q5 Why do I not see my favorite organism in protein clusters?
Only RefSeq genomes are in scope for protein cluster analysis.