Foundations of Comparative Genomics by Arcady R. Mushegian

Author: Mushegian, Arcady R.
Language: eng
Format: epub
Publisher: Elsevier Science
Published: 2007-12-15


10

How Many Protein Families are There?

Publisher Summary

This chapter discusses how a protein family can be defined. A natural way to do so is to use homology: a family is a set of proteins in which every member is homologous to every other member and has no homologs outside the set. In practice, however, this definition runs into at least two difficulties. The first, a technical one, is that proteins consist of domains that can be fused and split, and in many cases homology relationships characterize protein domains rather than whole proteins. Thus, all operations on families have to take the domain organization of proteins into account. A much larger problem is that, in general, the complete set of homologs of any protein is unknown. No method for finding all the homologs of a protein is perfect, and the inference of homology is a statistical one, with associated rates of false negatives and false discoveries. There is always a chance of missing some homologs, especially the highly diverged ones. Homologous genes that have not yet been sequenced will obviously be missed as well. Thus, every known protein family is really a sample of the true family.
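
The homology-based definition above can be illustrated with a small sketch. It is not the book's method. It assumes that homology is treated as transitive, so that families are the connected components of a graph whose edges are detected homolog pairs, and it ignores domain organization entirely. The function name families_from_homology and the toy protein labels are hypothetical. The second call shows how a single missed homolog pair (a false negative) splits one true family into two observed ones.

```python
from collections import defaultdict


def families_from_homology(proteins, homolog_pairs):
    """Group proteins into families as connected components of the
    homology graph (assumption: homology is transitive)."""
    parent = {p: p for p in proteins}

    def find(x):
        # Follow parent links to the representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in homolog_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    # Sorted output so results are deterministic and easy to compare.
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: five proteins and their "true" homolog pairs.
proteins = ["A", "B", "C", "D", "E"]
true_pairs = [("A", "B"), ("B", "C"), ("D", "E")]

# Suppose the search method misses the diverged pair B-C.
detected_pairs = [("A", "B"), ("D", "E")]

true_families = families_from_homology(proteins, true_pairs)
observed_families = families_from_homology(proteins, detected_pairs)
print(true_families)
print(observed_families)
```

In this toy case the true grouping has two families, while the detected pairs yield three, because protein C is left on its own once its only detected link is lost.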
