Doctoral Thesis
Illuminating the Dark Proteome
2017
Key information
Authors:
Advisors:
Published on
06/01/2017
Abstract
Molecular models of a protein’s structure can give detailed insight into the mechanisms underlying its function, especially when viewed in combination with sequence features. In theory, 3D structural models are now available for many proteins; in practice, however, it is often difficult to find all appropriate models and view them together with sequence features. Thus, we developed Aquaria, a new web resource that provides 46 million pre-calculated structural models using homology from sequence to structure, 10 times more than currently available from other resources. This results in at least one matching structure for 87% of Swiss-Prot proteins and a median of 35 structures per protein. Using Aquaria, we surveyed the known or visible proteome. Its complement, the ‘unknown’ or ‘dark’ proteome, i.e., regions of proteins that remain stubbornly inaccessible to both experimental structure determination and modeling, was scanned, stored and indexed in the Dark Proteome Database. Using these systems, we performed the most recent structural modeling study, covering 546,000 proteins across many organisms, and found that 44–54% of the proteome in eukaryotes and viruses is dark, compared with only 14% for archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder, transmembrane regions or compositional bias. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. This thesis also suggests the existence of transmembrane regions undetected by current prediction methods.
Our work therefore suggests several new directions for research in structural and computational biology. It should help focus future research efforts on shedding light on the remaining dark proteome, potentially revealing molecular processes of life that are currently unknown.
Publication details
Authors from the community:
Nelson Perdigão
ist31928
Advisors from this institution:
Agostinho Cláudio da Rosa
ist11812
Scientific Domain (FOS)
electrical-engineering-electronic-engineering-information-engineering - Electrical Engineering, Electronic Engineering and Information Engineering
Keywords
- Big Data
- Databases
- Homology
- Proteins
- Structure
Publication language (ISO code)
eng - English
Access to the publication:
Embargo lifted
Embargo end date:
01/12/2017
Institution name
Instituto Superior Técnico