Considering Protein Interaction Sites in Co-Expression Networks and a Tool Called Intersite
December 2007
Introduction
Structural Genomics examines the link between sequence (Genomics) and function, which is determined directly by structure. The field balances sequence analysis, database mining, and consideration of protein structure. This project attempts to continue that balance by applying sequence and structure information from GenBank entries, together with human knowledge, to co-expression network analysis. Although it involves no three-dimensional calculation or prediction, it shows that other kinds of structural data are worth considering in many subfields of Bioinformatics.
To analyze a co-expression network of Affymetrix (Affy) probes, the structure of each probe’s putative protein product was represented by its number of interaction sites. A web-based tool called Intersite was developed as a Group Decision Support System for annotating the interaction sites on proteins.
Discussion
This project was an interesting exercise in using multiple Bioinformatics tools (and creating some new ones) to ask and answer questions relating sequence and structure. Future work may include separate analyses of the groups of VV probes, grouped by their associated proteins’ numbers of interaction sites. In addition, as in [1], the betweenness, closeness, hub-ness, and hierarchy level of the proteins in the pathways could be calculated.
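As a rough illustration of the network measures mentioned above, the following minimal sketch computes hub-ness (taken here simply as node degree), closeness, and betweenness on a small undirected graph. The graph, the protein names P1 to P5, and the function names are hypothetical and are not taken from the project data; betweenness uses Brandes' algorithm for unweighted graphs, and hierarchy level is not covered.

```python
from collections import deque

# Hypothetical toy network (adjacency sets); not data from this project.
graph = {
    "P1": {"P2"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P2"},
    "P4": {"P2", "P5"},
    "P5": {"P4"},
}

def degree(graph, node):
    """Hub-ness approximated as the number of direct neighbours."""
    return len(graph[node])

def closeness(graph, node):
    """Reachable nodes divided by the sum of shortest-path distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

def betweenness(graph):
    """Unnormalized betweenness (Brandes) for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                        # nodes in order of BFS discovery
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = {v: 0 for v in graph}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):         # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair is counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

bc = betweenness(graph)
for node in sorted(graph):
    print(node, degree(graph, node), round(closeness(graph, node), 3), bc[node])
```

In this toy graph P2 has the highest value on all three measures, which is the kind of protein a hub analysis would flag.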
A useful outcome of the project is the ability to build protein networks based on interaction site similarity and to visualize the resulting clusters. For the data used here, nuclear, membrane, and secreted proteins each share interaction site sequences unique to their location. This is probably because nuclear proteins must be delivered to the nucleus using signal sequences and special chaperone interaction sites; the same is true for secreted and membrane proteins. Further analysis could consider the remaining locations listed for proteins with multiple subcellular location values.
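One way such a similarity network could be built is sketched below. It assumes each protein is annotated with a set of interaction site identifiers, links two proteins when the Jaccard similarity of their sets reaches a threshold, and reports connected components as clusters. The similarity measure, the threshold of 0.5, and all protein and site names are assumptions made for illustration; the measure actually used in the project is not stated here.

```python
from itertools import combinations

# Hypothetical annotations: protein -> set of interaction site identifiers.
sites = {
    "protA": {"s1", "s2", "s3"},
    "protB": {"s2", "s3"},
    "protC": {"s7"},
    "protD": {"s7", "s8"},
}

def jaccard(a, b):
    """Shared sites divided by all sites seen in either protein."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_network(sites, threshold=0.5):
    """Link every pair of proteins whose site sets are similar enough."""
    graph = {p: set() for p in sites}
    for p, q in combinations(sites, 2):
        if jaccard(sites[p], sites[q]) >= threshold:
            graph[p].add(q)
            graph[q].add(p)
    return graph

def clusters(graph):
    """Connected components of the network, found by depth-first search."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result

network = similarity_network(sites)
groups = sorted(sorted(c) for c in clusters(network))
print(groups)
```

With these made-up annotations the proteins fall into two groups, protA with protB and protC with protD, which could then be compared against subcellular location.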
References
[1] Gerstein, Mark. Presentation. “Understanding Protein Function on a Genome scale using networks.” First Annual Midwest Computational Biology and Bioinformatics Symposium, Northwestern University.
[2] Kim PM, Lu LJ, Xia Y, Gerstein MB. “Relating three-dimensional structures to protein networks provides evolutionary insights.” Science. 2006 Dec 22; 314(5807):1938-41.
[3] Finn, Marshall, Bateman. “iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions.” Bioinformatics. 21: 3 2005.
[4] Finn, et al. “Pfam: clans, web tools and services.” Bioinformatics. 35:D 2005.
[5] Cramer GR, Ergül A, Grimplet J, Tillett RL, Tattersall EA, Bohlman MC, Vincent D, Sonderegger J, Evans J, Osborne C, Quilici D, Schlauch KA, Schooley DA, Cushman JC. “Water and salinity stress in grapevines: early and late changes in transcript and metabolite profiles.” Funct Integr Genomics. 2007 Apr;7(2):111-34. Epub 2006 Nov 29.
[6] de la Fuente, et al. Bioinformatics. Vol. 20, No. 18, pp. 3565-3574, 2004.
[8] Wu, Cathy H, et al. “The iProClass integrated database for protein functional analysis.” Computational Biology and Chemistry. 28 (2004) 87-96.
[9] Liu, Hongfang, et al. “BioThesaurus: a web-based thesaurus of protein and gene names.” Bioinformatics. 22: 1 2006. pp. 103-105.
