Abstract
Biological data is being created at ever-increasing rates as different high-throughput technologies are implemented for a wide variety of discovery platforms. It is crucial for researchers to be able not only to access this information but also to integrate it well and synthesize new holistic ideas about various topics. A key ingredient in this process of data-driven, knowledge-based discovery is the availability of databases that are user-friendly, that contain integrated information, and that are efficient at storage and retrieval of data.
Implementations of integrated databases include GenBank,77 SWISS-PROT,41 InterPro,29 PIR,903 etc. No single one of these databases contains all the information one might need to understand a specific topic, so specialized databases are required to give researchers better access to specific information. Databases can be built using a variety of tools, techniques, and approaches,416, 756, 897 but some methods are especially powerful for managing large amounts of data and can also integrate this information.
We have created several purpose-built integrated databases. We describe one of them—the Protein Phosphatase DataBase (PPDB)—in this chapter. PPDB has been constructed using a data integration and analysis system called Kleisli. Kleisli can model complex biological data and their relationships, and integrate information from distributed and heterogeneous data resources.162, 189, 894
Original language | English |
---|---|
Pages (from-to) | 401-416 |
Number of pages | 16 |
Journal | The Practical Bioinformatician |
Publication status | Published - 2004 |
Externally published | Yes |