A thoracic oncology database was developed to serve as a comprehensive repository for clinical and laboratory data for the purposes of translational research. The database will serve translational cancer researchers within the Thoracic Oncology Research Program. This database is adaptable to other cancer models, as well as other human diseases.
The Thoracic Oncology Program Database Project was created to serve as a comprehensive, verified, and accessible repository for well-annotated cancer specimens and clinical data to be available to researchers within the Thoracic Oncology Research Program. This database also captures a large volume of genomic and proteomic data obtained from various tumor tissue studies. A team of clinical and basic science researchers, a biostatistician, and a bioinformatics expert was convened to design the database. Variables of interest were clearly defined and their descriptions were written within a standard operating manual to ensure consistency of data annotation. Using one protocol for prospective tissue banking and another for retrospective banking, tumor and normal tissue samples were collected from patients who consented to these protocols. Clinical information for these patients, such as demographics, cancer characterization, and treatment plans, was abstracted and entered into an Access database. Proteomic and genomic data have been included in the database and linked to the clinical information for the patients described within it. The data from each table were linked using the relationships function in Microsoft Access to allow the database manager to connect clinical and laboratory information during a query. The queried data can then be exported for statistical analysis and hypothesis generation.
1. University Clinical Research Protocols:
2. Clinical Data Collection Protocol:
3. Specimen Collection Protocol:
Tissue samples
Blood samples
Other body fluids:
4. Building the Informatics Infrastructure:
5. Designing the Contents of Each Table:
6. Establishing Relationships Among Tables:
7. Querying:
8. Exporting Data:
9. Importing Data:
10. Updating the Database:
11. Access to the Database:
12. Representative Results:
A researcher may be interested in knowing the clinical significance of over-expression of the protein Paxillin in non-small cell lung cancer. As this researcher has generated a great deal of TMA data in the database for Paxillin, the data manager approves the researcher’s request to access clinical information to correlate with the laboratory data. The data manager runs a query that combines the Patients Table and the TMA Table. Variables of interest from the Patients Table include the patient’s date of birth, race, cancer histology, cancer stage, date of diagnosis, vital status, date of death, and date of last contact. With variables such as stage and age at diagnosis (derived from the date of birth and the date of diagnosis), important confounders can be accounted for and controlled. From the TMA Table, important information such as the tumor type and the protein expression can be ascertained.
As the two tables are linked via the medical record number, patient information from individuals whose tumors have been studied for Paxillin expression is included in the output. The results can be filtered so that only patients with non-small cell lung cancer are displayed. The results can be further refined based on the needs of the researcher.
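The join and filter described above can be sketched outside Access as follows. This is a minimal illustration under stated assumptions, not the project's actual schema: it uses Python's built-in sqlite3 module as a stand-in for Microsoft Access, and the table layouts, column names (mrn, histology, expression_score), histology labels, and sample rows are all hypothetical. Age at diagnosis is approximated from the date of birth and the date of diagnosis.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical Patients and TMA tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Patients (
            mrn               TEXT PRIMARY KEY,  -- medical record number (link key)
            date_of_birth     TEXT,
            histology         TEXT,
            stage             TEXT,
            date_of_diagnosis TEXT,
            vital_status      TEXT
        );
        CREATE TABLE TMA (
            tma_id           INTEGER PRIMARY KEY,
            mrn              TEXT REFERENCES Patients(mrn),
            protein          TEXT,
            expression_score INTEGER
        );
        """
    )
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO Patients VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("MRN001", "1950-03-01", "NSCLC", "IIIA", "2008-06-15", "Deceased"),
            ("MRN002", "1962-11-20", "SCLC", "Limited", "2009-01-10", "Alive"),
            ("MRN003", "1945-07-04", "NSCLC", "IB", "2007-09-30", "Alive"),
            ("MRN004", "1958-02-14", "NSCLC", "IV", "2009-05-02", "Alive"),
        ],
    )
    conn.executemany(
        "INSERT INTO TMA (mrn, protein, expression_score) VALUES (?, ?, ?)",
        [
            ("MRN001", "Paxillin", 3),
            ("MRN002", "Paxillin", 2),
            ("MRN003", "Paxillin", 1),
        ],
    )
    conn.commit()
    return conn


def paxillin_nsclc_rows(conn):
    """Join Patients to TMA on the medical record number and keep NSCLC cases."""
    query = """
        SELECT p.mrn,
               CAST((julianday(p.date_of_diagnosis) - julianday(p.date_of_birth))
                    / 365.25 AS INTEGER) AS age_at_diagnosis,
               p.stage,
               p.vital_status,
               t.expression_score
        FROM Patients AS p
        INNER JOIN TMA AS t ON t.mrn = p.mrn
        WHERE t.protein = ? AND p.histology = ?
        ORDER BY p.mrn
    """
    return conn.execute(query, ("Paxillin", "NSCLC")).fetchall()


if __name__ == "__main__":
    for row in paxillin_nsclc_rows(build_demo_db()):
        print(row)
```

Because the query uses an inner join, patients without Paxillin TMA data (the hypothetical MRN004) are omitted, which mirrors the statement that only individuals whose tumors have been studied for Paxillin appear in the output.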
These results can be exported for primary data analysis by the biostatistician and the results are then shared with the researcher.
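The export step can be approximated in a few lines when the query results are available as rows. The sketch below is only an assumption about how a flat file for the biostatistician might be produced: it uses Python's standard csv module rather than the Access export function, and the function name export_rows, the column names, and the sample rows are hypothetical.

```python
import csv
import io


def export_rows(header, rows, handle):
    """Write a header line and the query rows to an open text handle as CSV."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return len(rows)  # number of data rows written


if __name__ == "__main__":
    # Invented example rows standing in for a query result.
    header = ["record", "stage", "vital_status", "expression_score"]
    rows = [("R1", "IIIA", "Deceased", 3), ("R2", "IB", "Alive", 1)]
    buffer = io.StringIO()
    export_rows(header, rows, buffer)
    print(buffer.getvalue())
```

Passing an open file handle instead of the in-memory buffer writes the same content to disk for loading into a statistics package.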
Project home page: Access Database template and Standard Operating Procedure are available at:
http://www.ibridgenetwork.org/uctech/salgia-thoracic-oncology-access-template
License: Freely available for academic and non-profit use.
Restrictions to use by non-academics: Commercial users require a license. For questions regarding commercial uses, please contact The University of Chicago’s Office of Technology and Intellectual Property (UChicagoTech) at (773) 702-1692 or www.tech.uchicago.edu
Figure 1. A screenshot of the Access Database depicting a section of the Patients Table.
Figure 2. Schematic depicting a tissue microarray (TMA)2
Figure 3. Screenshot depicting relationships established among tables within the Access database. Tables are linked via primary keys.
Figure 4. Sample query for Paxillin mutation, TMA results, and clinical variables.
The authors have nothing to disclose.
This work was supported by NIH grants 5R01CA100750-07, 5R01CA125541-04, 3R01CA125541-03S1, 5R01CA129501-03, and 3R01CA129501-02S1 to RS.
Material Name | Company | Catalogue Number |
Centrifuge | Eppendorf | |
Conical centrifuge tube | Falcon | 518-PG |
Minimum essential medium eagle (MEM) | Sigma | M4655-500ML |
Fetal Calf Serum | Cellgro FBS HI | MTT35011CV |
Dimethyl Sulfoxide (DMSO) | American Bioanalytical | AB03091 |
BD Vacutainer Serum Tubes | Fisher Scientific | 367815 |