Data Resources

Digital Collaboratory for Precision Health Research (DCPHR)

The Digital Collaboratory for Precision Health Research (DCPHR) combines the efforts of CTSI’s informatics team and the Center for Artificial Intelligence Foundations in Scientific Applications. Together with the Institute for Computational and Data Sciences, the Social Science Research Institute, and the Health Spoke of the National Science Foundation’s Northeast Big Data Hub, the DCPHR provides access to large data sets via several discovery tools and provides researchers with the necessary artificial intelligence (AI) and machine learning (ML) stacks to properly use those data sets.

Observational Medical Outcomes Partnership (OMOP)

The DCPHR currently provides access to deidentified Penn State Health Electronic Health Record (EHR) data standardized using the OMOP common data model (CDM), which was developed by the NIH-funded Observational Health Data Sciences Initiative (or OHDSI, pronounced "Odyssey") consortium. 

The OHDSI consortium links 3,266 collaborators across approximately 75 countries with OMOP-based EHR data repositories that collectively contain 928 million unique patient records (representing about 12% of the world’s population).
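To give a feel for what working with OMOP-standardized data can look like, the following is a minimal sketch that builds a tiny in-memory database and selects a cohort by condition. The table and column names (person, condition_occurrence, condition_concept_id) follow the OMOP CDM, but the rows are synthetic, the concept IDs are illustrative values that should be confirmed against the OMOP vocabulary, and the helper functions are hypothetical. It does not represent how DCPHR data are actually accessed.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small subset of OMOP CDM columns.

    All rows are synthetic; concept IDs are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            gender_concept_id INTEGER,
            year_of_birth INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, 8507, 1960), (2, 8532, 1975), (3, 8532, 1988)],
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (10, 1, 201826, "2019-03-01"),
            (11, 1, 201826, "2020-06-15"),  # same person, repeat record
            (12, 2, 316866, "2021-01-10"),
            (13, 3, 201826, "2022-09-30"),
        ],
    )
    return conn


def cohort_for_condition(conn, concept_id):
    """Return sorted, de-duplicated person_ids with the given condition concept."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.person_id
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        WHERE co.condition_concept_id = ?
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(cohort_for_condition(demo, 201826))
```

Because every OMOP site uses the same table layout and concept vocabulary, a query written this way is, in principle, portable across repositories.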

TriNetX

TriNetX allows Penn State researchers to access Penn State Health EHR data, de-identified data from the TriNetX Research network (from over 71 healthcare organizations), de-identified claims data from the Diamond Network (from 92 organizations), and de-identified claims data from the COVID-19 Research Network (from 78 additional organizations). Researchers can define study cohorts of interest by querying TriNetX networks based on medications, diagnoses, demographics, lab results, genomics, mortality, oncology, procedures, etc.

Evolve to Next-Gen Accrual to Clinical Trials (enACT) Network

The enACT Network, supported by a grant from the National Center for Advancing Translational Sciences, provides investigators with access to electronic health records from over 140 million patients across the CTSA consortium in minutes from a desktop computer.

“PaTH to Health” Data

The “PaTH to Health” data meet the national data standard and are harmonized with data from more than 70 PCORnet clinical sites. The repository is refreshed regularly (currently every other week), undergoes quarterly data quality checks, and has served as the data infrastructure for several successful multi-site grant applications. It splits all patient-level and encounter-level data into multiple tables linked by pseudo-identifiers in a HIPAA-compliant manner.
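The idea of splitting patient-level and encounter-level data into separate tables joined only by pseudo-identifiers can be sketched as follows. This is an illustration under stated assumptions, not the PaTH repository's actual method: the keyed-hash approach, the field names, the key, and both helper functions are hypothetical, and real de-identification involves considerably more than replacing one identifier.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would keep this in a protected key store.
SECRET_KEY = b"demo-key-not-for-production"


def pseudo_id(mrn: str) -> str:
    """Derive a stable pseudo-identifier from a record number with a keyed hash."""
    digest = hmac.new(SECRET_KEY, mrn.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def split_records(flat_records):
    """Split flat rows into a patient table and an encounter table.

    The two tables share only the pseudo-identifier; the original
    record number is not copied into either one.
    """
    patients = {}
    encounters = []
    for rec in flat_records:
        pid = pseudo_id(rec["mrn"])
        patients[pid] = {"patient_pid": pid, "birth_year": rec["birth_year"]}
        encounters.append(
            {
                "patient_pid": pid,
                "encounter_date": rec["encounter_date"],
                "diagnosis": rec["diagnosis"],
            }
        )
    return list(patients.values()), encounters


# Synthetic example rows (not real patient data).
FLAT = [
    {"mrn": "A001", "birth_year": 1960, "encounter_date": "2023-01-05", "diagnosis": "E11"},
    {"mrn": "A001", "birth_year": 1960, "encounter_date": "2023-04-12", "diagnosis": "I10"},
    {"mrn": "B002", "birth_year": 1982, "encounter_date": "2023-02-20", "diagnosis": "J45"},
]

if __name__ == "__main__":
    patient_table, encounter_table = split_records(FLAT)
    print(len(patient_table), "patients,", len(encounter_table), "encounters")
```

Encounters for the same patient can still be joined through patient_pid, while neither table carries the original record number.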

National COVID Cohort Collaborative (N3C)

Through the National COVID Cohort Collaborative (N3C), researchers studying COVID-19 are able to access an innovative analytics platform that contains clinical data from the electronic health records of people who were tested for the novel coronavirus or who have had related symptoms.

Part of the National Center for Advancing Translational Sciences (NCATS) National COVID Cohort Collaborative Data Enclave, the centralized and secure data platform features powerful analytics capabilities for online discovery, visualization, and collaboration.

The data are robust in scale and scope and are transformed into a harmonized data set to help scientists study COVID-19, including potential risk factors, protective factors, and long-term health consequences.

Data analysis within the enclave is supported by both R and Python, the most widely used open-source platforms for statistical analysis and data science. Researchers requesting access to, or working within, the enclave are encouraged to assemble collaborative teams with diverse expertise in such areas as clinical research, statistical analysis, and informatics to make the best use of the N3C Data Enclave.

Note: The Penn State Clinical and Translational Science Institute has already secured a data use agreement for the University, so Penn State researchers can skip that step of enrollment.

Contacts

Avnish Katoch

Project Manager, Informatics
axk54@psu.edu

Tyler Deal, MPH

Project Coordinator, Informatics
tjd5747@psu.edu

Vasant Honavar, PhD

Co-Lead, Informatics
University Park, PA
vuh14@psu.edu

Wenke Hwang, PhD, MA

Co-Lead, Informatics
Hershey, PA
wwh12@psu.edu