TriNetX is a web-based tool for research population cohort and feasibility queries that also enables researchers to collaborate with peers at other member institutions.
- Training: Request training by completing an Institute Service Request Form
- Password resets: Request help through the Research Population Portal
- Additional information: Email ctsi@pennstatehealth.psu.edu
Why use TriNetX?
Through TriNetX, users search for patients meeting specified criteria in a de-identified database, without prior Institutional Review Board (IRB) approval. Data are presented as unique patient counts, and a patient is counted only once. Data in TriNetX also exclude patients with only a medical record number or without diagnoses or codes. Such a search can help researchers determine whether enough potential patients are available to properly conduct a research study. With IRB approval and an enterprise information management request, patient-level data can be requested.
TriNetX also offers chart and graph options for data visualization and includes a rate-of-arrival algorithm. This algorithm determines how many patients matching certain criteria visited Penn State Health within the past three years, and then predicts how many potential visits will happen each quarter over the next year. A Trial Connect feature allows clinical research organizations and industry sponsors to determine and connect with potential study sites.
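As a rough illustration of the rate-of-arrival idea, the sketch below averages three years of quarterly counts of matching patients and uses that average as the projection for each of the next four quarters. This is an assumption made for illustration only; the actual TriNetX algorithm is not described here and may work differently. The function name and sample counts are hypothetical.

```python
def project_quarterly_arrivals(past_quarterly_counts, quarters_ahead=4):
    """Project future arrivals as the flat average of past quarterly counts.

    Assumption: a simple mean, not the real TriNetX rate-of-arrival method.
    """
    if not past_quarterly_counts:
        raise ValueError("need at least one past quarter")
    rate = sum(past_quarterly_counts) / len(past_quarterly_counts)
    # Repeat the same average for every projected quarter.
    return [round(rate, 1)] * quarters_ahead


# Hypothetical counts of matching patients for 12 quarters (three years).
history = [40, 42, 38, 45, 41, 39, 44, 43, 40, 42, 46, 44]
projection = project_quarterly_arrivals(history)
print(projection)
```

A flat average ignores seasonality and trends, so it is only a starting point for judging whether a study is feasible.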
TriNetX is currently being used by nearly 30 of Penn State’s peers in the Clinical and Translational Science Award Program. It combines data with a global health research network, enabling health care organizations, pharmaceutical companies and contract research organizations to collaborate, enhance trial design, accelerate recruitment and bring new therapies to market faster.
Working with TriNetX
TriNetX uses International Classification of Diseases (ICD) codes, Logical Observation Identifiers Names and Codes (LOINC), and Current Procedural Terminology (CPT) codes in its searches.
The following links provide lookup tables for these codes:
- ICD-9 Diagnosis and Inpatient Procedures (for data before October 2015)
- ICD-10 Diagnosis and Inpatient Procedures (for data from October 2015 onward)
- LOINC
- CPT codes (for outpatient procedures)
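Because the applicable ICD version depends on the date of the record, it can help to decide which lookup table to consult from the date alone. The following is a minimal sketch, assuming the U.S. ICD-10 transition date of Oct. 1, 2015; the function name is hypothetical and is not part of TriNetX.

```python
from datetime import date

# Assumption: U.S. transition from ICD-9 to ICD-10 on Oct. 1, 2015.
ICD10_START = date(2015, 10, 1)


def icd_version_for(service_date):
    """Return which ICD lookup table applies to a record's date."""
    return "ICD-10" if service_date >= ICD10_START else "ICD-9"


print(icd_version_for(date(2014, 3, 15)))
print(icd_version_for(date(2020, 1, 1)))
```

A cohort that spans October 2015 generally needs the diagnosis expressed in both code sets.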
The tentative TriNetX data refresh schedule for the 2024 calendar year is as follows:
Stage Refresh: 1st Saturday of each Month (6 a.m.)
Production Refresh: 2nd Saturday of each Month (Noon)
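To turn the schedule above into calendar dates, the sketch below computes the first and second Saturday of each month in 2024 using only the Python standard library. The helper name is illustrative, and because the schedule is tentative, the resulting dates should be treated as estimates.

```python
from datetime import date, timedelta


def nth_saturday(year, month, n):
    """Return the date of the n-th Saturday (n = 1, 2, ...) of a month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Saturday is 5.
    offset = (5 - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


# (stage refresh, production refresh) for each month of 2024.
schedule_2024 = [
    (nth_saturday(2024, month, 1), nth_saturday(2024, month, 2))
    for month in range(1, 13)
]

for stage, production in schedule_2024:
    print(f"Stage: {stage} (6 a.m.)  Production: {production} (noon)")
```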
For general technical help, or to report an issue with Clinical and Translational Science Institute applications including TriNetX, call 833-577-4357 or email help@pennstatehealth.psu.edu.
For account help, go to the Research Population Portal.
For help with queries, please complete a service request form.
TriNetX Research Network
Penn State researchers have access to additional data outside the Penn State Health electronic medical record through the TriNetX Research Network.
TriNetX Research datasets provide researchers access to de-identified patient data from a network of health care organizations.
TriNetX datasets include clinical patient data such as demographics, diagnoses, procedures, labs and medications – commonly referred to as real-world data.
The data in TriNetX datasets are:
- Primarily from health care organizations’ electronic medical record (EMR) systems
- Collected for the primary purpose of providing care to patients
The data in TriNetX datasets are not:
- Claims data, data primarily collected for billing
- Data collected for randomized clinical trials
Data in TriNetX datasets come from health care organizations and other data providers. The data these entities provide primarily come from:
- EMR systems
- Structured data
- Unstructured data processed by Natural Language Processing technology
- Cancer registries
- Other sources like genomic data from third-party genomic testing labs
The majority of the health care organizations are large academic medical institutions with both inpatient and outpatient facilities. Most of these are adult acute-care hospitals with multiple facilities and locations. All are currently located within the United States and provide TriNetX with both inpatient and outpatient data. The data they provide represent the entire patient population at the health care organization. Most provide an average of seven years of historical data.
TriNetX typically receives data from health care organizations and other data providers in one of two ways:
- TriNetX receives data directly from a health care organization research repository into the TriNetX environment.
- A health care organization or data provider sends TriNetX data extracts in the form of CSV files that follow the TriNetX Data Dictionary.
TriNetX maps the data to a standard and controlled set of clinical terminologies and transforms them into a proprietary data model. This transformation process includes an extensive data quality assessment, in which data cleaning rejects records that don’t meet the TriNetX quality standards.
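The idea of rejecting records during data cleaning can be sketched as follows. This is only an illustration under stated assumptions: the column names and the single rule (every required field must be non-empty) are hypothetical and do not reflect the actual TriNetX Data Dictionary or quality standards.

```python
import csv
import io

# Hypothetical required columns; the real TriNetX Data Dictionary may differ.
REQUIRED = ("patient_id", "code_system", "code")


def split_valid_rejected(csv_text):
    """Separate rows with all required fields filled from rows missing any."""
    valid, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if all((row.get(col) or "").strip() for col in REQUIRED):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected


# Made-up extract: the second record has no code and is rejected.
sample = "patient_id,code_system,code\np1,ICD-10,E11.9\np2,ICD-10,\n"
valid, rejected = split_valid_rejected(sample)
print(len(valid), "accepted;", len(rejected), "rejected")
```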
One of the distinguishing characteristics of the TriNetX dataset is that it is continuously refreshed. Health care organizations and other data providers update their data at various times, with over 80 percent refreshing at one-, two- or four-week intervals. The average lag time for a health care organization’s source data refresh is one month.
TriNetX Diamond Network
TriNetX Diamond Network includes third-party longitudinal data from ambulatory and primary care electronic medical records, medical claims from claims clearinghouses and patient medication data from pharmacy claims.
The TriNetX Diamond Network data set represents 92 sites, 212 million patients and 99 percent of U.S. health plans.
Key components include:
Data Description
- Claims (submit/remit, rejections/denials and reversals, copay, coinsurance, rebates)
- Inpatient/outpatient care settings
- Commercial and CMS
- De-identified (no obfuscation)
Types of Data
- Patient demographics
- Patient location
- Diagnoses
- Medications
- Procedures
- Lab results
- Vital signs
Benefits
- Multiple data sets for conducting analyses and gaining further validation and confidence in the results
- Extensive data for real-world evidence research
- A broad cross-section of the U.S. population: because the Diamond Network combines multiple claims sources, it doesn’t suffer from the inherent biases of other claims datasets that represent smaller covered populations
- More patient data depth than other commercially available claims data sets
Data from the Diamond Network can be used for multiple research projects, including:
- Designing protocols and feasibility analysis
- Conducting comparative effectiveness analysis
- Profiling target populations
- Understanding real-world drug performance
- Studying treatment pathways
- Validating findings across data sets
Regional Collaborative Networks
TriNetX supports five collaborative networks specifically for health care organizations (HCOs) to anonymously pool their de-identified data. In doing so, each HCO gains research access to a broader patient population, making research into rarer diseases and treatment regimens much more feasible.
HCOs participating in the TriNetX regional collaborative networks contribute their de-identified patient data to two networks:
- The regional collaborative network corresponding to the HCO’s location (United States; Europe, Middle East and Africa; Latin America; or Asia-Pacific)
- The Global Collaborative Network, which gathers the data from all participating HCOs
HCOs that are already members of the Research Network, which supports data downloads, are added to the regional collaborative network corresponding to their location as well as to the Global Collaborative Network.
TriNetX users from participating HCOs may:
- Run queries and build cohorts on any regional collaborative network, including the Global Collaborative Network
- Run basic and advanced analytics on those cohorts
HCOs on regional collaborative networks are anonymous. The platform will not display the number of patients contributed by a particular HCO (either by name or pseudonym) to any cohort. Dataset downloads are not permitted on regional collaborative networks. The Research Network does support data downloads while maintaining HCO anonymity.
Global
Number of HCOs: 74
Countries (number of HCOs): Australia (1), Brazil (5), Bulgaria (1), Germany (1), India (1), Italy (1), Lithuania (1), Malaysia (1), Poland (2), Singapore (1), Spain (3), Taiwan (1), United Kingdom (8), USA (47)
Number of Patients: 88 million
United States
Number of HCOs: 47
Countries (number of HCOs): USA (47)
Number of Patients: 73 million
Europe, Middle East and Africa
Number of HCOs: 17
Countries (number of HCOs): Bulgaria (1), Germany (1), Italy (1), Lithuania (1), Poland (2), Spain (3), United Kingdom (8)
Number of Patients: 10.8 million
Latin America
Number of HCOs: 5
Countries (number of HCOs): Brazil (5)
Number of Patients: 2.3 million
Asia-Pacific
Number of HCOs: 5
Countries (number of HCOs): Australia (1), India (1), Malaysia (1), Singapore (1), Taiwan (1)
Number of Patients: 3.2 million