TriNetX

TriNetX is a web-based tool for research population cohort and feasibility queries that also enables researchers to collaborate with peers at other member institutions.

Training: Training is requested by completing an Institute Service Request Form
Password resets: Request help through the Research Population Portal
Additional information: Email ctsi@pennstatehealth.psu.edu

Why use TriNetX?

Through TriNetX, users search for patients meeting specified criteria in a de-identified database, without prior Institutional Review Board (IRB) approval. Data are presented as unique patient counts, and a patient is counted only once. Data in TriNetX also exclude patients with only a medical record number or without diagnoses or codes. Such a search can help researchers determine whether enough potential patients are available to properly conduct a research study. With IRB approval and an enterprise information management request, patient-level data can be requested.
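As a minimal sketch of what "unique patient counts" means, the example below counts each patient once no matter how many matching records they have, and skips patients with no coded diagnosis. The record layout, the function name `count_unique_patients` and the sample codes are assumptions for illustration only and do not reflect the TriNetX data model.

```python
def count_unique_patients(records, wanted_codes):
    """Return the number of distinct patients with at least one wanted code."""
    matching = {
        patient_id
        for patient_id, code in records
        # Rows without a code never match, so patients lacking diagnoses are excluded.
        if code is not None and code in wanted_codes
    }
    return len(matching)


# Hypothetical de-identified rows: (patient identifier, diagnosis code)
records = [
    ("P001", "E11.9"),
    ("P001", "E11.9"),   # repeat visit, same patient: still counted once
    ("P002", "I10"),
    ("P003", None),      # no diagnosis recorded
    ("P004", "E11.65"),
]

print(count_unique_patients(records, {"E11.9", "E11.65"}))
```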

TriNetX also offers chart and graph options for data visualization and includes a rate-of-arrival algorithm. This algorithm determines how many patients matching certain criteria visited Penn State Health within the past three years, and then predicts how many potential visits will happen each quarter over the next year. A Trial Connect feature allows clinical research organizations and industry sponsors to determine and connect with potential study sites.
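As a rough illustration of the rate-of-arrival idea, the sketch below averages historical arrivals per quarter over a three-year lookback and projects that average over the next four quarters. The averaging method, the function name `project_quarterly_arrivals` and the input format are assumptions for illustration; the actual TriNetX algorithm is not described here and may differ.

```python
from collections import Counter
from datetime import date


def quarter_index(d: date) -> int:
    """Map a date to a running quarter number so quarters can be compared."""
    return d.year * 4 + (d.month - 1) // 3


def project_quarterly_arrivals(first_visits, as_of, lookback_quarters=12,
                               horizon_quarters=4):
    """Project arrivals per quarter from a simple historical average.

    first_visits maps a patient identifier to the date that patient first
    matched the criteria (a hypothetical input, for illustration only).
    """
    current = quarter_index(as_of)
    counts = Counter()
    for visit_date in first_visits.values():
        q = quarter_index(visit_date)
        # Keep only arrivals in the completed quarters of the lookback window.
        if current - lookback_quarters <= q < current:
            counts[q] += 1
    average = sum(counts.values()) / lookback_quarters
    return [average] * horizon_quarters


first_visits = {
    "P001": date(2023, 1, 15),
    "P002": date(2023, 5, 1),
    "P003": date(2021, 2, 1),
    "P004": date(2019, 1, 1),   # outside the three-year window
}
print(project_quarterly_arrivals(first_visits, as_of=date(2024, 1, 10)))
```

A flat average ignores seasonality and trends, so it is only a starting point for thinking about recruitment feasibility.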

TriNetX is currently being used by nearly 30 of Penn State’s peers in the Clinical and Translational Science Award Program. It combines data with a global health research network, enabling health care organizations, pharmaceutical companies and contract research organizations to collaborate, enhance trial design, accelerate recruitment and bring new therapies to market faster.

Working with TriNetX

Request access, training and support for TriNetX and other research population tools

Learn More

Code Lookups

TriNetX uses International Classification of Diseases (ICD) codes, Logical Observation Identifiers Names and Codes (LOINC), and Current Procedural Terminology (CPT) codes in its searches.

The following links provide lookup tables for these codes:

ICD-9 Diagnosis and Inpatient Procedures (for data prior to October 2015)
ICD-10 Diagnosis and Inpatient Procedures (for data after October 2015)
LOINC
CPT codes (for outpatient procedures)

Data Refresh Schedule

The tentative TriNetX data refresh schedule for the 2024 calendar year is as follows:

Stage Refresh: 1st Saturday of each Month (6 a.m.)

Production Refresh: 2nd Saturday of each Month (Noon)

Standard Data Set

Change Log

Support

For general technical help, or to report an issue with Clinical and Translational Science Institute applications including TriNetX, call 833-577-4357 or email help@pennstatehealth.psu.edu.

For account help, go to the Research Population Portal.

For help with queries, please complete a service request form.

TriNetX Research Network

Penn State researchers have access to additional data outside the Penn State Health electronic medical record through TriNetX Research Network.

TriNetX Research datasets provide researchers access to de-identified patient data from a network of health care organizations.

What kinds of data come in a TriNetX dataset?

TriNetX datasets include clinical patient data such as demographics, diagnoses, procedures, labs and medications – commonly referred to as real-world data.

The data in TriNetX datasets are:

Primarily from health care organizations' electronic medical record (EMR) systems
Collected for the primary purpose of providing care to patients

The data in TriNetX datasets are not:

Claims data, data primarily collected for billing
Data collected for randomized clinical trials

Where do the data in a TriNetX dataset originate?

Data in TriNetX datasets come from health care organizations and other data providers. The data these entities provide primarily come from:

EMR systems
- Structured data
- Unstructured data processed by Natural Language Processing technology
Cancer registries
Other sources like genomic data from third-party genomic testing labs

What are the characteristics of the health care organizations that provide TriNetX with data?

The majority of the health care organizations are large academic medical institutions with both inpatient and outpatient facilities. Most of these are adult acute-care hospitals with multiple facilities and locations. All are currently located within the United States and provide TriNetX with both inpatient and outpatient data. The data they provide represent the entire patient population at the health care organization. Most provide an average of seven years of historical data.

How are data transformed from the source?

TriNetX typically receives data from health care organizations and other data providers in one of two ways:

TriNetX receives data directly from a health care organization research repository into the TriNetX environment.
A health care organization or data provider sends TriNetX data extracts in the form of CSV files formatted according to the TriNetX Data Dictionary.

TriNetX maps the data to a standard and controlled set of clinical terminologies and transforms them into a proprietary data model. The transformation process includes an extensive data quality assessment, with data cleaning that rejects records that do not meet TriNetX quality standards.
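The mapping and cleaning steps can be pictured with a small sketch: read a CSV extract, map each local code to a standard terminology and reject rows that fail basic checks. The column names, the `LOCAL_TO_STANDARD` table and the rejection rules are hypothetical; the actual TriNetX Data Dictionary, data model and quality standards are not reproduced here.

```python
import csv
import io

# Hypothetical mapping from a site's local lab codes to LOINC codes.
LOCAL_TO_STANDARD = {
    "GLU": "2345-7",    # glucose, serum or plasma
    "HBA1C": "4548-4",  # hemoglobin A1c
}


def transform(extract_text):
    """Split rows of a CSV extract into accepted (mapped) and rejected rows."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(extract_text)):
        patient_id = row["patient_id"].strip()
        standard = LOCAL_TO_STANDARD.get(row["local_code"].strip().upper())
        # Assumed quality rules: a row needs a patient identifier and a mappable code.
        if not patient_id or standard is None:
            rejected.append(row)
            continue
        accepted.append(
            {"patient_id": patient_id, "loinc": standard, "value": row["value"]}
        )
    return accepted, rejected


extract = (
    "patient_id,local_code,value\n"
    "P001,GLU,98\n"
    ",HBA1C,6.1\n"
    "P003,XYZ,12\n"
)
accepted, rejected = transform(extract)
print(len(accepted), "accepted;", len(rejected), "rejected")
```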

How current are the data?

One of the distinguishing characteristics of the TriNetX dataset is that it is continuously refreshed. Health care organizations and other data providers update their data at various times, with over 80 percent refreshing at one-, two- or four-week intervals. The average lag time for a health care organization’s source data refresh is one month.

TriNetX Diamond Network

TriNetX Diamond Network includes third-party longitudinal data from ambulatory and primary care electronic medical records, medical claims from claims clearinghouses and patient medication data from pharmacy claims.

What are the key components of Diamond Network?

The TriNetX Diamond Network data set represents 92 sites, 212 million patients and 99 percent of U.S. health plans.

Key components include:

Data Description

Claims (submit/remit, rejections/denials and reversals, copay, coinsurance, rebates)
Inpatient/outpatient care settings
Commercial and CMS
De-identified (no obfuscation)

Types of Data

Patient demographics
Patient location
Diagnoses
Medications
Procedures
Lab results
Vital signs

What are the benefits of Diamond Network?

Having multiple data sets to conduct analyses and gain further validation and confidence in the results
Providing extensive data for real-world evidence research
Representing a broad cross-section of the U.S. population, because the Diamond Network combines multiple claims sources and so avoids the inherent biases of claims data sets that cover smaller populations
Having more patient data depth than other commercially available claims data sets

What are uses for Diamond Network data?

Data from the Diamond Network can be used for multiple research projects, including:

Designing protocols and feasibility analysis
Conducting comparative effectiveness analysis
Profiling target populations
Understanding real-world drug performance
Studying treatment pathways
Validating findings across data sets

Regional Collaborative Networks

What are regional collaborative networks?

TriNetX supports five collaborative networks specifically for health care organizations (HCOs) to anonymously pool their de-identified data. In doing so, each HCO gains research access to a broader patient population, making research into rarer diseases and treatment regimens much more feasible.

HCOs participating in the TriNetX regional collaborative networks contribute their de-identified patient data to two networks:

The regional collaborative network corresponding to the HCO’s location (United States; Europe, Middle East and Africa; Latin America; or Asia-Pacific)
The Global Collaborative Network, which gathers together the data from all participating HCOs

HCOs that are already members of the Research Network, which supports data downloads, are added to the regional collaborative network corresponding to their location as well as to the Global Collaborative Network.

TriNetX users from participating HCOs may:

Run queries and build cohorts on any regional collaborative network, including the Global Collaborative Network
Run basic and advanced analytics on those cohorts

HCOs on regional collaborative networks are anonymous. The platform will not display the number of patients contributed by a particular HCO (either by name or pseudonym) to any cohort. Dataset downloads are not permitted on regional collaborative networks. The Research Network does support data downloads while maintaining HCO anonymity.

Current Regional Collaborative Networks

Global

Number of HCOs: 74
Countries (number of HCOs): Australia (1), Brazil (5), Bulgaria (1), Germany (1), India (1), Italy (1), Lithuania (1), Malaysia (1), Poland (2), Singapore (1), Spain (3), Taiwan (1), United Kingdom (8), USA (47)
Number of Patients: 88 million

United States

Number of HCOs: 47
Countries (number of HCOs): USA (47)
Number of Patients: 73 million

Europe, Middle East and Africa

Number of HCOs: 17
Countries (number of HCOs): Bulgaria (1), Germany (1), Italy (1), Lithuania (1), Poland (2), Spain (3), United Kingdom (8)
Number of Patients: 10.8 million

Latin America

Number of HCOs: 5
Countries (number of HCOs): Brazil (5)
Number of Patients: 2.3 million

Asia-Pacific

Number of HCOs: 5
Countries (number of HCOs): Australia (1), India (1), Malaysia (1), Singapore (1), Taiwan (1)
Number of Patients: 3.2 million