

TriNetX is a web-based tool for research population cohort and feasibility queries that also enables researchers to collaborate with peers at other member institutions.

Why use TriNetX?

Through TriNetX, users search for patients meeting specified criteria in a de-identified database, without prior Institutional Review Board (IRB) approval. Data are presented as unique patient counts, and a patient is counted only once. Data in TriNetX also exclude patients with only a medical record number or without diagnoses or codes. Such a search can help researchers determine whether enough potential patients are available to properly conduct a research study. With IRB approval and an enterprise information management request, patient-level data can be requested.
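The "counted only once" rule can be illustrated with a minimal sketch. The records, patient IDs and function name below are hypothetical and are not taken from TriNetX; the sketch only shows how a unique patient count differs from a count of visits.

```python
# Hypothetical encounter records: (patient_id, ICD-10 diagnosis code).
# These are made-up illustrations, not TriNetX data or a TriNetX API.
encounters = [
    ("P001", "E11.9"),
    ("P001", "E11.9"),  # same patient, second visit with the same code
    ("P002", "I10"),
    ("P003", "E11.9"),
    ("P003", "I10"),
]


def unique_patient_count(records, code):
    """Count distinct patients with at least one record carrying the code."""
    # A set keeps each patient ID once, however many matching visits exist.
    return len({patient_id for patient_id, dx in records if dx == code})


count = unique_patient_count(encounters, "E11.9")
print(count)
```

Three records carry the code E11.9, but they belong to two patients, so the count reported is two.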

TriNetX also offers chart and graph options for data visualization and includes a rate-of-arrival algorithm. This algorithm determines how many patients matching certain criteria visited Penn State Health within the past three years, and then predicts how many potential visits will happen each quarter over the next year. A Trial Connect feature allows clinical research organizations and industry sponsors to determine and connect with potential study sites.

TriNetX is currently being used by nearly 30 of Penn State’s peers in the Clinical and Translational Science Award Program. It combines data with a global health research network, enabling health care organizations, pharmaceutical companies and contract research organizations to collaborate, enhance trial design, accelerate recruitment and bring new therapies to market faster.

Working with TriNetX



Learn More

Code Lookups

TriNetX uses International Classification of Diseases (ICD) codes, Logical Observation Identifiers Names and Codes (LOINC), and Current Procedural Terminology (CPT) codes in its searches.

The following links provide lookup tables for these codes:

Data Refresh Schedule

The tentative TriNetX data refresh schedule for the 2024 calendar year is as follows:

Stage Refresh: 1st Saturday of each Month (6 a.m.)

Production Refresh: 2nd Saturday of each Month (Noon)
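The schedule above can be turned into concrete calendar dates. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the schedule itself is tentative, so the computed dates should be treated as estimates.

```python
import calendar
from datetime import date, datetime, time


def nth_saturday(year, month, n):
    """Return the date of the n-th Saturday (n starts at 1) of a month."""
    # Weeks start on Monday here, so index calendar.SATURDAY is Saturday.
    # Days that fall outside the month are reported as 0 and skipped.
    weeks = calendar.Calendar(firstweekday=calendar.MONDAY).monthdayscalendar(year, month)
    saturdays = [week[calendar.SATURDAY] for week in weeks if week[calendar.SATURDAY] != 0]
    return date(year, month, saturdays[n - 1])


def refresh_schedule(year):
    """Return one (stage, production) pair of datetimes per month."""
    schedule = []
    for month in range(1, 13):
        # Stage: 1st Saturday at 6 a.m.; production: 2nd Saturday at noon.
        stage = datetime.combine(nth_saturday(year, month, 1), time(6, 0))
        production = datetime.combine(nth_saturday(year, month, 2), time(12, 0))
        schedule.append((stage, production))
    return schedule


schedule_2024 = refresh_schedule(2024)
for stage, production in schedule_2024:
    print(stage.strftime("%Y-%m-%d %H:%M"), "|", production.strftime("%Y-%m-%d %H:%M"))
```

The times are naive local times; no time zone is stated in the schedule, so none is assumed here.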

Standard Data Set

Change Log

Support

For general technical help, or to report an issue with Clinical and Translational Science Institute applications including TriNetX, call 833-577-4357 or email

For account help, go to Research Population Portal.

For help with queries, please complete a service request form.

TriNetX Research Network

Penn State researchers have access to additional data outside the Penn State Health electronic medical record through TriNetX Research Network.

TriNetX Research datasets provide researchers access to de-identified patient data from a network of health care organizations.

What kinds of data come in a TriNetX dataset?

TriNetX datasets include clinical patient data such as demographics, diagnoses, procedures, labs and medications – commonly referred to as real-world data.

The data in TriNetX datasets are:

  • Primarily from health care organizations’ electronic medical record (EMR) systems
  • Collected for the primary purpose of providing care to patients

The data in TriNetX datasets are not:

  • Claims data (data collected primarily for billing)
  • Data collected for randomized clinical trials

Where do the data in a TriNetX dataset originate?

Data in TriNetX datasets come from health care organizations and other data providers. The data these entities provide come primarily from:

  • EMR systems
    • Structured data
    • Unstructured data processed by Natural Language Processing technology
  • Cancer registries
  • Other sources like genomic data from third-party genomic testing labs

What are the characteristics of the health care organizations that provide TriNetX with data?

Most of the health care organizations are large academic medical institutions with both inpatient and outpatient facilities. Most of these are adult acute-care hospitals with multiple facilities and locations. All are currently located within the United States and provide TriNetX with both inpatient and outpatient data. The data they provide represent the entire patient population at the health care organization. Most provide an average of seven years of historical data.

How are data transformed from the source?

TriNetX typically receives data from health care organizations and other data providers in one of two ways:

  • TriNetX receives data directly from a health care organization’s research repository into the TriNetX environment.
  • A health care organization or data provider sends TriNetX data extracts in the form of CSV files that follow the TriNetX Data Dictionary.

TriNetX maps the data to a standard, controlled set of clinical terminologies and transforms them into a proprietary data model. This transformation includes an extensive data quality assessment, with data cleaning that rejects records that do not meet TriNetX quality standards.

How current are the data?

One of the distinguishing characteristics of the TriNetX dataset is that it is continuously refreshed. Health care organizations and other data providers update their data at various times, with over 80 percent refreshing at one-, two- or four-week intervals. The average lag time for a health care organization’s source data refresh is one month.

TriNetX Diamond Network

TriNetX Diamond Network includes third-party longitudinal data from ambulatory and primary care electronic medical records, medical claims from claims clearinghouses and patient medication data from pharmacy claims.

What are the key components of Diamond Network?

The TriNetX Diamond Network data set represents 92 sites, 212 million patients and 99 percent of U.S. health plans.

Key components include:

Data Description

  • Claims (submit/remit, rejections/denials and reversals, copay, coinsurance, rebates)
  • Inpatient/outpatient care settings
  • Commercial and CMS
  • De-identified (no obfuscation)

Types of Data

  • Patient demographics
  • Patient location
  • Diagnoses
  • Medications
  • Procedures
  • Lab results
  • Vital signs

What are the benefits of Diamond Network?

  • Having multiple data sets to conduct analyses and gain further validation and confidence in the results
  • Providing extensive data for real-world evidence research
  • Representing a broad cross-section of the U.S. population by combining multiple claims sources, which avoids the inherent biases of claims data sets that represent smaller covered populations
  • Having more patient data depth than other commercially available claims data sets

What are uses for Diamond Network data?

Data from the Diamond Network can be used for multiple research projects, including:

  • Designing protocols and feasibility analysis
  • Conducting comparative effectiveness analysis
  • Profiling target populations
  • Understanding real-world drug performance
  • Studying treatment pathways
  • Validating findings across data sets

Regional Collaborative Networks

What are regional collaborative networks?

TriNetX supports five collaborative networks in which health care organizations (HCOs) anonymously pool their de-identified data. In doing so, each HCO gains research access to a broader patient population, which makes research into rarer diseases and treatment regimens much more feasible.

HCOs participating in the TriNetX regional collaborative networks contribute their de-identified patient data to two networks:

  • The regional collaborative network corresponding to the HCO’s location (United States, Europe, Middle East and Africa, Latin America or Asia Pacific)
  • The Global Collaborative Network, which pools the data from all participating HCOs

HCOs that are already members of the Research Network, which supports data downloads, are added to the regional collaborative network corresponding to their location as well as to the Global Collaborative Network.

TriNetX users from participating HCOs may:

  • Run queries and build cohorts on any regional collaborative network, including the Global Collaborative Network
  • Run basic and advanced analytics on those cohorts

HCOs on regional collaborative networks are anonymous. The platform will not display the number of patients contributed by a particular HCO (either by name or pseudonym) to any cohort. Dataset downloads are not permitted on regional collaborative networks. The Research Network does support data downloads while maintaining HCO anonymity.

Current Regional Collaborative Networks


Global

Number of HCOs: 74
Countries (number of HCOs): Australia (1), Brazil (5), Bulgaria (1), Germany (1), India (1), Italy (1), Lithuania (1), Malaysia (1), Poland (2), Singapore (1), Spain (3), Taiwan (1), United Kingdom (8), USA (47)
Number of Patients: 88 million

United States

Number of HCOs: 47
Countries (number of HCOs): USA (47)
Number of Patients: 73 million

Europe, Middle East and Africa

Number of HCOs: 17
Countries (number of HCOs): Bulgaria (1), Germany (1), Italy (1), Lithuania (1), Poland (2), Spain (3), United Kingdom (8)
Number of Patients: 10.8 million

Latin America

Number of HCOs: 5
Countries (number of HCOs): Brazil (5)
Number of Patients: 2.3 million


Asia Pacific

Number of HCOs: 5
Countries (number of HCOs): Australia (1), India (1), Malaysia (1), Singapore (1), Taiwan (1)
Number of Patients: 3.2 million
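
The HCO counts listed above can be cross-checked with a few lines of arithmetic. The sketch below copies the figures from this page into illustrative variables (the names are assumptions, not part of any TriNetX tool) and confirms that the four regional networks and the per-country breakdown both add up to the 74 HCOs reported for the network that spans all countries.

```python
# HCO counts for the four regional networks, copied from the figures above.
regional_hcos = {
    "United States": 47,
    "Europe, Middle East and Africa": 17,
    "Latin America": 5,
    "Asia Pacific": 5,
}

# Total and per-country HCO counts reported for the all-country network.
global_hcos = 74
global_by_country = {
    "Australia": 1, "Brazil": 5, "Bulgaria": 1, "Germany": 1, "India": 1,
    "Italy": 1, "Lithuania": 1, "Malaysia": 1, "Poland": 2, "Singapore": 1,
    "Spain": 3, "Taiwan": 1, "United Kingdom": 8, "USA": 47,
}

regional_total = sum(regional_hcos.values())
country_total = sum(global_by_country.values())

print("Sum of regional networks:", regional_total)
print("Sum of per-country counts:", country_total)
print("Both match reported total:", regional_total == country_total == global_hcos)
```

Patient counts are left out of this check because they are rounded on this page.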