TriNetX is a web-based tool for research population cohort and feasibility queries that also enables researchers to collaborate with peers at other member institutions.
- Training: Request training by completing an Institute Service Request Form
- Password Resets: Request help through the Research Population Portal
- Additional information: Email us
Why use TriNetX?
Through TriNetX, users search for patients meeting specified criteria in a de-identified database, without prior Institutional Review Board (IRB) approval. Data are presented as unique patient counts, and a patient is counted only once. Data in TriNetX also exclude patients with only a medical record number or without diagnoses or codes. Such a search can help researchers determine whether enough potential patients are available to properly conduct a research study. With IRB approval and an enterprise information management request, patient-level data can be requested.
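The counting rule above (each patient counted once, and patients without diagnoses or codes left out) can be illustrated with a minimal sketch. The `PatientRecord` type, its field names and the sample codes are hypothetical and do not reflect how TriNetX stores or queries data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical de-identified record; field names are illustrative only."""
    patient_id: str
    diagnosis_codes: frozenset = frozenset()


def count_unique_patients(records, wanted_codes):
    """Count distinct patients with at least one of the wanted codes."""
    matched_ids = set()
    for record in records:
        if not record.diagnosis_codes:
            continue  # no diagnoses or codes: excluded from counts
        if record.diagnosis_codes & wanted_codes:
            matched_ids.add(record.patient_id)  # a set keeps each patient once
    return len(matched_ids)


records = [
    PatientRecord("p1", frozenset({"E11.9"})),
    PatientRecord("p1", frozenset({"E11.9", "I10"})),  # same patient, second record
    PatientRecord("p2", frozenset({"I10"})),
    PatientRecord("p3"),  # identifier only, no codes
]
cohort_size = count_unique_patients(records, {"E11.9", "I10"})
print(cohort_size)
```

Collecting patient identifiers in a set is what makes a patient with several matching records contribute a single count.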
TriNetX also offers chart and graph options for data visualization and includes a rate-of-arrival algorithm. This algorithm determines how many patients matching certain criteria visited Penn State Health within the past three years, and then predicts how many potential visits will happen each quarter over the next year. A Trial Connect feature allows clinical research organizations and industry sponsors to determine and connect with potential study sites.
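The idea behind a rate-of-arrival estimate can be sketched as follows. This page does not describe the actual TriNetX algorithm, so the sketch assumes the simplest possible model: a flat average of matching visits per quarter over the previous 12 completed quarters, projected onto the next four. The function names and parameters are hypothetical.

```python
from datetime import date


def quarter_index(day):
    """Map a date to a running quarter number (year * 4 + quarter 0..3)."""
    return day.year * 4 + (day.month - 1) // 3


def project_quarterly_arrivals(visit_dates, as_of,
                               lookback_quarters=12, horizon_quarters=4):
    """Project visits per future quarter from a trailing quarterly average.

    Assumption: a flat average, not the TriNetX algorithm itself.
    """
    current = quarter_index(as_of)
    first = current - lookback_quarters
    # Count visits in the completed quarters of the lookback window only;
    # the quarter containing `as_of` is still in progress and is skipped.
    in_window = sum(1 for d in visit_dates
                    if first <= quarter_index(d) < current)
    average = in_window / lookback_quarters
    return [average] * horizon_quarters


# Two matching visits in every quarter from 2021 through 2023.
visits = [date(year, month, day)
          for year in (2021, 2022, 2023)
          for month in (2, 5, 8, 11)
          for day in (1, 15)]
visits.append(date(2020, 12, 1))   # before the three-year window
visits.append(date(2024, 1, 10))   # current, incomplete quarter

forecast = project_quarterly_arrivals(visits, as_of=date(2024, 1, 15))
print(forecast)
```

A real estimate would likely account for trend and seasonality; the flat average is used here only to show the shape of the calculation.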
TriNetX is currently being used by nearly 30 of Penn State’s peers in the Clinical and Translational Science Award Program. It combines data with a global health research network, enabling health care organizations, pharmaceutical companies and contract research organizations to collaborate, enhance trial design, accelerate recruitment and bring new therapies to market faster.
Working with TriNetX
TriNetX uses International Classification of Diseases (ICD) codes, Logical Observation Identifiers Names and Codes (LOINC), and Current Procedural Terminology (CPT) codes in its searches.
The following links provide lookup tables for these codes:
The TriNetX tentative data refresh schedule is listed here. Please note that the application will be unavailable between 8 p.m. Tuesday and 5 p.m. Wednesday on the dates listed unless otherwise noted. Please see the change log section of this page for the current state of the data.
For account help, go to Research Population Portal.
For help with queries, please complete a service request form.
TriNetX Research Network
Penn State researchers have access to additional data outside the Penn State Health electronic medical record through TriNetX Research Network.
TriNetX Research datasets provide researchers access to de-identified patient data from a network of health care organizations.
TriNetX datasets include clinical patient data such as demographics, diagnoses, procedures, labs and medications – commonly referred to as real-world data.
The data in TriNetX datasets are:
- Primarily from health care organizations' electronic medical record (EMR) systems
- Collected for the primary purpose of providing care to patients
The data in TriNetX datasets are not:
- Claims data, data primarily collected for billing
- Data collected for randomized clinical trials
Data in TriNetX datasets come from health care organizations and other data providers. The data these entities provide primarily come from:
- EMR systems
- Structured data
- Unstructured data processed by Natural Language Processing technology
- Cancer registries
- Other sources like genomic data from third-party genomic testing labs
The majority of the health care organizations are large academic medical institutions with both inpatient and outpatient facilities. Most of these are adult acute-care hospitals with multiple facilities and locations. All are currently located within the United States and provide TriNetX with both inpatient and outpatient data. The data they provide represent the entire patient population at the health care organization. Most provide an average of seven years of historical data.
TriNetX typically receives data from health care organizations and other data providers in one of two ways:
- TriNetX receives data directly from a health care organization research repository into the TriNetX environment.
- A health care organization or data provider sends TriNetX data extracts in the form of CSV files that follow the TriNetX Data Dictionary.
TriNetX maps the data to a standard and controlled set of clinical terminologies and transforms it into a proprietary data model. This transformation process includes an extensive data quality assessment, with a data cleaning step that rejects records that do not meet the TriNetX quality standards.
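A map-then-clean step of this kind can be sketched in a few lines. The TriNetX data model and quality rules are proprietary and not described here, so the mapping table, the field names and the two rejection rules below (missing patient identifier, unmapped local code) are assumptions made purely for illustration.

```python
# Hypothetical mapping from site-specific codes to standard terminologies.
LOCAL_TO_STANDARD = {
    "GLU_SERUM": ("LOINC", "2345-7"),
    "DM2": ("ICD-10-CM", "E11.9"),
}


def transform(raw_rows):
    """Map rows to standard codes and split them into accepted and rejected."""
    accepted, rejected = [], []
    for row in raw_rows:
        mapping = LOCAL_TO_STANDARD.get(row.get("local_code"))
        # Illustrative quality rules: a row needs a patient identifier
        # and a local code that maps to a standard terminology.
        if not row.get("patient_id") or mapping is None:
            rejected.append(row)
            continue
        system, code = mapping
        accepted.append({"patient_id": row["patient_id"],
                         "system": system,
                         "code": code})
    return accepted, rejected


raw_rows = [
    {"patient_id": "p1", "local_code": "GLU_SERUM"},
    {"patient_id": "p2", "local_code": "DM2"},
    {"patient_id": "", "local_code": "DM2"},       # missing identifier
    {"patient_id": "p3", "local_code": "UNKNOWN"},  # no mapping available
]
accepted, rejected = transform(raw_rows)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the rejected rows, rather than silently dropping them, makes it possible to report back to the data provider what failed the checks.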
One of the distinguishing characteristics of the TriNetX dataset is that it is continuously refreshed. Health care organizations and other data providers update their data at various times, with over 80 percent refreshing at one-, two- or four-week intervals. The average lag time for a health care organization's source data refresh is one month.
TriNetX Diamond Network
TriNetX Diamond Network includes third-party longitudinal data from ambulatory and primary care electronic medical records, medical claims from claims clearinghouses and patient medication data from pharmacy claims.
The TriNetX Diamond Network data set represents 1.8 million sites, 190 million patients and 99 percent of U.S. health plans. The longitudinal data extend back to 2014 and are refreshed quarterly.
Key components include:
- Claims (submit/remit, rejections/denials and reversals, copay, coinsurance, rebates)
- Inpatient/outpatient care settings
- Commercial and CMS
- De-identified (no obfuscation)
Types of Data
- Patient demographics
- Patient location
- Lab results
- Vital signs
Benefits of the Diamond Network include:
- Having multiple data sets to conduct analyses and gain further validation and confidence in the results
- Providing extensive data for real-world evidence research
- Representing a broad cross-section of the U.S. population, because the Diamond Network combines multiple claims sources and so avoids the inherent biases of claims datasets that cover smaller populations
- Having more patient data depth than other commercially available claims data sets
Data from the Diamond Network can be used for multiple research projects, including:
- Designing protocols and feasibility analysis
- Conducting comparative effectiveness analysis
- Profiling target populations
- Understanding real-world drug performance
- Studying treatment pathways
- Validating findings across data sets