JCDL 2004
Digital Libraries Summit

FHIR Solves the Syntax Problem and Leaves the Hard One Untouched

The 21st Century Cures Act, signed into law in 2016, contains a requirement that reads simply but carries significant consequences: certified health information technology must have an application programming interface giving access to all data elements of a patient's electronic health record, "without special effort." The HL7 Fast Healthcare Interoperability Resources standard (FHIR, pronounced "fire") was the technical answer to that requirement. By 2022, FHIR support was mandated in certified health IT. By 2025, six of the largest technology companies in the world, including Microsoft, IBM, Amazon, and Google, had pledged to remove barriers to healthcare interoperability and explicitly named FHIR as the emerging standard for health data exchange.

FHIR's design logic is elegant. It defines a set of "resources", which are generic, modular representations of common healthcare concepts such as Patient, Observation, Practitioner, Condition, and Medication, and it specifies RESTful API protocols for accessing and manipulating them. The approach combines the best features of previous HL7 versions (v2, v3, and the Clinical Document Architecture) into a single specification flexible enough for mobile applications, wearable devices, EHR systems, clinical decision support tools, and research data pipelines. A FHIR-compliant system can, in principle, exchange patient data with any other FHIR-compliant system regardless of the underlying database architecture, programming language, or institutional environment.

This is a genuine achievement. The structural interoperability problem, ensuring that systems can exchange data without losing its structure, is substantially solved by FHIR. What FHIR cannot solve, and what the biomedical and digital library communities are only beginning to address systematically, is the semantic interoperability problem: ensuring that the data exchanged means the same thing in the system that receives it as it did in the system that sent it.

Two Types of Interoperability That Are Not the Same Thing

The distinction between structural and semantic interoperability is fundamental and frequently elided in discussions of health data standards.

Structural interoperability means that a receiving system can parse the data it receives — that the format is recognised, the fields are mapped, the resource types are understood. FHIR provides this at a high level of generality. A FHIR Observation resource sent by a hospital EHR system can be received and processed by a research data warehouse without requiring custom integration code, because both systems understand the FHIR Observation structure.
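As a rough illustration of what structural interoperability gives the receiver, the Python sketch below parses a hand-written Observation in FHIR JSON form using only the resource's shape. The function name `parse_observation` and the sample values are assumptions made for this example, and the sketch assumes the resource carries at least one entry in `code.coding`, which a real Observation is not obliged to do.

```python
import json

# Illustrative Observation in FHIR JSON form; the patient reference and value are made up.
RAW = """
{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [
      {"system": "http://loinc.org", "code": "2345-7",
       "display": "Glucose [Mass/volume] in Serum or Plasma"}
    ]
  },
  "subject": {"reference": "Patient/example"},
  "valueQuantity": {"value": 95, "unit": "mg/dL",
                    "system": "http://unitsofmeasure.org", "code": "mg/dL"}
}
"""


def parse_observation(raw: str) -> dict:
    """Pull out the fields a receiver needs, relying only on the resource's structure."""
    resource = json.loads(raw)
    if resource.get("resourceType") != "Observation":
        raise ValueError("not an Observation resource")
    # Assumption for this sketch: at least one coding is present.
    coding = resource["code"]["coding"][0]
    quantity = resource.get("valueQuantity", {})
    return {
        "status": resource["status"],
        "code_system": coding.get("system"),
        "code": coding.get("code"),
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
    }


parsed = parse_observation(RAW)
print(parsed)
```

The parser never needs to know what `2345-7` means. It only needs to know where the code lives, which is exactly the guarantee a structural standard provides.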

Semantic interoperability means that the receiving system understands what the data represents — that the value in the "code" field of an Observation maps to the same clinical concept in the receiving system's knowledge representation as it did in the sending system. This is where FHIR, as a structural standard, reaches its limit.

The FHIR specification does not mandate which terminology systems are used to populate the code fields of its resources. It provides extensive guidance and profiling mechanisms, and it strongly recommends established clinical terminology standards — SNOMED CT for clinical findings and procedures, LOINC for laboratory and clinical measurements, RxNorm for medications. But it cannot enforce this. A hospital EHR system that populates its Observation.code fields with proprietary local codes, or with ICD-10 diagnosis codes used in ways that do not align with their intended semantics, produces FHIR resources that are structurally valid but semantically inconsistent with resources produced by a system using SNOMED CT.
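The gap can be made concrete with a small Python sketch, under stated assumptions. Both resources below describe the same glucose result; one uses a LOINC code and the other a local code. The local system URI `http://hospital.example.org/lab-codes`, the code `GLU-S`, and the helper names are hypothetical, and the validity check is a rough shape test rather than real FHIR validation.

```python
# Two Observations for the same lab result, coded in different terminologies.
loinc_obs = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "2345-7"}]},
    "valueQuantity": {"value": 95, "unit": "mg/dL"},
}
local_obs = {
    "resourceType": "Observation",
    "status": "final",
    # Hypothetical proprietary code system and code.
    "code": {"coding": [{"system": "http://hospital.example.org/lab-codes",
                         "code": "GLU-S"}]},
    "valueQuantity": {"value": 95, "unit": "mg/dL"},
}


def is_structurally_valid(obs: dict) -> bool:
    """Rough shape check only: resource type, status, and a code element."""
    return (
        obs.get("resourceType") == "Observation"
        and "status" in obs
        and isinstance(obs.get("code"), dict)
    )


def concept_keys(obs: dict) -> set:
    """The (system, code) pairs a receiver can compare without terminology knowledge."""
    return {(c.get("system"), c.get("code")) for c in obs["code"].get("coding", [])}


def same_concept(a: dict, b: dict) -> bool:
    """True only when the two resources share at least one identical coding."""
    return bool(concept_keys(a) & concept_keys(b))


print(is_structurally_valid(loinc_obs), is_structurally_valid(local_obs))  # True True
print(same_concept(loinc_obs, local_obs))  # False
```

Both resources pass the shape check, yet a receiver comparing codes has no way to tell that they describe the same measurement unless someone supplies the mapping.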

The practical consequence, documented in the clinical data interoperability literature across multiple systematic reviews, is that real-world FHIR implementations — even certified, compliant ones — routinely exchange data that cannot be reliably aggregated or compared without substantial manual curation. The 2025 scoping review of interoperability as a catalyst for digital health, published in the International Journal of Environmental Research and Public Health, identifies this gap explicitly: using FHIR helps digital tools communicate more effectively, but in lower-income countries and in systems with heterogeneous legacy infrastructure, the gap between syntactic compliance and semantic consistency remains substantial.

The Vocabulary Problem at Scale

The challenge is not simply that different systems use different terminology codes. It is that the same clinical concept is legitimately represented differently across different vocabularies, and that mapping between them is a complex, professionally intensive task that cannot be fully automated.

SNOMED CT alone contains over 350,000 active concepts, organised in a polyhierarchical structure that allows a single clinical finding to be described at multiple levels of specificity. The mapping between SNOMED CT and ICD-10 — required for moving between clinical documentation and billing classification systems — is maintained by the National Library of Medicine and requires continuous expert revision as both systems evolve. The mapping between SNOMED CT and LOINC for laboratory results requires understanding of the LOINC six-part naming convention and its specific relationship to the clinical question being answered, not just the analyte being measured.
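To show why the six-part convention matters for mapping, here is a minimal Python sketch that splits a colon-separated LOINC name into its axes and reports where two names differ. The two name strings are given as the author of this sketch understands them for LOINC 2345-7 and 2339-0 and should be checked against the LOINC database before reuse. The field names and helper functions are illustrative, and the parser assumes no colons occur inside an individual part.

```python
from typing import NamedTuple


class LoincName(NamedTuple):
    component: str  # what is measured (the analyte)
    prop: str       # kind of property, e.g. MCnc = mass concentration
    timing: str     # e.g. Pt = point in time
    system: str     # specimen or system, e.g. Ser/Plas, Bld
    scale: str      # e.g. Qn = quantitative
    method: str     # optional; empty string when not specified


def parse_loinc_name(name: str) -> LoincName:
    """Split a colon-separated name into six axes, padding a missing method."""
    parts = name.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts, got {len(parts)}")
    if len(parts) == 5:
        parts.append("")
    return LoincName(*parts)


def differing_axes(a: LoincName, b: LoincName) -> list:
    """Names of the axes on which two LOINC names disagree."""
    return [field for field in LoincName._fields if getattr(a, field) != getattr(b, field)]


serum = parse_loinc_name("Glucose:MCnc:Pt:Ser/Plas:Qn")  # assumed name for 2345-7
blood = parse_loinc_name("Glucose:MCnc:Pt:Bld:Qn")       # assumed name for 2339-0
print(differing_axes(serum, blood))  # ['system']
```

The analyte is identical in both names, but the specimen axis differs, so treating the two codes as interchangeable would be a mapping error that only terminology expertise catches.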

The proof-of-concept published in Sensors by Chatterjee et al. on combining HL7 FHIR with SNOMED CT to achieve semantic and structural interoperability demonstrates that the combination is technically feasible — a FHIR-compliant personal health record application that uses SNOMED CT to ensure that sensor-derived observations, clinical questionnaires, and EHR data refer to the same concepts regardless of the device or system that generated them. The approach works. It also requires the kind of careful terminology alignment, ontology mapping, and metadata quality assurance that information professionals rather than software engineers are specifically trained to perform.

The WiraChain system, published in Frontiers in Public Health in February 2026, integrates blockchain technology with FHIR for patient-controlled health record management. Its validation was conducted using synthetically generated clinical data — a methodological choice that, while appropriate for assessing technical performance, sidesteps the semantic interoperability challenge entirely. Synthetic data can be generated to be terminologically consistent by construction. Real clinical data cannot.

What the Library Community Brings to This Problem

The digital library community has been working on vocabulary management, ontology development, controlled terminology maintenance, and metadata quality assurance for longer than FHIR has existed. The tools and methods that make a library catalog semantically coherent — authority control, subject heading maintenance, crosswalk development between vocabulary systems — are structurally analogous to the tools and methods needed to make clinical data semantically interoperable at scale.

This analogy is underexplored in both communities. Clinical informatics and biomedical library science interact at PubMed and at the NLM, where MeSH and SNOMED CT coexist under the same institutional roof. But the systematic application of library-science methods to the problem of clinical data semantic interoperability — the development of shared crosswalks between FHIR terminology bindings, the design of automated metadata quality assessment pipelines for FHIR resources, the establishment of community governance structures for maintaining semantic consistency across FHIR implementations — is work that has largely not been done.
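One way to picture such a metadata quality assessment step is the Python sketch below, offered under stated assumptions rather than as an existing tool. It classifies each Observation by whether its code comes from a recognised terminology, from a local code covered by a curated crosswalk, or from neither. The crosswalk entry, the local system URI, the local codes, and the function names are all hypothetical. The three standard system URIs are the ones FHIR conventionally uses for LOINC, SNOMED CT, and RxNorm.

```python
from collections import Counter

STANDARD_SYSTEMS = {
    "http://loinc.org": "LOINC",
    "http://snomed.info/sct": "SNOMED CT",
    "http://www.nlm.nih.gov/research/umls/rxnorm": "RxNorm",
}

# Hypothetical curated crosswalk: (local system, local code) -> (standard system, code).
LOCAL = "http://hospital.example.org/lab-codes"
CROSSWALK = {
    (LOCAL, "GLU-S"): ("http://loinc.org", "2345-7"),
}


def classify(coding: dict) -> str:
    """Label one coding as standard, mapped via the crosswalk, or unmapped."""
    key = (coding.get("system"), coding.get("code"))
    if key[0] in STANDARD_SYSTEMS:
        return "standard"
    if key in CROSSWALK:
        return "mapped"
    return "unmapped"


def quality_report(observations: list) -> dict:
    """Count observations by the best label found among their codings."""
    counts = Counter()
    for obs in observations:
        codings = obs.get("code", {}).get("coding", [])
        if not codings:
            counts["uncoded"] += 1  # free text only, nothing to map
            continue
        labels = {classify(c) for c in codings}
        for label in ("standard", "mapped", "unmapped"):
            if label in labels:
                counts[label] += 1
                break
    return dict(counts)


sample = [
    {"code": {"coding": [{"system": "http://loinc.org", "code": "2345-7"}]}},
    {"code": {"coding": [{"system": LOCAL, "code": "GLU-S"}]}},
    {"code": {"coding": [{"system": LOCAL, "code": "XYZ-1"}]}},
    {"code": {"text": "blood sugar"}},
]
report = quality_report(sample)
print(report)
```

The interesting work is not in the code but in the `CROSSWALK` table, which has to be authored, reviewed, and maintained by people who understand both vocabularies.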

The Stanford Spezi Data Pipeline, an open-source toolkit described in a 2025 arXiv preprint by researchers at Stanford's Genomics Institute, takes a step in this direction: it standardises the handling of digital health data from sensor-derived observations, ECG recordings, and clinical questionnaires using FHIR representations, with explicit attention to metadata quality and analytical consistency. The pipeline's design acknowledges directly that standardising health data remains a critical challenge, particularly where operational systems and research workflows intersect.

That challenge is a library challenge as much as it is a clinical informatics challenge. The JCDL community's track on Metadata and Semantics is the right context for addressing it. The vocabulary infrastructure that makes clinical data semantically interoperable is not automatically produced by FHIR adoption. It requires the same kind of sustained, expert, institutionally supported work that has always been required to make large information systems semantically coherent.

FHIR solves the problem it was designed to solve. The harder problem is still there.
