A 2025 study from researchers at TUM University Hospital and Imperial College London demonstrated that fine-tuned language models can leak personally identifiable information that appeared only in input prompts during training, never in the outputs the model was supervised to produce. In healthcare, that finding lands hard. It means a discharge-summary model fine-tuned on a hospital’s own records can quietly memorize the patients whose notes happened to scroll past during gradient updates, then surface fragments of those notes when prompted by a stranger. The data the team thought was used and discarded was actually absorbed.
This is the gap HIPAA was not designed to police. The Privacy Rule was written when “use” of a record meant a clinician opened it; “disclosure” meant a fax went out. Protected Health Information moving through a stochastic gradient descent loop fits neither category cleanly, and the regulatory machinery is still catching up. What follows is a working map of where HIPAA actually applies to ML training pipelines today, what counts as PHI once data hits a tokenizer, and the specific failure modes that turn a compliant data pipeline into a reportable breach.
What Counts as PHI Inside a Training Pipeline
Protected Health Information, defined at 45 CFR §160.103, is any individually identifiable health information held or transmitted by a covered entity or business associate, in any form. The “any form” matters. Tensors are a form. Embeddings are a form. A model checkpoint that has memorized a verbatim chart note is a form. The question is not whether the data has been transformed; it is whether the original individual is still reasonably identifiable from what remains.
Once data is properly de-identified under HIPAA’s two recognized methods, it falls outside the rule entirely and can be used for ML training without a Business Associate Agreement, authorization, or minimum-necessary analysis. Get de-identification wrong, and every downstream artifact — training set, validation set, fine-tuned weights, generated outputs — inherits PHI status.
Vendors that ingest PHI on behalf of a covered entity are business associates under HIPAA, and a Business Associate Agreement is mandatory before any data flows. This includes LLM developers when they process PHI for a hospital, AI coding-assistant vendors whose tools receive pasted patient notes, and any cloud-hosted ML platform where training jobs touch unredacted records. Sharing PHI with a vendor without a signed BAA is a textbook violation; one clinic paid a $750,000 penalty for releasing PHI before its BAA was countersigned.
The Two De-Identification Paths and Why Neither Was Built for ML
HIPAA recognizes exactly two routes to strip PHI of its protected status: Safe Harbor (45 CFR §164.514(b)(2)) and Expert Determination (45 CFR §164.514(b)(1)). Both predate transformer models by roughly fifteen years, and the seams show.
Safe Harbor is a checklist. Remove all 18 enumerated identifiers, attest that you have no actual knowledge the residual data could re-identify anyone, and the dataset is no longer PHI. The list is concrete and auditable, which is its appeal. It is also blind to threat models that didn’t exist in 2003: linkage attacks against rare-disease cohorts, longitudinal pattern fingerprinting, and the simple fact that genomic sequences, which are inherently identifying, aren’t on the list at all. Commonly cited estimates put the number of people in the US with symptomatic Huntington’s disease at only about 30,000 to 40,000, which means a “de-identified” rare-disease dataset can be effectively re-identified by anyone holding a second dataset that overlaps on diagnosis and rough geography.
Expert Determination shifts the burden to a qualified expert who certifies, using documented statistical methods, that the risk of re-identification is “very small.” The Privacy Rule deliberately does not specify the expert’s credentials or sanction any particular technique, which gives flexibility but loads the work onto methodology and documentation. For ML training datasets that combine multiple sources, Expert Determination is generally the more defensible path, because Safe Harbor’s flat checklist underestimates how informative quasi-identifiers become when joined.
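To make the point about joined quasi-identifiers concrete, here is a minimal sketch of one measurement an expert might include: k-anonymity, the size of the smallest group of records sharing the same quasi-identifier values. This is an illustration, not a method the Privacy Rule prescribes; the field names (`zip3`, `birth_year`) and the toy records are assumptions made up for the example.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count records sharing each combination of quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    return Counter(keys)

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group; k == 1 means someone is unique."""
    counts = group_sizes(records, quasi_identifiers)
    return min(counts.values()) if counts else 0

def unique_fraction(records, quasi_identifiers):
    """Share of records that are the only member of their group."""
    counts = group_sizes(records, quasi_identifiers)
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(records) if records else 0.0

# Toy, made-up records (hypothetical field names).
records = [
    {"zip3": "021", "birth_year": 1950},
    {"zip3": "021", "birth_year": 1950},
    {"zip3": "021", "birth_year": 1988},
    {"zip3": "945", "birth_year": 1988},
    {"zip3": "945", "birth_year": 1950},
]

print(k_anonymity(records, ["zip3"]))                # each field alone: k = 2
print(k_anonymity(records, ["birth_year"]))          # k = 2
print(k_anonymity(records, ["zip3", "birth_year"]))  # joined: k = 1
```

Each field on its own leaves every record in a group of at least two, but the join isolates three of the five records. A real determination would look at far more than this single number, including what external datasets an adversary could plausibly hold.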
The Safe Harbor list assumes structured fields. Clinical notes don’t oblige. A discharge summary may name the patient’s spouse, the referring physician’s clinic, the rural hospital where they were stabilized, and the date of a car accident — none of which lives in a tidy patient_name column. Free-text de-identification requires either rules-based scrubbing tuned to clinical idioms, a named-entity recognition model, or both. Each approach has known false-negative rates that compound across millions of notes.
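A rough sketch of the rules-based half of that approach, using only Python's standard `re` module. The patterns and placeholder labels are illustrative assumptions, not a vetted clinical scrubber: they cover a few well-formatted identifiers and, by design, show what pure pattern matching misses.

```python
import re

# Illustrative patterns only; a production scrubber needs far broader
# coverage (names, addresses, facility names, written-out dates, and so on).
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("MRN", re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE)),
]

def scrub(text):
    """Replace each pattern match with a [LABEL] placeholder.

    Returns the scrubbed text and a count of replacements per label.
    """
    counts = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts

note = ("Pt seen 03/14/2023, MRN: 00123456. Call 555-123-4567. "
        "Wife Maria drove him from St. Luke's.")
cleaned, counts = scrub(note)
print(cleaned)
print(counts)
```

The date, record number, and phone number are replaced, but the spouse's name and the hospital name pass straight through. That residue is the false-negative problem described above, and it is why rules are usually paired with a named-entity recognition model and a sampled manual review.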
Where Training Pipelines Break HIPAA Even With De-Identified Data
Three failure modes show up repeatedly in ML pipelines that started with good de-identification intentions.
The first is memorization. Carlini and colleagues showed in 2021 that LLMs can memorize sequences seen as few as a single time during training, and follow-up work found that the effect scales log-linearly with model size: larger models memorize more, and memorize faster. A 2023 divergence attack against instruction-tuned models extracted verbatim training sequences at up to 150× the rate of typical user queries. For a model fine-tuned on supposedly de-identified clinical notes, this means residual identifiers missed by the scrubber (a misspelled name, a nickname, a clinic abbreviation) can be coaxed back out by an adversary with a plausible prefix.
The second is fine-tuning leakage from input-only data. The TUM and Imperial College team’s 2025 work on Llama 3.2 1B showed that PII appearing only in inputs (not in supervised target outputs) still gets memorized and is recoverable through prefix-based extraction attacks. Pipelines that route raw patient queries into a fine-tuning corpus on the assumption that “we’re not training the model to repeat them” are operating on a faulty assumption.
The third is scope creep in BAAs. A standard Business Associate Agreement permits a vendor to use PHI to perform services for the covered entity. It does not, by default, permit the vendor to use that PHI to train commercial models. Updated BAAs covering AI relationships need explicit clauses prohibiting model training on PHI without written authorization, defining data retention windows, requiring deletion certificates, and passing the same restrictions down to any subprocessor — particularly cloud AI platforms. HHS guidance on AI vendor contracts now treats this as table stakes.
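The first two failure modes can be tested before release with a simple prefix-extraction check. The sketch below is a simplified stand-in for the attacks in the cited papers: it feeds known prefixes to the model and checks whether a known identifier from the training data comes back. The `generate` callable is a hypothetical placeholder for whatever completion call your model exposes, and the toy model and strings are made up.

```python
from typing import Callable, List, Tuple

Canary = Tuple[str, str]  # (prefix shown to the model, secret it should NOT emit)

def extracted_canaries(generate: Callable[[str], str],
                       canaries: List[Canary]) -> List[Canary]:
    """Return the canaries whose secret appears in the model's completion."""
    hits = []
    for prefix, secret in canaries:
        completion = generate(prefix)
        if secret.lower() in completion.lower():
            hits.append((prefix, secret))
    return hits

# Toy stand-in for a fine-tuned model that has memorized one record.
_MEMORIZED = {"Patient name:": " Jane Q. Example, DOB 01/02/1960"}

def toy_generate(prompt: str) -> str:
    return _MEMORIZED.get(prompt, " [no completion]")

canaries = [
    ("Patient name:", "Jane Q. Example"),
    ("Referring clinic:", "Northside Family Practice"),
]
hits = extracted_canaries(toy_generate, canaries)
print(f"extraction rate: {len(hits) / len(canaries):.0%}")
```

In practice the canary list would include identifiers that appeared only in fine-tuning inputs, since that is exactly the case the TUM and Imperial College work flags. Sampling several completions per prefix gives a less optimistic estimate than a single greedy decode.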
The 2025 Security Rule NPRM and Why It Matters for AI
On January 6, 2025, OCR published the first proposed overhaul of the HIPAA Security Rule in over a decade. The Notice of Proposed Rulemaking attracted more than 4,600 comments by the March 7 deadline and remains under review; whether and in what form a final rule emerges is unsettled, with industry groups pushing back on cost estimates of roughly $9 billion in first-year compliance.
For AI specifically, three proposed changes matter regardless of final form. First, the NPRM removes the distinction between “addressable” and “required” implementation specifications, making encryption of ePHI at rest and in transit effectively mandatory with limited exceptions — which closes a long-running ambiguity about training datasets parked in cloud object storage. Second, it requires a written technology asset inventory and network map covering every system that creates, receives, maintains, or transmits ePHI, including ML training infrastructure that many organizations had treated as out-of-scope research environments. Third, it expressly contemplates AI, quantum, and emerging technologies as in-scope for risk analysis, signaling that “we didn’t think the GPU cluster was a HIPAA system” will not survive an OCR audit.
The proposal also tightens business-associate verification: covered entities would need annual written verification from each business associate that technical safeguards are in place, certified by a subject-matter expert. Contingency-plan activation notices from business associates would be required within 24 hours.
Section 1557 of the Affordable Care Act adds a parallel obligation. OCR’s December 2024 letter and the Section 1557 final rule require organizations using AI in patient-care decision support to identify and mitigate discrimination risks across race, color, national origin, sex, age, and disability — effective May 1, 2025. This is not a HIPAA requirement, but it lands on the same governance teams and applies to many of the same models.
Practical Controls That Actually Reduce ML Privacy Risk
The minimum-necessary standard pushes against the ML instinct toward maximal data. Resolve the tension by deciding which features the model actually needs before extracting anything: a model predicting 30-day readmission probably doesn’t need the patient’s full address, only the three-digit ZIP, and probably doesn’t need free-text social history at all. Document the justification when an entire chart is necessary; OCR investigations frequently surface that organizations cannot say where ePHI lives in their systems, and ML pipelines are a known blind spot.
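One way to enforce that decision in code is an allowlist applied at extraction time, so fields the model does not need never enter the pipeline. The sketch below is a minimal illustration: the feature names are hypothetical, and `low_population_prefixes` is an assumed input that would have to be built from current Census data (Safe Harbor requires replacing a three-digit ZIP with 000 when the area it covers holds 20,000 or fewer people, and grouping ages over 89 into a single category).

```python
# Hypothetical feature list for a readmission model; the real list is a
# clinical and compliance decision this sketch cannot settle.
PASS_THROUGH = ("prior_admissions", "length_of_stay_days")

def zip3(zip_code, low_population_prefixes):
    """Reduce a ZIP to its first three digits, or "000" if it must be masked.

    Pass ZIPs as strings: integer ZIPs silently lose leading zeros.
    """
    digits = "".join(ch for ch in str(zip_code) if ch.isdigit())
    if len(digits) < 3:
        return "000"
    prefix = digits[:3]
    return "000" if prefix in low_population_prefixes else prefix

def minimize(record, low_population_prefixes=frozenset()):
    """Project a chart record down to the allowlisted, coarsened features."""
    return {
        "age": min(int(record["age"]), 90),  # 90 stands for "90 or older"
        "zip3": zip3(record["zip"], low_population_prefixes),
        **{name: record[name] for name in PASS_THROUGH},
    }

record = {
    "name": "Jane Example", "address": "1 Main St", "zip": "02139-4307",
    "age": 93, "prior_admissions": 2, "length_of_stay_days": 5,
    "social_history": "free text that the model does not need",
}
print(minimize(record))
```

Because the function builds a new dictionary from named fields rather than deleting unwanted ones, a column added upstream later is excluded by default instead of leaking through.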
For training itself, deduplication measurably reduces memorization — duplicated sequences are far more likely to be regurgitated verbatim than singletons. Strip duplicates above a length threshold (commonly around 50 tokens) before training. Differential privacy during fine-tuning reduces canary extraction rates, though the privacy-utility tradeoff is real and strict privacy budgets degrade clinical task performance. Output filters at inference can catch verbatim training-data emissions before they reach the user, which is a defense against the divergence-attack class.
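A rough sketch of two of these controls, sharing one index of n-token windows. It makes simplifying assumptions that a real pipeline would not: whitespace splitting stands in for the model's tokenizer, a whole document is dropped if it repeats any window from an earlier kept document (rather than excising just the repeated span), and the in-memory set would need to become a hashed or approximate structure at corpus scale.

```python
def ngrams(tokens, n):
    """All contiguous n-token windows in a token list, as a set of tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def dedupe(docs, n=50):
    """Keep a document only if it shares no n-token window with earlier keeps.

    Documents shorter than n tokens have no windows and are always kept.
    Returns the kept documents and the index of windows seen in them.
    """
    seen, kept = set(), []
    for doc in docs:
        grams = ngrams(doc.split(), n)
        if grams & seen:
            continue  # repeats a span we already have
        seen |= grams
        kept.append(doc)
    return kept, seen

def leaks_training_text(output, seen, n=50):
    """Inference-time filter: does the output repeat any indexed window?"""
    return bool(ngrams(output.split(), n) & seen)

docs = [
    "the patient was admitted with chest pain and discharged",
    "note: the patient was admitted with chest pain yesterday",
    "follow up in two weeks with cardiology clinic please",
]
kept, index = dedupe(docs, n=5)  # n=5 only so the toy strings are short
print(len(kept))
print(leaks_training_text("she said the patient was admitted with chest pain", index, n=5))
```

The same index serves both stages: build it while deduplicating the training set, then reuse it to screen generated text. Exact-window matching will miss paraphrased or lightly edited leaks, so it complements differential privacy rather than replacing it.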
For BAAs covering AI vendors, the contract should explicitly: prohibit use of PHI for model training without separate written authorization, set data retention limits with deletion certification, require subject-matter-expert verification of technical safeguards, and pass all obligations through to subprocessors including cloud AI platforms. Generic SaaS BAA templates do not cover these cases; the language has to be added.
For asset inventory, treat every component touching ePHI as in-scope: feature stores, vector databases, embedding caches, prompt logs, model registries, evaluation datasets, and any third-party API the model calls. Many organizations discover during audit that their RAG implementation is sending unredacted clinical text to an embedding API whose terms reserve rights to use submitted data. With no BAA in place, that is an impermissible disclosure under the Privacy Rule; even with one, a vendor’s reserved right to reuse the data goes beyond what a standard BAA permits.
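An inventory is easier to audit when it is kept as structured data that can be checked automatically. The following is a minimal sketch of that idea; the `Asset` fields, the example components, and the two checks are assumptions chosen for illustration, not a list of what OCR requires.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Asset:
    name: str
    kind: str                  # e.g. "vector_db", "prompt_log", "external_api"
    touches_ephi: bool
    third_party: bool
    baa_signed: bool = False
    encrypted_at_rest: bool = False

def findings(assets):
    """Return (asset name, issue) pairs for in-scope assets with gaps."""
    issues = []
    for a in assets:
        if not a.touches_ephi:
            continue
        if a.third_party and not a.baa_signed:
            issues.append((a.name, "third party handles ePHI without a BAA"))
        if not a.encrypted_at_rest:
            issues.append((a.name, "ePHI not encrypted at rest"))
    return issues

# Hypothetical inventory entries.
inventory = [
    Asset("notes-vector-db", "vector_db", True, False, encrypted_at_rest=True),
    Asset("embedding-api", "external_api", True, True, baa_signed=False,
          encrypted_at_rest=True),
    Asset("prompt-logs", "prompt_log", True, False),
    Asset("public-docs-index", "vector_db", False, False),
]
for name, issue in findings(inventory):
    print(f"{name}: {issue}")
```

Running a check like this in CI against the same file that drives deployment keeps the inventory from drifting away from what is actually running. The hard part remains filling in `touches_ephi` honestly for research environments.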
FAQ
Are model weights themselves PHI? Unsettled. HHS has not issued definitive guidance. The conservative reading: if a model trained on PHI can be induced to emit identifying training data, the weights carry residual PHI status and should be protected accordingly. Most enforcement-aware healthcare organizations treat models trained on identified data as PHI artifacts.
Does running inference on de-identified data trigger HIPAA? No, provided the de-identification met Safe Harbor or Expert Determination standards and you have no actual knowledge of residual identifiability. But user-supplied prompts at inference time can re-introduce PHI into the system; logging, caching, and feedback collection on those prompts are in scope.
Can a covered entity train a model on its own PHI without a BAA? Yes — a covered entity can use its own PHI for healthcare operations, which under §164.501 includes quality assessment and improvement activities. Training a model to support clinical operations generally fits. Sharing that PHI with an outside ML vendor for training requires a BAA, and using it for research generally requires either authorization or an IRB waiver.
What about synthetic data? Synthetic data generated from PHI is not automatically de-identified. If the generator memorizes and reproduces real records, the synthetic output inherits PHI status. Treat the generation process itself as PHI handling, and run formal re-identification testing on the synthetic output before declaring it safe.
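As a first screen before any formal re-identification testing, it is cheap to check whether the generator simply copied real records. The sketch below flags exact matches and reports, for each synthetic record, how many fields differ from its closest real record. It is a floor, not a substitute for formal testing, and the field names and toy records are made up for the example.

```python
def record_key(record, fields):
    return tuple(record[f] for f in fields)

def copied_records(real, synthetic, fields):
    """Synthetic records that exactly match some real record on all fields."""
    real_keys = {record_key(r, fields) for r in real}
    return [s for s in synthetic if record_key(s, fields) in real_keys]

def closest_distance(real, synthetic_record, fields):
    """Fewest differing fields between a synthetic record and any real one."""
    return min(
        sum(1 for f in fields if r[f] != synthetic_record[f])
        for r in real
    )

fields = ["age", "zip3", "dx"]
real = [
    {"age": 67, "zip3": "021", "dx": "HD"},
    {"age": 45, "zip3": "945", "dx": "T2DM"},
]
synthetic = [
    {"age": 67, "zip3": "021", "dx": "HD"},      # verbatim copy of a real row
    {"age": 45, "zip3": "945", "dx": "CHF"},     # one field away
    {"age": 30, "zip3": "100", "dx": "asthma"},  # differs on every field
]
print(len(copied_records(real, synthetic, fields)))
print([closest_distance(real, s, fields) for s in synthetic])
```

Records at distance zero are copies; records at distance one deserve a closer look, especially when the differing field is not itself a quasi-identifier. A clean result here says nothing about linkage or membership-inference risk, which is what the formal testing is for.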
What Actually Changes Now
The HIPAA framework will not get rewritten for AI; it will get extended through enforcement and rule updates that bolt onto the existing structure. The 2025 Security Rule NPRM, even if it lands in modified form, signals where OCR is heading: explicit asset inventories that include ML infrastructure, mandatory encryption with no addressable workaround, and risk analyses that treat AI systems as first-class technology assets rather than research curiosities.
The hard problem isn’t the law — it’s the gap between a regulatory framework built around discrete uses and disclosures and a technical reality where data, weights, and outputs blur into each other across a training run. Organizations that close that gap operationally before OCR forces them to will avoid both the enforcement risk and the more durable harm: a model that quietly remembers a patient who never agreed to be remembered.






