7.5 Closing: The Pre-Specified Adaptation

What this chapter asked

Chapter 7 asked one question: what if the design itself adapts?

It asked that question not as an introduction to a technical specialty, but as an application of the book’s central theme. The risks established in every prior chapter—optimistic effect size assumptions, fragile nuisance parameters, estimand ambiguity, interim governance failures, claim structure inflation—do not disappear when a design adapts. They are amplified, because the adaptation introduces new decision points, new information flows, and new interactions between design features that fixed designs manage separately.

Section 7.1 examined sample size re-estimation as the tool for correcting the nuisance parameter misspecifications that Chapter 3 identified. Nuisance-parameter SSR—based on blinded data—corrects the most common power failure without inflating the type I error. Effect-size SSR—based on unblinded data—corrects a broader range of failures but requires combination test methods to control the type I error and introduces governance requirements that parallel the Chapter 4 interim analysis framework.
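The nuisance-parameter SSR summarized above can be sketched in a few lines. This is a minimal illustration, assuming a two-arm trial with a continuous endpoint, 1:1 allocation, and the standard normal-approximation sample size formula. The function names, the cap `n_max`, and the rule that the sample size never drops below the planned value are hypothetical choices made for the example, not rules taken from Section 7.1. The only interim input is the variance of the pooled outcomes with treatment labels removed.

```python
import math
from statistics import NormalDist, variance


def n_per_arm(sigma2, delta, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm, two-sided two-sample comparison."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * sigma2 * z_sum ** 2 / delta ** 2)


def blinded_ssr(pooled_outcomes, delta, n_planned, n_max, alpha=0.05, power=0.90):
    """Re-estimate n per arm from blinded data only.

    pooled_outcomes: interim outcomes with treatment labels removed.
    Assumed (hypothetical) rule: never below n_planned, never above n_max.
    """
    # One-sample variance of the pooled data; no arm-level information is used.
    sigma2_blinded = variance(pooled_outcomes)
    n_new = n_per_arm(sigma2_blinded, delta, alpha, power)
    return max(n_planned, min(n_new, n_max))


# Design stage: assumed variance 100, target difference 5.
n_design = n_per_arm(sigma2=100, delta=5)  # 85 per arm
```

The pooled one-sample variance tends to overstate the within-arm variance slightly when a treatment effect exists (by roughly the squared difference divided by four under 1:1 allocation), so a sketch like this errs toward a larger sample size. Because the rule is a pure function of blinded inputs, the re-estimation can be reproduced exactly by an auditor holding the same blinded dataset.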

Section 7.2 examined adaptive enrichment as a design that changes the enrolled population mid-trial based on the interim biomarker subgroup result. Enrichment changes the population attribute of the estimand and requires pre-specification of the combined estimand across pre- and post-adaptation phases, the enrichment rule including the biomarker assay specification, and the governance structure for the enrichment decision.

Section 7.3 identified estimand shift as the most conceptually underappreciated risk in adaptive design: the ways in which adaptations change what the trial is estimating even when the nominal endpoint and analysis model remain unchanged. Explicit shifts—population, treatment attribute, intercurrent event strategy—are visible. Implicit shifts—the conditional structure introduced by effect-size SSR, the dose selection conditioning in seamless designs—are harder to recognize and equally consequential.

Section 7.4 examined adaptive NI design as the intersection of two demanding frameworks, each of which amplifies the other’s risks. The constancy assumption—the load-bearing scientific justification for the NI margin—must be re-evaluated at each adaptation point, using pre-specified criteria, by parties independent of the unblinded interim data.


What this chapter decided

By the end of this chapter, a trial team proposing an adaptive design must have answered four questions with documented, pre-specified answers.

What, exactly, is adapting? The specific feature—sample size, population, dose, analysis—is named. The pre-adaptation value and the post-adaptation value under each adaptation scenario are specified. The boundary between what adapts and what does not is drawn explicitly, so that a reviewer can verify that features claimed to be fixed were not modified in response to the interim data.

What information triggers the adaptation, and who handles it? For nuisance-parameter SSR: the blinded pooled statistics and the independent statistician who applies the rule. For effect-size SSR or adaptive enrichment: the unblinded interim data, the IDMC that reviews it, the firewall that prevents the unblinded result from reaching the sponsor, and the communication protocol that conveys the adaptation decision without conveying the interim treatment effect. For seamless designs: the phase II efficacy and safety data, the selection rule, the parties involved in the selection, and the information that each party receives.

What does the adaptation do to the estimand? The explicit estimand shifts are identified: which of the four attributes changes at the adaptation, and what the post-adaptation values are. The implicit estimand shifts are identified: how the combination test method handles the conditional structure introduced by the adaptation, and what the primary analysis is estimating as a result. The combined estimand is specified for the pooled population.

What are the operating characteristics of the adapted design? The simulation covers the full scenario space: true effect sizes above, at, and below the assumed alternative; nuisance parameters at assumed and pessimistic values; various adaptation outcomes under each scenario. The type I error under the combination test method is verified at the nominal level. The power under the target alternative is established. The expected sample size—or expected event count—under the null and alternative is documented.
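A simulation of this kind can be sketched as follows. The example assumes a two-stage, two-arm design with known variance, an inverse-normal combination test with weights fixed at the planned stage sizes, and a deliberately simple re-estimation rule: the second stage is enlarged when the first-stage z-statistic falls in a "promising" interval. The stage sizes, the interval, and the function name `simulate` are hypothetical values chosen for illustration. Stage-wise z-statistics are drawn directly rather than simulating patient-level data.

```python
import math
import random
from statistics import NormalDist


def simulate(theta, n_sims=100_000, n1=50, n2_planned=50, n2_max=150,
             promising=(0.5, 1.96), alpha=0.025, seed=1):
    """Estimate rejection rate and expected total N for one scenario.

    theta: true standardized effect (difference / SD); 0.0 is the null.
    Sample sizes are per arm. All design values here are hypothetical.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Combination weights are fixed in advance from the PLANNED stage sizes.
    w1 = math.sqrt(n1 / (n1 + n2_planned))
    w2 = math.sqrt(n2_planned / (n1 + n2_planned))
    rejections = 0
    total_n = 0
    for _ in range(n_sims):
        z1 = rng.gauss(theta * math.sqrt(n1 / 2), 1.0)
        # Unblinded, data-dependent choice of the second-stage size.
        n2 = n2_max if promising[0] <= z1 < promising[1] else n2_planned
        # Stage 2 statistic uses stage 2 patients only.
        z2 = rng.gauss(theta * math.sqrt(n2 / 2), 1.0)
        if w1 * z1 + w2 * z2 > z_crit:
            rejections += 1
        total_n += 2 * (n1 + n2)
    return rejections / n_sims, total_n / n_sims


if __name__ == "__main__":
    for theta in (0.0, 0.2, 0.4):  # null, below, and at an assumed alternative
        rate, expected_n = simulate(theta)
        print(f"theta={theta:.1f}  rejection rate={rate:.4f}  E[N]={expected_n:.1f}")
```

Because the weights do not depend on the realized second-stage size, the combined statistic is standard normal under the null whatever the re-estimation rule does, and the simulated rejection rate at `theta=0.0` should sit near the one-sided 0.025 level up to Monte Carlo error. A full operating-characteristics report would extend the scenario grid to pessimistic nuisance parameter values as well.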


The characteristic mistakes of this chapter

Three failures define the characteristic risk profile of adaptive designs.

The adaptation triggered by the wrong information. The SSR was designed to use blinded nuisance parameters. At the re-estimation time point, the independent statistician had access not only to the blinded pooled variance but also to the interim clinical data report, a document prepared for the DSMB that included arm-specific event summaries. The independent statistician completed the re-estimation before the DSMB meeting, and the new sample size was communicated to the sponsor. The sponsor never received the arm-specific event data, but the statistician who performed the re-estimation did, so it could no longer be shown that the re-estimation rested on nuisance parameters alone. The gap in the governance firewall was not identified until a regulatory inspection examined the independent statistician’s data access log.

The enrichment whose estimand was not pre-specified. The trial pre-specified an enrichment rule: enrich toward biomarker-positive patients if the interim biomarker subgroup shows substantially stronger benefit. The enrichment was triggered at the first interim. The primary analysis framework for the combined pre- and post-adaptation population had not been pre-specified in the SAP; the SAP addressed the unadapted design only. After the enrichment, the analysis team proposed a weighted combination test using the pre- and post-adaptation patient data. The regulatory reviewer asked when the combination weights and the combined estimand had been specified. The answer, after the enrichment decision, made the primary analysis post-hoc.

The NI adaptive trial whose constancy assumption was not re-evaluated. The trial was an NI design with a pre-specified SSR. The SSR triggered an extension of the follow-up duration from 24 to 36 months, because the control arm event rate was lower than assumed. The NI margin had been derived from historical trials with 18 to 24 months of follow-up. The primary analysis used the pre-specified margin, without re-evaluating whether the constancy assumption remained credible at 36 months. The trial concluded non-inferiority. The regulatory reviewer noted that the margin had been derived from trials with substantially shorter follow-up than the adapted trial’s final duration, and that the constancy assumption had not been evaluated for the extended duration. The NI conclusion was challenged, and the sponsor was required to provide a post-hoc analysis of whether the constancy assumption was credible at 36 months. The analysis was inconclusive. Approval was granted with the limitation noted in the label.

Each of these failures has the same structure: a pre-specification that was incomplete in a specific way, and an adaptation that operated in the gap left by the incompleteness. The pre-specification was not fraudulent—the incomplete element was not a deliberate omission. It was a gap that was not recognized as a gap until the adaptation was triggered and the gap became consequential.


What cannot be recovered

A governance breach in an SSR—unblinded data reaching parties who should not have seen it—cannot be undone. The information, once accessed, cannot be erased from the knowledge of the parties who accessed it. The adaptation decision that followed may have been made on the same basis as the pre-specified rule would have required, but the process by which it was made is contaminated, and the contamination is not correctable by analysis.

A combined estimand that was not pre-specified—that was constructed after the adaptation to fit the adapted population—cannot be retroactively pre-specified. The combination weights, the analysis method, and the decision about which patients to include in the primary analysis were all made with knowledge of which patients were in the pre-adaptation cohort and which were in the post-adaptation cohort. The analysis is post-hoc even if the analysis plan was written before the final database lock.

An NI margin that was not validated for the adapted population cannot be retroactively validated. The constancy assumption for the adapted population requires historical data that may not exist, because the historical trials did not enroll the enriched population or run to the extended duration. A post-hoc analysis of the constancy assumption was not planned at design and is conducted to support a conclusion that has already been reached. The analysis may be informative, but it is not the evaluation that the pre-specification requirement demands.

These irrecoverabilities define the professional risk of adaptive design. The risk is not that adaptive designs are invalid—they are valid when pre-specified correctly. The risk is that the pre-specification of an adaptive design is more demanding than the pre-specification of a fixed design, and the gaps in the pre-specification are not always visible until the adaptation is triggered and the gap becomes a problem.


The connection to Chapter 8

Chapter 8 asks what must be locked, and when. It is the chapter that closes the book’s design sequence by examining what the pre-specification requirement means in practice—what documents must be finalized before what events, by whom, with what review and approval process.

The connection to Chapter 7 is direct. Adaptive designs require more to be locked, earlier, and with more specificity than fixed designs. The SSR rule must be in the SAP before the re-estimation time point. The enrichment rule—including the biomarker assay, the enrichment threshold, and the combined estimand framework—must be in the protocol before the first patient is enrolled. The constancy re-evaluation criteria for adaptive NI designs must be in the protocol before enrollment. The combination test method and the critical values for the adaptive primary analysis must be in the SAP before the first interim.

Each of these lock requirements is the adaptive extension of a fixed design requirement. The fixed design locks the sample size at design; the adaptive design locks the re-estimation rule at design. The fixed design locks the estimand at design; the adaptive design locks the pre- and post-adaptation estimands, and the combined estimand, at design. Chapter 8 will establish the lock requirements for the full design—fixed and adaptive—as a governance discipline that makes the pre-specification requirement operational.
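The lock requirements listed above lend themselves to a mechanical check. The sketch below assumes each lockable element is recorded with the document that holds it, the milestone it must precede, and the date it was finalized. The class name, field names, milestone keys, and every date are hypothetical and serve only to illustrate the check; Chapter 8 defines the actual lock discipline.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LockItem:
    element: str
    document: str
    must_precede: str            # key into the milestones table
    locked_on: Optional[date]    # None means never finalized


def late_locks(items, milestones):
    """Return the elements that were not locked strictly before their milestone."""
    late = []
    for item in items:
        deadline = milestones[item.must_precede]
        if item.locked_on is None or item.locked_on >= deadline:
            late.append(item.element)
    return late


# Hypothetical trial calendar and lock log.
MILESTONES = {
    "first_patient_enrolled": date(2024, 3, 1),
    "re_estimation": date(2025, 1, 15),
    "first_interim": date(2025, 1, 15),
}
ITEMS = [
    LockItem("SSR rule", "SAP", "re_estimation", date(2024, 2, 10)),
    LockItem("enrichment rule", "protocol", "first_patient_enrolled", date(2024, 2, 1)),
    LockItem("constancy re-evaluation criteria", "protocol",
             "first_patient_enrolled", None),
    LockItem("combination test method and critical values", "SAP",
             "first_interim", date(2025, 2, 20)),
]
```

With these illustrative entries, `late_locks(ITEMS, MILESTONES)` flags the constancy criteria (never locked) and the combination test method (locked after the first interim), which are the kinds of gaps described in the characteristic mistakes above.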


Chapter 7 risk summary

The decision this chapter owns: is the planned adaptation pre-specified in sufficient detail to be implemented without discretion, governed with sufficient rigor to prevent information from crossing the boundaries the rule requires, and documented in sufficient specificity to be audited against the pre-specification?

The most common mistake: treating the adaptation as a design improvement without treating the pre-specification of the adaptation as a design requirement. The adaptation is specified in the protocol—SSR will be conducted, enrichment is planned—but the details that determine whether the adaptation controls the type I error, maintains the estimand, and is governable without contamination are left to the SAP or to the IDMC charter, which are completed later and with less scrutiny than the protocol.

The professional-level risk: the adaptive design that was pre-specified at the level of intent and implemented at the level of discretion. The SSR was triggered when the independent statistician judged the nuisance parameters to be sufficiently pessimistic—not when a specific threshold was crossed. The enrichment was recommended by the IDMC based on their clinical judgment about the biomarker subgroup result—not based on a pre-specified threshold. The combined estimand was constructed by the analysis team to be as favorable as possible to the adapted population—not selected from pre-specified options. Each individual decision may have been made in good faith. The cumulative effect is a trial whose adaptations were not pre-specified in the sense the regulatory agencies require—and whose primary result, however valid statistically, cannot be verified as the product of the pre-specified design rather than of the data that were seen.
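The difference between intent and rule can be made concrete. The sketch below assumes an enrichment rule expressed as a single threshold on the gap between the biomarker-positive and biomarker-negative effect estimates, with larger values meaning more benefit. The threshold, the names, and the decision codes are hypothetical. The rule object is frozen so that it cannot be edited at the interim, the decision is a pure function of the estimates, and the message that leaves the firewall carries the decision code only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichmentRule:
    """Locked before enrollment; frozen so it cannot be modified at the interim."""
    min_effect_gap: float  # hypothetical: required excess benefit in biomarker-positive patients


def enrichment_decision(rule, effect_bm_pos, effect_bm_neg):
    """Apply the threshold mechanically; no clinical judgment enters here."""
    gap = effect_bm_pos - effect_bm_neg
    return "ENRICH" if gap >= rule.min_effect_gap else "CONTINUE_ALL_COMERS"


def sponsor_message(decision):
    """Only the decision code is passed on; the interim estimates are not."""
    return {"decision": decision}


# Illustrative use with made-up interim estimates.
rule = EnrichmentRule(min_effect_gap=0.15)
message = sponsor_message(enrichment_decision(rule, effect_bm_pos=0.40, effect_bm_neg=0.10))
```

A decision code still reveals something about the interim data, so a sketch like this narrows the information flow rather than eliminating it. What it does provide is auditability: given the locked rule and the interim estimates, a reviewer can recompute the decision and confirm that no discretion was exercised.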