TTF-1 status in early-stage lung adenocarcinoma is an independent predictor of relapse and survival superior to tumor grading

Objectives: Thyroid transcription factor 1 (TTF-1) is a well-established independent prognostic factor in lung adenocarcinoma (LUAD), irrespective of stage. This study aims to determine if TTF-1's prognostic impact is solely based on histomorphological differentiation (tumor grading) or if it independently relates to a biologically more aggressive phenotype. We analyzed a large bi-centric LUAD cohort to accurately assess TTF-1's prognostic value in relation to tumor grade. Patients and methods: We studied 447 patients with resected LUAD from major German lung cancer centers (Berlin and Cologne), correlating TTF-1 status and grading with clinical, pathologic, and molecular data, alongside patient outcomes. TTF-1's impact was evaluated through univariate and multivariate Cox regression.

Introduction
After a steady increase in cases over the last decades, lung adenocarcinoma (LUAD) now accounts for more than 50% of all lung cancers [1,2]. Disease-related prognostic factors include performance status, sex, tumor stage, targetable molecular alterations, and the tumor grade, classified according to the predominant tumor growth pattern [3-7].
The thyroid transcription factor 1 (TTF-1), also known as NK2 homeobox 1 (NKX2-1), is a 38 kDa nuclear protein encoded by the NKX2-1 gene [8] and physiologically expressed by surfactant-producing type 2 pneumocytes and Club cells, formerly known as Clara cells, in the lung. TTF-1 is expressed in approximately 70-80% of LUAD and is routinely used to distinguish LUAD from metastases of extrapulmonary adenocarcinomas [9,10]. Once metastases from extrapulmonary adenocarcinomas have been excluded, the 20-25% of TTF-1-negative carcinomas are nevertheless classified as LUAD. A negative TTF-1 status is associated with an unfavorable cancer phenotype, including a lack of actionable genomic alterations, a reduced performance status, and a higher tumor burden [11-13]. Thus, TTF-1 expression can be a prognostic factor in early-stage lung cancer as well as metastatic disease [11,14-17]. However, it remains to be clarified whether a negative TTF-1 status represents a unique LUAD subtype characterized by higher biological aggressiveness or whether it can be predominantly explained by poor histomorphological differentiation (higher tumor grade).
To address this gap and improve prognosis evaluation, we conducted a causal effect estimation in a large bi-centric cohort of 447 patients with resected LUAD to evaluate the effect sizes and mutual moderation of TTF-1 status, grading, and several tumor- and patient-specific parameters.

Patient cohort
A total of 447 treatment-naïve patients who underwent surgery for stage I-III LUAD between 2009 and 2019 at two German high-volume lung cancer centers (Department of General, Visceral, Vascular and Thoracic Surgery, Charité-Universitätsmedizin Berlin, Germany; Department of Cardiothoracic Surgery, University Hospital Cologne, Germany) were included in this retrospective study. The reviewed diagnosis was based on the current WHO classification criteria for lung cancer, and grading was determined following recommendations from IASLC/ATS/ERS [18]. Only patients with complete resection (R0) were included, whereas distinct LUAD variants such as mucinous, enteric, colloid, or fetal cancers were excluded, as suggested by IASLC/ATS/ERS [18,19]. Baseline patient demographic data and cancer-related characteristics such as pTNM classification, tumor stage, genomic alterations, type of surgery, and adjuvant therapy were collected. The study was performed according to the ethical principles for medical research of the Declaration of Helsinki and was approved by the Ethics Committee of the Charité University Medical Department in Berlin (EA4/243/21).

Histologic evaluation
Hematoxylin and eosin, periodic acid-Schiff, and Elastica-van-Gieson stained slides, as well as immunohistochemical stainings of formalin-fixed and paraffin-embedded LUAD specimens, were reviewed by an experienced pathologist (S.S.) for diagnosis, grading, pTNM classification, angioinvasion, lymphatic invasion, and tumor stage according to the 8th edition of the TNM classification (AJCC). Uncertainties were discussed with other pathologists (A.Q., M.P.D., F.K.). TTF-1 immunohistochemistry was performed for primary diagnosis using the same antibody at both institutions (8G7G3/1, Dako, 1:100). Specimens that contained artifacts or were heavily bleached were newly cut, stained, and reassessed. Positive TTF-1 staining was defined as any positive nuclear staining in the tumor cells (see Supplementary Figure 1 for exemplary staining and an overview of the respective tumor grades).

Statistics
Demographics and disease data were described and compared using the chi-squared test. Categorical outcomes for differences in the TTF-1 positive and negative subgroups were assessed in univariate and multivariate Cox regression. Survival outcome was evaluated with the Kaplan-Meier estimator, and follow-up time was calculated with the reverse Kaplan-Meier method. Disease-free survival (DFS) and overall survival (OS) were defined as the time in months from surgery to histologically confirmed or radiologically detected relapse or death from any cause, whichever occurred first. Confidence intervals of the Cox regression models were calculated using bootstrap resampling with n = 1000 iterations, including a missing value imputation from the respective variable's marginal distribution within each bootstrap sampling iteration. P-values were derived from the bootstrap mean and standard deviation of the regression coefficients, which we take as an estimate of the standard error. All analyses were performed with Python using pandas, scipy, seaborn, and lifelines packages. The code is available on 'github.gabrieldernbach….'. If not otherwise specified, values are given as median with a 95% confidence interval (95% CI).
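The bootstrap procedure described in this section can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`impute_marginal`, `bootstrap_summary`, `slope`) are hypothetical, the percentile interval and the two-sided normal (Wald-type) p-value are assumed readings of the text, and a simple least-squares slope on synthetic data stands in for the Cox regression coefficient, which in practice would come from a survival package such as lifelines.

```python
import math
import random
import statistics


def impute_marginal(column, rng):
    """Fill None entries with random draws from the observed values of the same column."""
    observed = [v for v in column if v is not None]
    return [v if v is not None else rng.choice(observed) for v in column]


def bootstrap_summary(rows, fit, n_iter=1000, seed=0):
    """Bootstrap a scalar coefficient returned by fit(rows)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_iter):
        # Resample patients (rows) with replacement.
        sample = [rng.choice(rows) for _ in rows]
        # Impute each variable separately inside this bootstrap iteration.
        columns = [impute_marginal(col, rng) for col in zip(*sample)]
        estimates.append(fit(list(zip(*columns))))
    mean = statistics.fmean(estimates)
    se = statistics.stdev(estimates)  # bootstrap SD taken as the standard error
    cuts = statistics.quantiles(estimates, n=40)  # cut points at 2.5 %, 5 %, ..., 97.5 %
    ci = (cuts[0], cuts[-1])  # ASSUMPTION: percentile interval
    p_value = math.erfc(abs(mean / se) / math.sqrt(2))  # ASSUMPTION: two-sided normal test
    return mean, ci, p_value


def slope(rows):
    """Stand-in estimator: least-squares slope of y on x for rows of (x, y)."""
    xs, ys = zip(*rows)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic toy data (not study data): y depends on x, and about 10 % of x is missing.
data_rng = random.Random(42)
toy_rows = []
for _ in range(200):
    x = data_rng.gauss(0, 1)
    y = 0.8 * x + data_rng.gauss(0, 1)
    toy_rows.append((None if data_rng.random() < 0.1 else x, y))

mean, ci, p_value = bootstrap_summary(toy_rows, slope)
print(f"coefficient {mean:.2f}, 95% CI [{ci[0]:.2f}; {ci[1]:.2f}], p = {p_value:.3g}")
```

Imputing inside each iteration, rather than once before resampling, lets the uncertainty introduced by the missing values propagate into the width of the interval.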

Causal effect estimation
In randomized controlled trials, statistical confounding in the estimation of treatment effects is excluded by the study design. In contrast, the present study is observational, and possible confounding variables must be considered when estimating statistical effects. To identify these confounding variables, the interactions of known relevant variables were evaluated using a causal Bayesian network to derive the necessary adjustment sets for a reliable estimation of the total causal effects [20]. The respective adjustment sets for a given effect of interest were used as additional predictors in multivariate Cox regression.
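How an adjustment set follows from a causal graph can be illustrated with a deliberately small example. The sketch below uses a hypothetical five-node graph, not the authors' network (which is shown in Fig. 3A), and swaps in a simpler rule than a full minimal-adjustment-set search: when all direct causes (parents) of the exposure are observed, adjusting for them blocks every back-door path, so the parent set is sufficient for the total effect. The edges, node names, and function names are all assumptions chosen for illustration.

```python
# Hypothetical toy DAG, written as node -> set of direct causes (parents).
# ASSUMPTION: these edges are illustrative only and are NOT the authors' network.
PARENTS = {
    "sex": set(),
    "neoadjuvant_therapy": set(),
    "ttf1": {"sex"},
    "grade": {"ttf1", "neoadjuvant_therapy"},
    "dfs": {"sex", "neoadjuvant_therapy", "ttf1", "grade"},
}


def descendants(node):
    """All nodes reachable from `node` along directed edges."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for child, parents in PARENTS.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def mediators(exposure, outcome):
    """Nodes lying on a directed path from exposure to outcome."""
    return {n for n in descendants(exposure) if outcome in descendants(n)}


def parent_adjustment_set(exposure, outcome):
    """Parents of the exposure: sufficient for the total effect if all are observed."""
    adjustment = set(PARENTS[exposure])
    # Never adjust for the outcome or for anything downstream of the exposure.
    assert outcome not in adjustment
    assert not adjustment & descendants(exposure)
    return adjustment


print(parent_adjustment_set("ttf1", "dfs"))   # covariates for the TTF-1 model
print(parent_adjustment_set("grade", "dfs"))  # covariates for the grading model
print(mediators("ttf1", "dfs"))               # must stay out of the TTF-1 model
```

In this toy graph, grade is a mediator of the TTF-1 effect, so it is left out when the total effect of TTF-1 is estimated, whereas TTF-1 is a parent of grade and therefore enters the model for grading.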

Association of TTF-1-status and grading with clinicopathologic parameters and DFS
Classification of the 447 patients resulted in 397 TTF-1-positive (88.8%) and 50 TTF-1-negative (11.2%) LUAD samples. The distribution of patient demographic and clinicopathologic characteristics by TTF-1 status is provided in Table 1a. The mutation status of patients according to TTF-1 status is shown in Table 1b.
First, we analyzed the association between TTF-1 status and grading. TTF-1-positive LUADs mainly showed a low or intermediate tumor grade (70.3% vs. 50%), whereas TTF-1-negative LUADs were more frequently high-grade tumors (50% vs. 29.7%) (Fig. 1A). Second, to examine how grading and TTF-1 status jointly relate to DFS, we compared Kaplan-Meier curves by TTF-1 status within grading subgroups (Fig. 1B-D) and vice versa (Fig. 1E-F). In a direct comparison of TTF-1 and grading, a positive TTF-1 status was associated with longer DFS independent of grading (Fig. 1B-D). Conversely, a strong association of DFS with grading was observed only in TTF-1-positive patients (Fig. 1E-F).
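As an illustration of the chi-squared test named in the Statistics section, the grade distribution by TTF-1 status given at the start of this section can be checked by hand. The counts below are an assumption: they are reconstructed from the reported group sizes (397 and 50) and percentages (70.3% and 50% low or intermediate grade) and are not taken from Table 1a. The sketch computes the plain Pearson statistic without continuity correction, so its result is not a value reported by the authors and may differ from theirs.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# ASSUMPTION: counts reconstructed from the reported percentages, not read from Table 1a.
# Columns: (low or intermediate grade, high grade)
ttf1_positive = (279, 118)  # 279 / 397 rounds to 70.3 %
ttf1_negative = (25, 25)    # 25 / 50 is 50 %

chi2, p_value = pearson_chi2_2x2(*ttf1_positive, *ttf1_negative)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```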

Table 1a
Distribution of patients' demographic and clinicopathological characteristics according to the TTF-1 status and tumor grade. Abbreviations: TTF-1, Thyroid transcription factor 1; ADCs, adenocarcinomas; G1, grade 1; G2, grade 2; G3, grade 3; ECOG, Eastern Cooperative Oncology Group Performance Status Scale; NA, not available; UICC, International Union Against Cancer. A P-value < 0.05 was considered significant and was flagged with one star (*). A P-value < 0.01 was flagged with two stars (**). # Percentage of tumor grades refers to each individual characteristic (the total of 100% is per row).

Comparison of the respective total causal effect of TTF-1 and grading
For a causally informed analysis of the clinicopathological variable interactions, we performed a Bayesian network analysis (Fig. 3A). We listed all known related clinical variables and arranged them in a graph according to their expert-believed interactions. Variables (graph nodes) were connected (graph edges) if they were known to affect each other causally, and the direction of each interaction was depicted. The graph can then be evaluated to identify variable pathways that inform how an intervention on one of the variables affects other downstream variables. Knowledge about the variable pathways helps us select the exact set of variables necessary to estimate the unique contribution of an exposure variable to an outcome variable. Using causally informed Bayesian networks to determine adjustment sets provides a structured and principled method for covariate selection, mitigating risks such as collider bias that can arise from naive confounding control in conventional multivariate analyses. Using the network, we identified sex as a necessary adjustment set for estimating the total causal effect of TTF-1 positivity on DFS. The corresponding Cox regression showed a significant effect (log HR −0.86 [−1.25; −0.41]). A similar network analysis for estimating the total effect of grading on DFS indicated the necessary adjustment set of neoadjuvant therapy and TTF-1. The corresponding Cox regression demonstrated no significant effect on DFS (median log HR 0.31 [−0.32; 1.30] / 0.61 [−0.07; 1.65]) (Fig. 3B). Finally, for the fitted model, we show how the survival curves change when a single covariate is varied and all others are held fixed. This is useful for understanding the effect of a covariate in a particular model. We compare the effect of varying the covariate TTF-1 (Fig. 3C) with the effect of varying the covariate grading (Fig. 3D). The impact of varying TTF-1 is larger in absolute effect than the impact of varying grading, indicating that TTF-1 is a more relevant prognostic parameter than tumor grading.
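A hazard ratio cannot be negative, so the interval reported above for TTF-1 is read here as a log hazard ratio, consistent with the "median log HR" wording used for grading. The short sketch below converts that interval to the hazard-ratio scale and shows how, under the proportional-hazards assumption of a Cox model, varying one covariate shifts a survival curve: S(t | positive) = S(t | negative) ** HR. The baseline survival value of 0.5 is a hypothetical number chosen for illustration, and the function names are assumptions.

```python
import math


def to_hazard_ratio(log_hr, log_ci):
    """Exponentiate a log hazard ratio and its confidence limits."""
    low, high = log_ci
    return math.exp(log_hr), (math.exp(low), math.exp(high))


def shift_survival(reference_survival, hazard_ratio):
    """Survival probability after changing one covariate under proportional hazards."""
    return reference_survival ** hazard_ratio


# Values for TTF-1 positivity as reported in the text, read as log hazard ratios.
hr, (hr_low, hr_high) = to_hazard_ratio(-0.86, (-1.25, -0.41))
print(f"HR {hr:.2f} [{hr_low:.2f}; {hr_high:.2f}]")  # prints: HR 0.42 [0.29; 0.66]

# HYPOTHETICAL reference point: 50 % of TTF-1-negative patients disease-free at some time t.
print(f"{shift_survival(0.5, hr):.2f}")
```

An interval on the hazard-ratio scale that lies entirely below 1 corresponds to a log interval that excludes 0, which is why the TTF-1 effect is called significant while the grading intervals, which include 0, are not.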

Discussion
In this retrospective cohort analysis of 447 patients with fully resected, early-stage LUAD (stage I-III), we investigated the prognostic value of TTF-1 expression and found it to be independent of grading. Moreover, our study determined the prognostic contributions of TTF-1 and grading and showed that TTF-1 has a larger total effect on patient outcome than grade.
In LUAD, the predominant histological growth pattern has a clear prognostic impact on OS and DFS, independent of age, sex, tumor stage, and treatment [4,21-23], which has led to the development of a three-tiered histological grading system for LUAD [18]. This aligns with our result that tumor grade predicted DFS in the whole cohort (Fig. S2Y). However, our analyses of a large bi-centric cohort have shown that grading has predictive value only for TTF-1-positive patients, whereas no effect was observed in TTF-1-negative samples. Moreover, the validity of tumor grading is limited after neoadjuvant treatment due to potential changes in the growth pattern that could lead to misclassification.
TTF-1 status has been reported to correlate with the Karnofsky Performance Status, sex, smoking status, as well as distinct molecular alterations like sensitizing EGFR mutations in previous studies [11,12]. In contrast, our cohort did not show any statistically significant association between TTF-1 and these parameters (Fig. S2V). Furthermore, several studies described TTF-1 as a prognostic indicator for patients' survival in early and late-stage LUAD [11,14,15,17,24]. This is consistent with our results, where a positive TTF-1 status is correlated with a significantly longer DFS in univariate analysis. Extending previous studies, we identified TTF-1 as a prognostic indicator of patient outcome independent of grading in multivariate analysis. Finally, to measure the effect of TTF-1 in the context of known prognostic parameters, particularly tumor grading, we performed a causally informed effect estimation, showing for the first time that the impact of varying TTF-1 is superior to that of varying grading. Performing the Bayesian network analysis critically controls for possible confounding factors, which otherwise often remain under-discussed.
It may help to compare the network analysis approach to the popular practice of multivariate analysis in which an analyst would first score each variable individually for its statistical link with the outcome and then use the variables that met a significance threshold for the multivariate model. There are two problems with this approach that Bayesian network analysis helps us address. First, the prefiltering step could mistakenly exclude variables that are relevant only in combination, so that they would be missing from the multivariate model. Second, the multivariate model may include numerous symptomatic variables that together overrule the effect of the actual effector variable that caused them. In theory, it is possible to hide the association of a causal factor with the outcome by providing countless related symptomatic variables of that causal factor. For example, if the correlation of an aggressive molecular NSCLC genotype (causal factor) and of clinical tumor symptoms (symptomatic variables related to that genotype) with the survival time of the patients (outcome) were examined simultaneously, the clinical tumor symptoms would obscure or reduce the true effect size of the aggressive molecular NSCLC genotype in a multivariate analysis.
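The second problem can be made concrete with a small simulation. The sketch below is an illustration under simplifying assumptions, not part of the study: it uses synthetic data, ordinary least squares instead of Cox regression, and a single symptomatic variable. A causal factor acts on the outcome only through its symptom, so its total effect is 2.0 by construction, yet its coefficient collapses toward zero once the symptom is added to the model. All variable and function names are hypothetical.

```python
import random


def cov(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def slope_unadjusted(x, y):
    """Coefficient of x when y is regressed on x alone."""
    return cov(x, y) / cov(x, x)


def slope_adjusted(x, m, y):
    """Coefficient of x when y is regressed on x and m (2x2 normal equations)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return (sxy * smm - sxm * smy) / det


# Synthetic data: cause -> symptom -> outcome, with no direct cause -> outcome edge.
rng = random.Random(0)
n = 5000
cause = [rng.gauss(0, 1) for _ in range(n)]
symptom = [c + rng.gauss(0, 1) for c in cause]
outcome = [2.0 * s + rng.gauss(0, 1) for s in symptom]

total_effect = slope_unadjusted(cause, outcome)          # close to the true total effect 2.0
masked_effect = slope_adjusted(cause, symptom, outcome)  # close to 0 after adding the symptom
print(f"cause alone: {total_effect:.2f}, cause with symptom in the model: {masked_effect:.2f}")
```

Selecting covariates from the causal graph avoids this, because variables downstream of the exposure are not part of the adjustment set for its total effect.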
With Bayesian network analysis, one can establish the cause-and-effect relationships among variables a priori and employ the resulting network to choose the necessary set of variables in the regression. If implemented accurately, this method prevents obscure and misleading regression outcomes but introduces dependence on the analyst's judgment of the causal connections between the variables. The predictive power of a Bayesian network highly depends on the accuracy of its design. Therefore, in the appendix, we have included alternative instances of the Bayesian network that we consider equally likely to be true. We find that discrepancies between these networks have no impact on variable selection in our analysis and presented results, so the equally likely networks show the same significant effects. We believe that this is an important step for standard medical analysis efforts as it facilitates constructive discussions around variable selection. While dependent on prior knowledge, the proposed Bayesian network allows others to adjust and refine estimations by incorporating their domain knowledge and experimental findings. Ultimately, a Bayesian network analysis is a modeling tool that allows for incremental improvement.

Fig. 2. Effect of TTF-1, tumor grade, and clinicopathologic parameters on DFS by univariate and multivariate analyses: (A) Univariate analysis and (B) multivariate analysis of TTF-1 status (switch from negative to positive), tumor grade (switch from grade 1 to grade 2 and grade 2 to grade 3, respectively), sex (switch from female to male), UICC stage (switch from UICC stage 1 to UICC stage 2 and UICC stage 2 to UICC stage 3, respectively), smoking status, pack-years, lymphangioinvasion (switch from L0 to L1), hemangioinvasion (switch from V0 to V1), and age at diagnosis regarding DFS. Hazard ratios are log scaled for better readability.

Limitations
This retrospective study has some limitations. First, the relatively small number of TTF-1-negative cases might have mimicked the effects of grading in this population. Second, the setup of the Bayesian network followed the investigators' discretion and is thus prone to selection bias and may be missing confounders.

Conclusion
This study underscores the independent prognostic value of TTF-1 expression for tumor relapse and survival in LUAD regardless of tumor grade. Moreover, our analysis indicates that the prognostic power of tumor grading is limited to TTF-1-positive patients. Furthermore, our findings reveal that the effect size of TTF-1 surpasses that of tumor grading. Thus, TTF-1 may serve as a rapidly evaluable, practical, cost-effective discriminator for patient prognosis in LUAD, which, in contrast to the grading, is also usable after neoadjuvant therapy. To translate the findings into the pathology report, we recommend distinguishing between TTF-1-positive and TTF-1-negative LUADs in tumor coding. Tumor grading should only be applied to TTF-1-positive LUADs (TTF-1+/G1-3). TTF-1-negative LUADs should either not be graded or always be classified as high grade (TTF-1−/G3).

Fig. 1. Association of TTF-1 status and tumor grade with DFS: (A) The histogram shows the share of TTF-1 status per tumor grading. (B-D) DFS Kaplan-Meier curves showing a comparison of TTF-1 status for the different tumor grades and (E, F) vice versa. The numbers in the upper right corner indicate the median survival time in months if an event was observed in more than 50% of cases (otherwise = inf). Subfigure F shows that further differentiation of patients by tumor grade has no effect once it is known that a patient is TTF-1 negative. The reverse is not true (subfigures B-D), where TTF-1 status remains informative even if we initially stratify by tumor grade.
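Why a median survival time can be reported as inf follows directly from how the Kaplan-Meier curve is built. The sketch below is a minimal standard-library illustration with hypothetical toy data, not study data and not the authors' code: the median is taken as the first time at which the estimated survival drops to 0.5 or below, and it is infinite if the curve never gets there. Flipping the event indicator gives the reverse Kaplan-Meier estimate of follow-up time mentioned in the Statistics section. Function names are assumptions.

```python
import math


def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct time with at least one event."""
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects that share the same time.
        while i < len(data) and data[i][0] == time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((time, survival))
        at_risk -= n_leaving
    return curve


def median_time(curve):
    """First time with survival <= 0.5, or inf if the curve never reaches it."""
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return math.inf


# HYPOTHETICAL toy data: months of follow-up and event flag (1 = relapse or death, 0 = censored).
durations = [5, 8, 12, 20, 24, 30]
events = [1, 0, 1, 0, 0, 0]

median_dfs = median_time(kaplan_meier(durations, events))
# Reverse Kaplan-Meier: treat censoring as the event to estimate median follow-up.
median_follow_up = median_time(kaplan_meier(durations, [1 - e for e in events]))
print(median_dfs, median_follow_up)  # prints: inf 24
```

With only two events among six patients, the estimated survival never falls to 50%, so the median is undefined and shown as inf, as in the figure panels.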

Fig. 3. Evaluation of the true effect size of TTF-1 on DFS and comparison with grading by causal effect estimation: (A) Bayesian network modeling the conditional dependency among variables in a directed acyclic graph. The mutational profile, including TTF-1 status, is defined as exposure (input) and DFS as outcome. Green circles represent ancestors of exposure, light gray circles ancestors of outcome, red circles ancestors of exposure and outcome, yellow circles other variables, green arrows causal pathways, and red arrows biasing pathways. (B) Forest plot showing the computed effects of TTF-1 status and tumor grade on DFS. Computed survival curves showing the effects of (C) TTF-1 and (D) tumor grading.