Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, multicollinearity remains unaddressed in the serum cytokine, chemokine, and high-throughput platform datasets we use to phenotype WTC-disease. To address this concern, we used automated, machine-learning-based pruning of high-dimensional data and then validated the identified biomarkers. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1, %Pred < lower limit of normal (LLN); n = 100) and controls (n = 127), all of whom had biomarkers assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics, and feature selection was then performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping parent cohort via binary logistic regression with 5-fold cross-validation. Random forests of metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) identified the top 5% of biomarkers important to class separation, which included pigment epithelium-derived factor (PEDF), macrophage-derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Models validated via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1, %Pred) binary logistic regression had an AUCROC of 0.90 (0.84–0.96). Decreased PEDF and MIP-4 and increased Apo-AII were associated with increased odds of WTC-LI. Increased GRO and MCP-1, with simultaneously decreased MDC, were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort.
One biomarker, PEDF (an antiangiogenic agent), is a novel, predictive biomarker of particulate-matter-related lung disease. Other biomarkers (GRO, MCP-1, MDC, and MIP-4) reveal immune cell involvement in WTC-LI pathogenesis. The findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.

Author summary

Disease related to air pollution causes millions of deaths annually. Large swathes of the general population, as well as members of certain occupations such as first responders and military personnel, are exposed to particulate matter (PM), a major component of air pollution. Our longitudinal cohort of FDNY firefighters exposed to the World Trade Center dust cloud on 9/11 offers a unique research opportunity to characterize the impact of a single, intense PM exposure by comparing pre- and post-exposure phenotypes; however, PM-related lung disease and PM's systemic effects are complex and call for a systems biology approach, coupled with novel computational modelling techniques, to fully understand pathogenesis. In the present study, we integrate clinical and environmental biomarkers with the serum metabolome, cytokines, and chemokines to develop a model for early disease detection and for identification of potential signaling cascades in PM-related chronic lung disease.
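The abstract describes keeping the top 5% of features by random-forest importance, validating with 5-fold cross-validation, and reporting an AUCROC. The following Python sketch illustrates only that bookkeeping and is not the authors' code: it fits neither a random forest nor a logistic regression. The function names, the use of a ceiling when rounding the 5% cutoff, the stratification of folds by case status, and the randomly generated importance scores are all assumptions made for illustration.

```python
import math
import random


def top_fraction(importances, fraction=0.05):
    """Return the feature names ranked in the top `fraction` by importance.

    Rounding up with ceil is an assumption; the source only says "top 5%".
    """
    k = max(1, math.ceil(len(importances) * fraction))
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds, spreading each class evenly.

    Stratifying by case/control status is an assumption; the source
    only states that 5-fold cross-validation was used.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def auc_roc(labels, scores):
    """AUC as the fraction of case/control pairs in which the case
    receives the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical importance scores for 580 + 5 + 106 = 691 candidate features.
demo_rng = random.Random(1)
importances = {f"feature_{i}": demo_rng.random() for i in range(691)}
selected = top_fraction(importances)

# Hypothetical label vector: 1 = case, 0 = control.
labels = [1] * 100 + [0] * 127
folds = stratified_folds(labels)
```

In a full analysis, the importance scores would come from a fitted random forest and the scores passed to `auc_roc` would be the held-out predicted probabilities from each fold's logistic regression model.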