Background: Non-invasive phenotyping of chronic respiratory diseases would be highly beneficial in the personalised medicine of the future. Volatile organic compounds can be measured in the exhaled breath and may be produced or altered by disease processes. We investigated whether distinct patterns of these compounds were present in chronic obstructive pulmonary disease (COPD) and clinically relevant disease phenotypes.

Methods: Breath samples from 39 COPD subjects and 32 healthy controls were collected and analysed using gas chromatography time-of-flight mass spectrometry. Subjects with COPD also underwent sputum induction. Discriminatory compounds were identified by univariate logistic regression followed by multivariate analysis: (1) principal component analysis; (2) multivariate logistic regression; (3) receiver operating characteristic (ROC) analysis.

Results: Comparing COPD versus healthy controls, principal component analysis clustered the 20 best-discriminating compounds into four components explaining 71% of the variance. Multivariate logistic regression constructed an optimised model using two components with an accuracy of 69%. The model had 85% sensitivity, 50% specificity and ROC area under the curve of 0.74. Analysis of COPD subgroups showed the method could classify COPD subjects with far greater accuracy. Models were constructed which classified subjects with ≥2% sputum eosinophilia with ROC area under the curve of 0.94 and those having frequent exacerbations 0.95. Potential biomarkers correlated to clinical variables were identified in each subgroup.

Conclusion: The exhaled breath volatile organic compound profile discriminated between COPD and healthy controls and identified clinically relevant COPD subgroups. If these findings are validated in prospective cohorts, they may have diagnostic and management value in this disease.

© 2012 Basanta et al.; licensee BioMed Central Ltd.
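
The reported accuracy, sensitivity and specificity for the COPD versus healthy control model can be related to one another with a short calculation. The sketch below is illustrative only: it assumes the model was evaluated on all 39 COPD subjects and 32 controls (the abstract does not state the evaluation set), and the function name `accuracy_from_rates` is hypothetical, not part of the study's analysis.

```python
def accuracy_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy implied by sensitivity, specificity and group sizes."""
    # Expected number of correctly classified cases (true positives).
    true_pos = sensitivity * n_cases
    # Expected number of correctly classified controls (true negatives).
    true_neg = specificity * n_controls
    return (true_pos + true_neg) / (n_cases + n_controls)


# Figures taken from the abstract; evaluating on all 71 subjects is an assumption.
acc = accuracy_from_rates(sensitivity=0.85, specificity=0.50,
                          n_cases=39, n_controls=32)
print(f"implied accuracy: {acc:.2f}")
```

Under that assumption the implied accuracy rounds to 0.69, which agrees with the 69% reported for the two-component model.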