Sci Data. 2015 Oct 27;2:150058. doi: 10.1038/sdata.2015.58. eCollection 2015.

Population genomic datasets describing the post-vaccine evolutionary epidemiology of Streptococcus pneumoniae.

Croucher NJ, Finkelstein JA, Pelton SI, Parkhill J, Bentley SD, Lipsitch M, Hanage WP.

Streptococcus pneumoniae is a common nasopharyngeal commensal bacterium and an important human pathogen. Vaccines against a subset of pneumococcal antigenic diversity have reduced rates of disease, without changing the frequency of asymptomatic carriage, by altering the bacterial population structure. These changes can be studied in detail by using genome sequencing to characterise systematically sampled collections of carried S. pneumoniae. This dataset consists of 616 annotated draft genomes of isolates collected from children during routine visits to primary care physicians in Massachusetts between 2001, shortly after the seven-valent polysaccharide conjugate vaccine was introduced, and 2007. Also made available are a core genome alignment and phylogeny describing the overall population structure, clusters of orthologous protein sequences, software for inferring serotype from Illumina reads, and whole genome alignments for the analysis of closely related sets of pneumococci. These data can be used to study both bacterial evolution and the epidemiology of a pathogen population under selection from vaccine-induced immunity.


Bacterial genetics; Genetic variation; Molecular evolution; Respiratory tract diseases

PMID: 26528397