Supplemental Information to:
Metagenomic and metaproteomic analyses of Accumulibacter phosphatis enriched floccular and granular biofilm
Jeremy J. Barr1,2,3*, Bas E. Dutilh4,5,6, Connor T. Skennerton2,7,10, Toshikazu Fukushima2,8, Marcus L. Hastie9, Jeffrey J. Gorman9, Gene W. Tyson2,7, Philip L. Bond2,3
1 San Diego State University, Department of Biology, San Diego, CA 92182, USA
2 The University of Queensland, Advanced Water Management Centre (AWMC), QLD 4072, Australia
3 Environmental Biotechnology Cooperative Research Centre (EBCRC), Sydney, Australia
4 Theoretical Biology and Bioinformatics, Utrecht University, Padualaan 8, 3584 CH, Utrecht, The Netherlands
5 Centre for Molecular and Biomedical Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Geert Grooteplein 28, 6525 GA, Nijmegen, The Netherlands
6 Department of Marine Biology, Institute of Biology, Federal University of Rio de Janeiro, Brazil
7 The University of Queensland, Australian Centre for Ecogenomics, School of Chemistry and Molecular Bioscience, QLD 4072, Australia
8 Division of Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8563, Japan
9 Protein Discovery Centre, Queensland Institute of Medical Research (QIMR) Berghofer Medical Research Institute, Herston, QLD 4006, Australia
10 Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA 91125, USA
*Corresponding author.
Jeremy J. Barr
Department of Biology,
San Diego State University,
San Diego, CA 92182, USA
Supplemental Information – Methods
Metagenome assembly and genome binning
The raw sequencing data were imported into CLC Genomics Workbench 5 (www.clcbio.com), trimmed using a quality threshold of 0.05, and assembled using the CLC de novo assembler with default settings. Metagenome assembly resulted in 117,544 and 152,528 contigs greater than 200 bp for the Floc and Gran samples, respectively. Contigs were identified as putatively originating from Accumulibacter if they had BLAST alignment lengths greater than 2 kbp to Candidatus Accumulibacter phosphatis (NCBI id: NC_013194). These putative contigs were further binned based on their coverage in the datasets. Raw metagenomic sequencing reads were aligned to contigs using BWA 0.6.2 (Li & Durbin, 2009), and average coverage was determined by a custom script (https://github.com/ctSkennerton/scriptShed/blob/master/average_coverage.rb). Contigs that formed cohesive bands of coverage were segregated to remove erroneous BLAST matches and to define individual Accumulibacter draft genomes. Contigs greater than 5 kbp that were not defined as Accumulibacter were binned using tetranucleotide frequencies (Dick et al., 2009). Briefly, tetranucleotide frequencies were calculated using a window size of 5 kbp (http://rubygems.org/gems/bio-kmer_counter), and frequencies were z-normalized using Databionics ESOM (http://databionic-esom.sourceforge.net/). ESOM classifications were performed using the methods of Dick et al. (2009) to generate genome bins. Community composition of the raw metagenomic data was estimated using Kraken 0.10.4 (Wood & Salzberg, 2014) with the NCBI NT (non-redundant nucleotide) database.
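The windowed tetranucleotide calculation described above can be illustrated with a minimal Python sketch. This is not the bio-kmer_counter gem or the ESOM software used in the study. The function names are hypothetical, all 256 tetranucleotides are counted without merging reverse complements, and only full 5 kbp windows are kept. All of these are assumptions made for illustration.

```python
from itertools import product
from statistics import mean, pstdev

# All 256 tetranucleotides in a fixed order (reverse complements NOT merged;
# that simplification is an assumption of this sketch).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def window_frequencies(seq, window=5000, k=4):
    """Return one relative-frequency vector per full window of `seq`."""
    seq = seq.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        counts = [0] * len(KMERS)
        for i in range(len(chunk) - k + 1):
            idx = INDEX.get(chunk[i:i + k])  # k-mers with N etc. are skipped
            if idx is not None:
                counts[idx] += 1
        total = sum(counts)
        rows.append([c / total if total else 0.0 for c in counts])
    return rows


def z_normalize(rows):
    """Z-normalize each tetranucleotide column across all windows."""
    normalized_cols = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        normalized_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(r) for r in zip(*normalized_cols)]
```

Each contig of at least 5 kbp would contribute one row per window, and the z-normalized rows would then be the input to a clustering step such as ESOM.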
Protein extraction and purification
Both total cellular and extracellular proteins were extracted, using the B-PER extraction method and cation exchange resin (CER), respectively. For total cellular extracts, sludge was resuspended at 10 gVSS/L in 50 mM Tris-HCl at pH 7, and 1 mL of sample was transferred to a micro-centrifuge tube followed by centrifugation at 15,000 g for 15 min at 4 °C. Supernatant was discarded and the sample pellet was resuspended in 1 mL of B-PER extraction buffer (77 mg of dithiothreitol (DTT), 1 tablet of Complete Mini protease inhibitor, 10 mL of B-PER reagent). Samples were frozen at -80 °C for 1 hr and thawed on ice; this freeze-thaw step was repeated for 3 cycles. Samples were centrifuged at 15,000 g for 15 min at 4 °C to remove cell debris and supernatant was stored. For extracellular extractions, samples were resuspended at 10 gVSS/L in 30 mL of 2 mM Na3PO4, 4 mM NaH2PO4, 9 mM NaCl, 1 mM KCl, pH 7.0 at 4 °C. Resuspended sample was then added to 21 g of Dowex-Marathon CER (Sigma-Aldrich, St. Louis, MO, USA), 231 mg DTT and 3 tablets of Complete Mini within an extraction vessel that was stirred at 700 rpm for 5 hr at 4 °C. Following extraction, 2 mL of sample was transferred to a micro-centrifuge tube followed by centrifugation at 15,000 g for 15 min at 4 °C to remove cell debris and supernatant was stored.
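As a quick arithmetic check, the DTT masses quoted above can be converted to approximate molar concentrations. The sketch below assumes a DTT molar mass of 154.25 g/mol and treats the liquid volumes as exactly 10 mL and 30 mL, ignoring any volume contributed by the resin or tablets. The helper name is hypothetical.

```python
DTT_MOLAR_MASS = 154.25  # g/mol, numerically equal to mg/mmol


def millimolar(mass_mg, volume_ml, molar_mass=DTT_MOLAR_MASS):
    """Concentration in mM for `mass_mg` of solute in `volume_ml` of liquid."""
    mmol = mass_mg / molar_mass
    return mmol / (volume_ml / 1000.0)


bper_dtt = millimolar(77, 10)   # B-PER extraction buffer
cer_dtt = millimolar(231, 30)   # CER extraction (assumes 30 mL liquid volume)
print(f"{bper_dtt:.1f} mM, {cer_dtt:.1f} mM")  # prints "49.9 mM, 49.9 mM"
```

Under these assumptions both extractions work out to roughly 50 mM DTT.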
Proteomic analysis
For proteomic analysis, 25 µg of TCA-precipitated protein was resuspended with 7 M urea, 2 M thiourea, 4% CHAPS, 2% SDS and 50 mM Tris-HCl at pH 8.8, reduced with 20 mM DTT, and alkylated with 50 mM iodoacetamide. Samples were then co-precipitated with 1 µg modified sequencing grade trypsin (Roche, USA), with the addition of 9 volumes 100% methanol at -20 °C for 16 hr. Samples were centrifuged at 16,000 g at 4 °C for 5 min and the protein pellet washed with 1 ml 90% methanol at -20 °C followed by centrifugation. The protein pellet was resuspended in 50 µl of 40 mM ammonium bicarbonate, 10% acetonitrile and incubated at 37 °C for 2 hr. A further 0.6 µg of trypsin was added and incubated for another 4 hr at 37 °C. The sample was dried down to < 10 µl (but not to dryness) in a vacuum centrifuge and then made up to 25 µl with 1% formic acid. Tryptic digests were subjected to CapHPLC-MS/MS analysis using an UltiMate 3000 HPLC system (Dionex) interfaced with a linear ion trap (LTQ)-Orbitrap XL hybrid mass spectrometer (Thermo Fisher Scientific, Bremen, Germany), equipped with a dynamic nanoelectrospray source (Proxeon, Odense, Denmark), controlled by Xcalibur 2.0 SR1 (Thermo Electron, Bremen, Germany) with DCMS link chromatography interface (Dionex). Protein digests (5 µl) were injected onto a 300 Å, 300 µm × 5 mm C18 trap column (Dionex Acclaim PepMap™ precolumn) at 20 µl/min in 98% solvent A (0.1% (v/v) aqueous formic acid) and 2% solvent B (80% (v/v) ACN/20% (v/v) H2O containing 0.1% (v/v) formic acid) for 5 min at 40 °C and subsequently back-flushed onto a pre-equilibrated analytical column (Vydac Everest C18, 150 mm × 150 µm) at a flow rate of 1 µl/min. Peptides were eluted at 40 °C with a gradient to 45% B over 75 min, to 75% B over 15 min, and to 100% B over 5 min, after which the column was held at 100% B for 10 min. Column eluant was introduced through uncoated silica emitters (30 µm, New Objective, NJ, USA) at a typical spray voltage of 1.5 kV; no sheath, sweep, or auxiliary gases were used.
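The solvent gradient described above can be written as a piecewise-linear schedule. The following Python sketch returns the percentage of solvent B at a given time after the start of the analytical gradient. It assumes the gradient starts at the 2% B loading composition and that each segment is linear. Neither assumption is stated explicitly in the method, and the function name is hypothetical.

```python
# (segment duration in min, %B at end of segment), taken from the text above
GRADIENT = [(75.0, 45.0), (15.0, 75.0), (5.0, 100.0), (10.0, 100.0)]
START_PERCENT_B = 2.0  # assumption: gradient begins at the loading composition


def percent_b(t_min):
    """Percent solvent B at `t_min` minutes into the analytical gradient."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    level = START_PERCENT_B
    elapsed = 0.0
    for duration, target in GRADIENT:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return level + fraction * (target - level)  # linear interpolation
        elapsed += duration
        level = target
    return level  # after the final hold the composition stays at the last level


for t in (0, 37.5, 75, 90, 95, 105):
    print(t, percent_b(t))
```

The four segments sum to a 105 min program, excluding the 5 min trap-loading step.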
The heated capillary temperature was set to 200 °C. External mass calibration was performed with the ESI-LTQ-Orbitrap Tune mix solution in the positive ion mode for the LTQ and Orbitrap. In each data-dependent acquisition cycle, one FTMS survey scan was acquired in the Orbitrap mass analyser at 60,000 resolution, 5 × 10⁵ automatic gain control (AGC) target value and 500 ms maximum ion accumulation time. This was followed by eight data-dependent MS/MS scans (CID) on the most abundant ions using the linear ion trap (LTQ) with an AGC target value of 1 × 10⁴ and 100 ms maximum accumulation time. A minimum signal of 1000, an ion isolation width of 3 Da, and 30 ms activation at 35% normalized collision energy were used for isolation and fragmentation of precursor ions in the LTQ. Dynamic exclusion parameters were as follows: repeat count, 1; repeat duration, 30 s; exclusion duration, 35 s; exclusion list size, 500.
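The precursor selection logic implied by these settings (top eight ions, minimum signal of 1000, 35 s exclusion, list size of 500) can be sketched as follows. This is a simplified illustration, not the instrument's control software. The m/z matching tolerance `MZ_TOL` is a hypothetical value that the method does not specify, and because the repeat count is 1, the sketch excludes an ion immediately after its first selection and does not model the repeat duration.

```python
from collections import OrderedDict

TOP_N = 8           # MS/MS scans per cycle
MIN_SIGNAL = 1000.0
EXCLUSION_S = 35.0  # exclusion duration in seconds
LIST_SIZE = 500     # maximum entries on the exclusion list
MZ_TOL = 0.01       # hypothetical matching tolerance (not given in the method)


def select_precursors(survey, now_s, excluded):
    """Pick up to TOP_N precursor m/z values from one survey scan.

    survey:   list of (mz, intensity) peaks
    now_s:    acquisition time of the survey scan in seconds
    excluded: OrderedDict mapping excluded m/z -> expiry time (updated in place)
    """
    # Drop exclusion entries whose duration has elapsed.
    for mz in [m for m, expiry in excluded.items() if expiry <= now_s]:
        del excluded[mz]

    chosen = []
    for mz, intensity in sorted(survey, key=lambda p: p[1], reverse=True):
        if len(chosen) == TOP_N or intensity < MIN_SIGNAL:
            break  # cycle is full, or remaining peaks are below threshold
        if any(abs(mz - ex) <= MZ_TOL for ex in excluded):
            continue  # still dynamically excluded
        chosen.append(mz)
        excluded[mz] = now_s + EXCLUSION_S
        while len(excluded) > LIST_SIZE:
            excluded.popitem(last=False)  # evict the oldest entry
    return chosen
```

Calling `select_precursors` on successive survey scans with a shared `OrderedDict` shows how lower-abundance ions are reached once the most abundant ones are temporarily excluded.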
Supplemental Information – Figures
SI Figure 1. Floccular and granular reactor performance measurements over the 161-day period showing phosphate (Pi) removal percentage, mixed liquor suspended solids (MLSS), and approximate enrichment of Accumulibacter as a percentage of the total bacterial community.
SI Figure 2. Contig coverage, GC content, and length from the metagenomic assemblies of the Floc and Gran reactors. Contigs binned as Accumulibacter are colored blue; contigs from the flanking community are colored red.
SI Figure 3. Principal component analysis of the protein abundances collected from the 20 metaproteomes from the Floc, Trans, and Gran biofilm states.
SI Figure 4. Heat map reporting the reciprocal correlations between protein abundances collected over the 161-day period from the Floc, Trans, and Gran biofilm states.
Supplemental Information – Tables
SI Table 1. Complete list of 2,022 unique proteins identified showing fold change of proteins in Floc versus Gran; metagenome identifier number; Taxonomy; Taxonomy identification method (LCA: lowest common ancestor, GB: genome bins); IMG identifier; protein annotation; COG category; protein sequence; Floc protein abundance; Gran protein abundance.
SI Table 2. List of 101 proteins identified in Floc with an alpha cutoff value of 0.05, showing fold change of proteins in Floc versus Gran; metagenome identifier number; Taxonomy; Taxonomy identification method (LCA: lowest common ancestor, GB: genome bins); IMG identifier; protein annotation; COG category; protein sequence; Floc protein abundance; Gran protein abundance.
SI Table 3. List of 101 proteins identified in Gran with an alpha cutoff value of 0.05, showing fold change of proteins in Floc versus Gran; metagenome identifier number; Taxonomy; Taxonomy identification method (LCA: lowest common ancestor, GB: genome bins); IMG identifier; protein annotation; COG category; protein sequence; Floc protein abundance; Gran protein abundance.
SI Table 4. List of proteins that were continually increasing or decreasing in abundance throughout the floccular-to-granular sludge transition period within the Gran reactor. Proteins were identified using a Spearman’s rank correlation coefficient analysis of normalized spectral counts against the time points collected during the transition period. The table shows increasing or decreasing Spearman rank correlation; metagenome identifier number; IMG identifier; protein annotation; COG category; protein sequence.
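The Spearman analysis behind SI Table 4 can be illustrated with a small self-contained Python sketch that ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. The function names and the example values are hypothetical and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation coefficient of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: sampling days and one protein's normalized counts.
days = [0, 20, 40, 60, 80]
counts = [1.0, 2.5, 2.4, 5.0, 9.0]
print(round(spearman(days, counts), 3))  # prints 0.9
```

A coefficient near +1 would mark a protein as steadily increasing over the transition and a coefficient near -1 as steadily decreasing. The cutoff used to call a trend is not given here, so none is assumed.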
Supplemental Information – References
Barr JJ, Cook AE, Bond PL. (2010). Granule formation mechanisms within an aerobic wastewater system for phosphorus removal. Appl Environ Microbiol 76:7588–97.
Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC, Yelton AP, et al. (2009). Community-wide analysis of microbial genome sequence signatures. Genome Biol 10:R85.
Li H, Durbin R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–60.
Wood DE, Salzberg SL. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol 15:R46.