Project Summary for Fiscal Year (FY) 2003 project 12-Jul-2002 to 11-Jul-2003


Technical Report Input Fields

Principal Investigator:

Firstname: George

Lastname: Church

Address 1: 200 Longwood Avenue

Address 2: Harvard Medical School

City: Boston

State: MA

Zip: 02115

Phone: 617-432-7562

Fax: 617-432-7663

Email:

Level Of Participation - Billed: 10%

Unbilled: 20%

Project URL: http://arep.med.harvard.edu/darpabiocomp/

Quad Chart: http://arep.med.harvard.edu/darpabiocomp/ChurchQ01.ppt

Objective:

"DNA computing" has so far focused on performing calculations with DNA, which is slow relative to silicon computing. This project, in contrast, explores the other aspects of computing (i.e., input/output, memory, and manufacturing processes). We do this using the only class of programmable nanometer-scale replicators (i.e., polymerase-ribosome-based). The major challenge is integration with silicon computing while maintaining the nano-size advantages. The motivations are unattended bio-monitoring; 3D memory arrays at a cubic nanometer per bit (currently 100 trillion); input from light, chemicals, and toxins; and output as nm-scale positioning.

Approach:

The project focuses on real-world input/output issues, including analog-to-digital (A/D) and digital-to-analog (D/A) conversion. For system input, we harvest a diverse set of biochemical and biophysical (photon) sensors. We also propose a novel fusion of DNA and RNA polymerases to decouple positioning from synthesis. For output, we use polymerases to position mechanical effectors and hence rapidly synthesize three-dimensionally complex patterns of DNA, protein, and/or electro-optical (EO) computer circuits. This should be compatible with the DNA-bit programming done in the system input.

<p>To improve the performance of the fabrication and memory tools, we will develop in vitro replication/translation arrays for experimental feedback. We will design a 90 kbp minigenome capable of replication and protein synthesis. This minigenome will be 6 times smaller than the smallest living cellular genomes and will display up to 800-fold faster replication, with 1000-fold fewer molecular components. These in vitro systems are ideal for integration with detailed computational models because of their simplicity, the known 3D structure of nearly all components, and their extreme experimental accessibility. Also, coupling the extremes of modeling (from single base changes to 3D structures to molecular networks to population doubling selection) is likely to be dramatically more transparent and tractable in such systems.

<p>Novel, useful applications and technology transfer: We will focus from the start on practical applications that take advantage of the unique features of DNA rather than competing head-on with EO. Examples are: (a) proven information archiving and retrieval (up to a billion years as mineralized fossils or living DNA records); (b) interfacing with biochemical, photon, or thermal sensors; (c) a DNA recorder, analogous to a black-box flight recorder, which would take early advantage of the fact that recording on DNA is currently easier than reading it, since the archived materials would be accessed only rarely; and (d) nanofabrication: polymerases take 0.34 nm steps under control of available dNTPs, and novel methods for separating the positioning from the incorporation of reactive bases will allow nanofabrication.
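<p>As a rough illustration of the archiving and positioning ideas in this section, the sketch below packs bytes into DNA bases at two bits per base and converts a base count into polymerase travel distance using the 0.34 nm step quoted above. The two-bit mapping (A=00, C=01, G=10, T=11) and all function names are assumptions made for illustration; they are not the project's actual encoding.

```python
# Hypothetical 2-bit-per-base mapping; chosen only for illustration.
BASE_FOR_BITS = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}

STEP_NM = 0.34  # polymerase step per incorporated base, as stated in the text


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte, high bits first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE_FOR_BITS[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(seq: str) -> bytes:
    """Invert encode(); the sequence length must be a multiple of four."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BITS_FOR_BASE[base]
        out.append(byte)
    return bytes(out)


def travel_nm(n_bases: int) -> float:
    """Distance a polymerase moves along the template after n_bases steps."""
    return n_bases * STEP_NM


if __name__ == "__main__":
    message = b"FY2003"
    strand = encode(message)
    print(strand, decode(strand) == message, travel_nm(len(strand)), "nm")
```

Under this mapping each byte occupies four bases, so a stored byte corresponds to about 1.36 nm of polymerase travel along the template.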

Recent Accomplishments:

We have an in vitro coupled replicating and translating system based on E. coli translation extracts. For this we have developed (1) a linear expression clone system compatible with the most powerful in vitro replication system (PCR) and (2) a modular method for computer gene design and automated gene synthesis, including affinity-tagging for all ribosomal proteins.

<p>We have synthesized all 22 of the 30S subunit RNAs and most of the proteins. The remaining proteins have revealed surprising properties of these most abundant cell proteins and suggest a strategy for overcoming the initial recalcitrance.

<p>In collaboration with Dr. Olejnik at Ambergen, we have tested a photocleavable fluorophore-base connection and are trying nitrobenzyl-3' blocking groups. In addition, Dr. Pirrung from Duke University has sent NPPOC-3'blocked-dTTP, which we (Jay Shendure and Greg Porreca) are testing. We have developed a design for modifying the DNA polymerase that we use to accommodate these bulky reversible 3' blocking groups. It is becoming evident that steric hindrance by these 3' blockers has thwarted other research groups' progress in the past, and that engineering the polymerase is an important and novel approach.

<p>We have developed Minimization of Metabolic Adjustment (MOMA) software for optimization of metabolic network utilization in mutant genotypes. We have tested it extensively using metabolic fluxes (from Uwe Sauer’s group) and a new high-throughput method for measuring growth rates of hundreds of mutants in parallel.

<p>We have automated MOMA and made SBML- and BioSpice-compatible versions of it, plus related web resources.
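<p>The idea behind MOMA can be sketched in a few lines. The sketch assumes the formulation in the Segre, Vitkup, and Church (2002) paper listed under the publications: the mutant flux vector is the one closest, in Euclidean distance, to the wild-type fluxes while still satisfying steady state and carrying zero flux through the knocked-out reaction. This is not the distributed Perl and C software. It leaves out the flux-bound inequalities that a full quadratic program would include, and the toy network, flux values, and function names are all hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [[float(x) for x in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("constraints are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def moma_sketch(stoich, wild_type, knockout):
    """Return the flux vector closest to wild_type with S v = 0 and v[knockout] = 0.

    Equality constraints only: v = w - A^T (A A^T)^-1 A w, where A stacks the
    stoichiometric matrix and one row that pins the knocked-out flux to zero.
    Flux bounds (e.g. irreversibility) are deliberately ignored in this sketch.
    """
    n = len(wild_type)
    ko_row = [1.0 if j == knockout else 0.0 for j in range(n)]
    a = [list(row) for row in stoich] + [ko_row]
    a_w = [sum(x * w for x, w in zip(row, wild_type)) for row in a]
    a_at = [[sum(x * y for x, y in zip(r1, r2)) for r2 in a] for r1 in a]
    lam = solve(a_at, a_w)
    return [w - sum(lam[i] * a[i][j] for i in range(len(a)))
            for j, w in enumerate(wild_type)]


if __name__ == "__main__":
    # Toy network: R0 produces metabolite M; R1 and R2 each consume it.
    stoich = [[1, -1, -1]]
    wild_type = [10.0, 8.0, 2.0]
    print(moma_sketch(stoich, wild_type, knockout=1))
```

In this toy case, knocking out the dominant consuming reaction reroutes flux through the remaining branch while reducing uptake, rather than holding uptake at its wild-type value. That is the qualitative behavior distance minimization is meant to capture.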

<p>We have developed methods for 3D and 4D modeling of bacterial cells and of the replication/translation of their circular chromosomes. In addition, we have 1D to 4D models of the expansion of an in vitro DNA colony.

<p>We have completed genome sequencing for Mycoplasma mobile and a "complete" proteome comparison of M. mobile and M. pneumoniae. These are proving crucial for integration and 4D-modeling efforts.

Current Plan:

<ul>

<li><b>Task 1: In the upcoming FY, we will synthesize arrays of DNAs and polymerases for use with in situ sequencing in gels, and chemically and photochemically cleavable links between fluorescent markers and dNTPs. This is critical to querying DNA memory systems in general as well as a variety of commercially established genomics applications.</b>

<br>We will extend our work on template-independent polymerases to synthesize stretches longer than 10 bp. Polony DNA fluorescent-base extension will be used for output. This would constitute proof of the key input and output methods.

</li>

<li><b>Task 2: We will assess ways to use synthetic genomes to program in vitro synthesis and assembly of small ribosomal subunits.</b>

<br> This will allow development and optimization of commercially useful protein expression and display systems.</li>

<li><b>Task 3: We will extend our 3D chromosome model efforts.</b>

<br> We will help develop shareable computer descriptions of spatio-temporal models, e.g. extensions of SBML & BioSpice.

<br> We will test experimental methods for site-specific DNA crosslinking or heavy-atom labeling.

</li>

</ul>

Technology Transition:

How the impact of this work is measured: Literature citations and milestones of licensees set by HMS OTL (Maryanne Fenerjian <>)

<p>Prototype available for dissemination: In situ fluorescent base extension. Purpose: Identity and quantitation based on single DNA molecules. Environment requirements: Research laboratory. Point of contact / email address: Jay Shendure <> http://arep.med.harvard.edu/Polonator/

<p>System available for dissemination: Minimum Perturbation Analysis. Purpose: Optimization of metabolic network utilization in engineered (or mutant) genomes. Environment requirements: Research computers supporting Perl & C. Point of contact / email address: Daniel Segre <>

http://arep.med.harvard.edu/moma/

The following publications have resulted in part from this DARPA BioSpice funding:

Zhu,J, Shendure,J, Mitra, RD, Church, GM (2003) Single Molecule Profiling of Alternative Pre-mRNA Splicing. Science in press.

Segre, D, Zucker, J, Katz, J, Lin, X, D'haeseleer, P, Rindone, W, Kharchenko, P, Nguyen, D, Wright, M, and Church, GM (2003) From annotated genomes to metabolic flux models and kinetic parameter fitting. Omics in press.

King, OD, Lee, JC, Dudley, AM, Janse, DM, Church, GM, Roth, FP (2003) Predicting Phenotype from Patterns of Annotation. ISMB 2003 in press.

Grad Y, Kim J, Aach J, Hayes G, Reinhart B, Church GM, Ruvkun G. (2003) Computational and experimental identification of C. elegans microRNAs by comparative genomics. Submitted to Molecular Cell Jan 2003.

Merritt, J, DiTonno, JR, Mitra, RD, Church, GM, Edwards, JS (2003) Functional characterization of mutant yeast PGK1 within the context of the whole cell. submitted to PNAS Feb 2003

Mitra,RD, Shendure,J, Olejnik,J, Olejnik,EK, and Church,GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem in press

Steffen, M, Jaffe, JD, & Church, GM (2003) Analysis of DNA-Binding Proteins by Mass Spectrometry. Submitted.

Mitra, RD, Butty, V, Shendure, J, Williams, BR, Housman, DE, and Church, GM (2003) Digital Genotyping and Haplotyping with Polymerase Colonies. PNAS in press

Jaffe JD, Berg, HC, Church GM (2003) Proteogenomic mapping reveals genomic structure and novel proteins undetected by computational algorithms. Submitted.

Lee M-LT, Bulyk ML, Whitmore GA, Church GM. (2003) A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays. Biometrics 58(4):981-8.

Segre, D, Vitkup, D, and Church, GM (2002) Analysis of optimality in natural and perturbed metabolic networks. PNAS 99: 15112-7

Douglas W. Selinger, Rini Mukherjee Saxena, Kevin J. Cheung, George M. Church, and Carsten Rosenow (2003) Global RNA half-life analysis in Escherichia coli reveals positional patterns of transcript degradation. Genome Research Feb;13(2):216-23.

Sudarsanam,P., Pilpel,Y, and Church, G.M. (2002) Genome-wide co-occurrence of promoter elements reveals a cis-regulatory cassette of rRNA transcription motifs in S. cerevisiae . Genome Research 12: 1723-1731.

Cheung, KJ, Badarinarayana, V, Selinger, D, Janse, D, and Church, GM (2003) A microarray-based antibiotic screen identifies a regulatory role for supercoiling in the osmotic stress response of Escherichia coli. Genome Research Feb;13(2):206-15.

Steffen, M, Petti, A, D'haeseleer, P, Aach,J, and Church, GM (2002) Computational Identification of Signal Transduction Networks. Bioinformatics 3:23.

Wright, M and Church,GM (2002) An Open-source Oligonucleotide Microarray Probe Standard for Human and Mouse. Nature Biotechnology 20(11):1082-3.

Shendure, J & Church, GM (2002) Computational discovery of sense-antisense transcription in the mouse and human genomes Genome Biology 3:1-14.

Schilling, CH, Covert, MW, Famili, I, Church, GM, Edwards, JS, Palsson, BO (2002) Genome-scale metabolic model of Helicobacter pylori 26695. J Bacteriol. 184(16):4582-93.

Dudley, AM, Aach, J, Steffen, MA, and Church, GM (2002) Measuring absolute expression with microarrays using a calibrated reference sample and an extended signal intensity range. Proc. Nat. Acad. Sci. USA 99:7554-7559.
