Summary: Identification of proteins bound to RNA.

Title: Identification of proteins bound to RNA

International Centre of Genetic Engineering and Biotechnology (ICGEB), Trieste, Italy.

Emanuele Buratti*

*address correspondence to: Emanuele Buratti, Padriciano 99, 34149 Trieste, Italy, Phone: +39-040-3757337, Fax: +39-040-3757361, E-mail: .

1. Abstract

It is now well accepted that defects in pre-mRNA processing pathways are a major cause of human disease. Up to now, most of these mutations have been detected in the relatively conserved basic splicing elements such as the donor, acceptor, and branch site regions, where they affect the binding of well-known splicing determinants such as UsnRNP factors. Increasingly, however, splicing mutations are being described within intronic and exonic regions of the pre-mRNA molecule far from canonical splicing signals, where they disrupt binding of accessory splicing regulatory proteins. As these proteins play crucial roles in determining alternative and constitutive splicing levels, establishing their identity becomes essential to differentiate between potentially harmful mutations and harmless polymorphisms. Moreover, it allows us to better understand pathological mechanisms and, eventually, to plan specific therapeutic strategies. In this chapter, we aim to provide a brief practical guide to identifying these proteins using an affinity purification procedure that is easy to set up. In this technique, any RNA sequence of interest can be used to derivatize agarose beads that are then incubated with protein mixes/cellular extracts to search for potentially interacting factors.

Keywords: RNA, RNA binding proteins, affinity purification.

2. Theoretical Background

RNA binding proteins (RBPs) regulate all aspects of post-transcriptional gene expression by affecting the biogenesis, stability, function, transport, and localization of all cellular RNAs produced in the eukaryotic nucleus, as recently reviewed by Glisovic et al. [1] and described in chapter 3, Allain. In general, these associations between RBPs and cellular RNAs are known as ribonucleoprotein complexes (RNPs). These complexes are formed by stacking, electrostatic, and hydrogen bonding interactions between regions of the various RBPs and selected nucleotides of an RNA molecule. In the case of proteins, the regions responsible for the direct interaction are often arranged in evolutionarily conserved motifs that provide a specific three-dimensional conformation. To date, several major types of RNA binding structures have been described to mediate RNA-protein recognition: double stranded RNA binding motif (dsRBM), Pumilio homology domain (PUF) and RGG repeats, Zinc-binding domains, KH domains, and RNA Recognition Motif (RRM) domains. The molecular mechanisms that make some of these domains particularly suitable to bind specific RNA sequences/structures, and the way they differ from each other, have been the subject of numerous structural studies. These have recently been reviewed in a number of publications, and the reader is referred to them for additional details [2,3], as well as to chapter 3, Allain.

The flexibility of RNA target sequences and the large number of different RBPs present in vertebrates mean that very probably every RNA present in the eukaryotic nucleus will be complexed with a variety of proteins in a more or less specific fashion. Besides the primary nucleotide sequence, a major modifier of RNA-protein binding properties is the possible presence of RNA secondary structures [4]. For example, it has recently been shown that proteins such as MBNL1 and U2AF65 can selectively compete for binding to the same RNA region depending on the presence of mutually exclusive RNA structures [5].

With regard to RBPs involved in splicing regulation, a place of honor should be reserved for the well-known class of hnRNP factors, which are among the most abundant RNA-binding proteins in the human nucleus and are responsible for forming the core of most ribonucleoprotein complexes described up to now [6-8]. Another class of RBPs that is important for splicing regulation (although not exclusively) is the serine-arginine (SR) class of factors, which in many cases work antagonistically to hnRNP proteins in regulating splicing processes [9-11]. It is the combinatorial presence of all these factors, often binding very near to each other on a very narrow stretch of RNA sequence (such as a typical exon or selected regions therein), that determines the final functional outcome [12,13] (see chapter 4, Hertel).

In this respect, it should be noted that unravelling complex RNA-protein compositions is important not just to understand how pre-mRNA or mRNA molecules are processed. It is now clear, in fact, that in order to function properly all the small RNA families that have been discovered in recent years (small nuclear RNAs, small nucleolar RNAs, microRNAs, siRNAs and shRNAs), including the ever-increasing number of regulatory noncoding RNAs (ncRNAs), assemble as ribonucleoprotein complexes, whose composition is almost certain to regulate several aspects of their expression pathway and functional properties [14] (see chapter 2, Meister).

Last but not least, mention should also be made of the increasing role played by RNA binding proteins in the pathological mechanisms mediated by pathogenic RNAs that result from the expansion of repeats in noncoding and coding regions [15].

Taken together, it is clear that methodologies that unravel the composition of RNP complexes represent an essential tool in today's research. Paradoxically, some of the more useful methodologies are based on classical biochemical techniques that had practically gone out of fashion towards the end of the 1980s. This unexpected revival has been made possible by the recent advances in mass-spec analysis [16]. These novel techniques now make it possible to obtain the protein composition of RNP complexes with a speed, resolution, accuracy, and economic cost that make their use an "affordable" approach for many labs. The affinity purification technique we describe in this chapter has been used successfully for the identification of splicing regulatory factors in various NF-1 donor sites [17], repeat sequences [18,19], splicing regulatory regions [20-22], and pseudoexon sequences [23]. The reader is also referred to these publications, which provide further details on its potential applications and results.

3. Protocol

3.1 RNA templates.

1) RNA templates can be obtained in the following ways:

a) use commercial suppliers to synthesize them. This is generally the best course of action for sequences less than 50 nucleotides in length.

b) clone the sequence of interest in a pBluescript KS+ plasmid (Stratagene) or any other plasmid that contains a T7 promoter. In this case, it is advisable to make sure that the 5' end of the sequence to be transcribed is placed as near as possible to the end of the T7 promoter and that a suitable restriction enzyme site to linearize the plasmid is present at its 3' end. This is to minimize the length of plasmid-related RNA that will eventually be transcribed together with the sequence of interest.

c) amplify the sequence of interest using a forward primer carrying a T7 polymerase target sequence at the 5' end and 12-15 complementary nucleotides at the 3' end (5'-taatacgactcactatagg(n)12-15-3') and a reverse primer carrying 12-15 nt of the target sequence.

2) Products from steps (b) and (c) should be purified by phenol/chloroform extraction, precipitated using standard protocols (1/10 vol 3 M NaAc, 2.5 vol ethanol, -20 ºC for 1 hr), and resuspended in RNase-free water to a concentration of approximately 1 μg/μl.

3) Approximately 2 μg of linearized plasmids/amplified products are transcribed using T7 RNA Polymerase (Stratagene) in the presence of transcription buffer (350 mM HEPES pH=7.5, 30 mM MgCl2, 2 mM spermidine, and 40 mM dithiothreitol), 40 units of RNasin, 7 mM each of the four NTPs, and 60 units of T7 polymerase (1.5 units/μl).

4) In general, one should perform three 40μl reactions for each RNA of interest, placing in each 1.5 ml Eppendorf tube 2μg of linearized plasmids/amplified products.

5) Following incubation for 2 h at 37 °C, the reactions are pooled, purified by one cycle of acid phenol/chloroform extraction, precipitated according to standard protocols, and resuspended in 40 µl RNase-free water. Usually, this approach yields the desired 15 μg of transcribed RNA for the following steps (see Section 3.2). It is strongly recommended to check the yield and integrity of the transcripts on a standard agarose gel.
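The primer layout described in step 1c can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the target sequence in the usage example is invented, and the reverse primer is taken to be the reverse complement of the last 12-15 nt of the target, which is our reading of "carrying 12-15 nt of the target sequence".

```python
# T7 promoter sequence as given in step 1c of the protocol.
T7_PROMOTER = "taatacgactcactatagg"

_COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (a/c/g/t only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.lower()))


def design_t7_primers(target, anneal_len=15):
    """Build a forward/reverse primer pair for a T7 transcription template.

    target     -- sense-strand DNA sequence to be transcribed (5'->3')
    anneal_len -- number of target nucleotides in each primer (12-15)
    """
    target = target.lower()
    if not 12 <= anneal_len <= 15:
        raise ValueError("anneal_len should be between 12 and 15 nt")
    if len(target) < anneal_len:
        raise ValueError("target is shorter than the annealing region")
    # Forward primer: T7 promoter followed by the first nt of the target.
    forward = T7_PROMOTER + target[:anneal_len]
    # Reverse primer (assumption): reverse complement of the target 3' end.
    reverse = reverse_complement(target[-anneal_len:])
    return forward, reverse


# Usage with an invented 26-nt target sequence.
fwd, rev = design_t7_primers("acgtacgtacgtaaaacccgggtttt", anneal_len=12)
print(fwd)
print(rev)
```

Melting temperatures and secondary structures are not checked here and should be verified with the primer design tool normally used in the lab.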

3.2 Loading the beads with RNA.

1) 500 pmoles of T7-transcribed RNA (approx. 15 µg of a 100mer RNA), previously dissolved in 40 µl of water, are placed in a 1.5 ml Eppendorf test tube.

2) To each sample, add 360 µl of a 5 mM Sodium m-Periodate (Sigma, #S11448) solution in 0.1 M NaOAc pH=5.0 (to prepare 50 ml of this reagent, dissolve 53 mg of Sodium m-Periodate in 0.1 M NaOAc pH=5.0). This reagent has to be prepared fresh each time.

3) This 400 µl reaction mix is incubated for 1 hour in the dark (each test tube wrapped in aluminium foil) at room temperature on a rotator wheel.

4) Each RNA is then ethanol precipitated according to standard protocols, washed once with 70% EtOH, and resuspended in 100 µl of 0.1 M NaOAc pH=5.0. Be careful not to lose the very small pellet!

5) In the meantime, take 100 µl of adipic acid dihydrazide agarose bead 50% slurry (Sigma, #A0802) for each RNA sample to be conjugated and place them in a 15 ml Falcon tube. Wash the beads four times with 10 ml of 0.1 M NaOAc pH=5.0, each time spinning down at 3000 rpm for 5 minutes in a clinical centrifuge (we use a 581R, Eppendorf).

6) After the final wash, resuspend the bead pellet from step 5 at the bottom of the tube using 300 µl of 0.1 M NaOAc pH=5.0 for each RNA sample prepared in step 4.

7) After mixing well, take separate 300 µl aliquots and add them to the 100 µl of periodate-treated RNA from step 4.

8) Incubate the resulting 400 µl mix overnight in the dark at 4 °C on a rotator (each test tube wrapped in aluminium foil).
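The quantities quoted in steps 1 and 2 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the average mass of 330 g/mol per RNA nucleotide is a rough approximation that we assume here, and 213.89 g/mol is the molecular weight of sodium metaperiodate (NaIO4).

```python
# Approximate average mass of one RNA nucleotide in a chain (assumption).
AVG_NT_MASS = 330.0   # g/mol
# Molecular weight of sodium metaperiodate, NaIO4.
NAIO4_MW = 213.89     # g/mol


def rna_mass_ug(pmol, length_nt, nt_mass=AVG_NT_MASS):
    """Mass in micrograms of `pmol` picomoles of an RNA of `length_nt` nt."""
    grams = pmol * 1e-12 * length_nt * nt_mass
    return grams * 1e6


def solute_mass_mg(conc_mM, volume_ml, mol_weight):
    """Mass in milligrams needed for `volume_ml` of a `conc_mM` solution."""
    micromoles = conc_mM * volume_ml          # mM x ml = micromoles
    return micromoles * mol_weight / 1000.0   # micrograms -> milligrams


# 500 pmol of a 100mer: about 16.5 ug, in line with the "approx. 15 ug" above.
print(round(rna_mass_ug(500, 100), 1))
# 50 ml of 5 mM periodate: about 53.5 mg, in line with the 53 mg above.
print(round(solute_mass_mg(5, 50, NAIO4_MW), 1))
```

For RNAs of a different length, the same function gives the mass corresponding to 500 pmol; for instance, a 50mer would require roughly half the mass of a 100mer.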

3.3 Incubation with protein mix (Buffer A).

1) Pull down the beads incubated overnight by spinning at 4000 rpm for 5 min in a bench top Eppendorf minifuge (from now on we always use a Centrifuge 5415 D, Eppendorf). RNA loaded beads will often tend to cling to the side of the test tube; shake them off by gently tapping the Eppendorf tube on the side of the rack until they collect at the bottom of the tube.

2) Discard the supernatant and wash the RNA loaded beads twice with 2 M NaCl. Then, spin down at 4000 rpm for 5 min in an Eppendorf minifuge.

3) Wash the beads three times with 1.0 ml of Sol.D 1X (20 mM Hepes pH=7.9, 100 mM KCl, 0.2 mM EDTA pH=8.0, 100 mM DTT, 6% v/v Glycerol), spinning down at 4000 rpm for 5 min and discarding supernatant each time.

4) During the last spin down described in step 3, prepare the following 500 µl mix for each RNA sample to be tested:

a) 50 µl Sol.D 10X (200 mM Hepes pH=7.9, 2 mM EDTA pH=8.0, 1 M DTT, 60% v/v Glycerol).

b) 50 µl KCl 1 M (add separately).

c) 100 µl NE (approx. 10-15 µg/µl) or any other protein mix of interest.

d) 300 µl H2O.

e) Heparin (200 µg/µl stock) to desired final concentration (0.5-2.5-5.0 µg/µl of the final volume).

5) Add 500 µl Nuclear Extract/Protein mix to the individual Eppendorf tubes and mix gently by manually shaking the tube.

6) Incubate on a rotator for 30 min at RT.

7) Spin down the beads at 4000 rpm with an Eppendorf minifuge and remove as much protein mix as possible.

8) Wash the beads 4 times with 1.5 ml of Sol.D 1X by incubating them each time for 5 min on a rotating wheel at room temperature, each time spinning them down with an Eppendorf minifuge at 4000 rpm to remove the supernatant.

9) add 50 l of SDS loading buffer, denature, and load sample on a SDS-PAGE gel (for loading, it is recommended to use a glass Hamilton syringe in order to avoid loading the beads in the well).

3.4 Incubation with protein mix (Buffer B).

1) Pull down the beads incubated overnight by spinning at 4000 rpm for 5 min in an Eppendorf minifuge. RNA loaded beads will sometimes tend to cling to the side of the test tube; to shake them off, gently tap the Eppendorf tube on the side of the rack.

2) Discard the supernatant and wash the RNA loaded beads twice with 2 M NaCl. Then, spin down at 4000 rpm for 5 min in an Eppendorf minifuge.

3) Wash the beads three times with 1.0 ml of Buffer B (5 mM HEPES pH= 7.9, 1 mM MgCl2, 0.8 mM Magnesium acetate), spinning down at 4000 rpm for 5 min and discarding supernatant each time.

4) During the last spin down described in step 3, prepare the following 500 µl mix for each RNA sample to be tested:

a) 50 µl Binding Buffer 10X (50 mM Hepes pH=7.9, 10 mM MgCl2, 8 mM Mg Acetate, 5.2 mM DTT, 7.5 mM GTP, 10 mM ATP, and 38% v/v Glycerol).

b) 100 µl NE (approx. 10-15 µg/µl) or any other protein mix of interest.

c) 350 µl H2O.

d) Heparin (200 µg/µl stock) to desired final concentration (0.5-2.5-5.0 µg/µl of the final volume).

5) Add 500 µl Nuclear Extract/Protein mix to the individual Eppendorf tubes and mix gently by manually shaking the tube.

6) Incubate on a rotator for 30 min at RT.

7) Spin down the beads at 4000 rpm with an Eppendorf minifuge and remove as much protein mix as possible.

8) Wash the beads 4 times with 1.5 ml of Buffer B (5 mM HEPES pH= 7.9, 1 mM MgCl2, 0.8 mM Magnesium acetate) by incubating them each time for 5 min on a rotating wheel at room temperature, each time spinning them down with an Eppendorf minifuge at 4000 rpm to remove the supernatant.

9) add 50 l of SDS loading buffer, denature, and load sample on a SDS-PAGE gel (for loading, it is recommended to use a glass Hamilton syringe in order to avoid loading the beads in the well).

4. Example of an experiment.

In previous studies from our lab, it has been reported that binding of a U1snRNP molecule to an intronic splicing processing element (ISPE) in intron 20 of the ATM gene was able to inhibit a pathological pseudoexon inclusion. Inactivation of this element through a 4-nucleotide deletion (GTAA) caused loss of this binding, pseudoexon inclusion, and occurrence of ataxia telangiectasia in a patient [24].

Using synthetic RNAs carrying either the wild type or mutated RNA sequence (Fig.1A), this loss in U1snRNP binding activity, which was originally demonstrated through band-shift analysis (Fig.1B), can also be easily observed using our pulldown affinity technique (Fig.1C). To identify the proteins, the bands of interest were excised from the Coomassie Blue-stained SDS-PAGE gel and subjected to internal sequence analysis using an electrospray ionization mass spectrometer (LCQ DECA XP, ThermoFinnigan). The bands were digested with trypsin and the resulting peptides were extracted with water and 60% acetonitrile/1% trifluoroacetic acid. The fragments were then analyzed by mass spectrometry, and the proteins were identified by analysis of the peptide MS/MS data with Turbo SEQUEST (ThermoFinnigan) and MASCOT (Matrix Science).

This example shows why it is usually better to use a related RNA sequence as a control. In fact, the low/medium amount of background present in the two lanes can even be considered a *useful* feature, as it allows specific differences to stand out more sharply and can also act as a loading/pulldown control. Of course, if the aim were to characterize all RNA-protein interactions of a specific RNA sequence rather than binding differences, a better approach would be to use naked beads or beads loaded with a completely unrelated RNA (usually the antisense strand of the intended target).

5. Troubleshooting.

Problem / Reason+Solution
Protein binding signals too strong or background in beads too high. / • Increase the Heparin concentration added to the protein mix.
• Shorten the RNA targets bound to the beads (ideal length is normally less than 200 nt).
• Make sure the protein mix added to the beads is NOT cloudy. If, after Heparin addition, the solution does not clear up to near transparency, it is advisable to centrifuge briefly (approx. 5 min at 4000 rpm in a tabletop Eppendorf minifuge) and discard any pellet.
• Control (i.e. empty) beads have a tendency to absorb high molecular weight proteins (>100 kDa).
Signal from proteins bound to the beads too weak. / • Failure of synthesized/synthetic RNAs to bind to the beads. Use fresh reagents. If the problem persists, binding reactions to beads can be followed using a radioactively labelled RNA on a small experimental scale.
• Decrease the Heparin concentration added to the protein mix.
• Increase the protein extract concentration added to the protein mix.
• Use Binding Buffer B (section 3.4). This buffer tends to yield more protein signals than Sol.D (warning: it will also raise background binding levels, especially with empty beads if these are used as control).
Small or no differences detected in band intensities between different samples. / • Increase the size of the RNA sequence analyzed (RNAs longer than 1000 nt can be used).
• Compare RNA sequences that display clear functional differences (i.e. gross deletion mutants etc.).
• Pre-incubate the protein mix with semi-specific RNA competitors (in addition to Heparin).
• Use a mass-spec compatible silver stain procedure to stain SDS-PAGE gels.

Figure Legends.

Figure 1- Example of RNA pulldown experiment using synthetic RNA oligos

Fig.1A shows the synthetic RNAs that carry either the wild-type (ATM WT) or the deleted sequence (ATM Δ). Fig.1B shows a band-shift experiment using labelled ATM WT RNA (lane 1) incubated in the presence of nuclear extract (lane 2), nuclear extract plus an antibody specific against the U1snRNP U1A protein (lane 3), or nuclear extract plus a control antibody (lane 4). Samples were run on a 6% non-denaturing PAGE gel. Fig.1C shows a pulldown analysis using the ATM WT (lane 1) and ATM Δ (lane 2) RNAs bound to the adipic acid dihydrazide beads following incubation with commercial HeLa nuclear extract. Following addition of SDS loading buffer, the bead-derived proteins were separated in a 12% denaturing SDS-PAGE gel and stained with Coomassie Blue according to standard protocols. The bands indicated by arrows refer to the several U1snRNP components that bind differentially to these two RNAs (U170K, U1A, and SmRNP proteins B and B') as determined by mass-spec analysis.