
Riboswitches, methods for their use, and compositions for use with riboswitches   



Abstract: It has been discovered that certain natural mRNAs serve as metabolite-sensitive genetic switches wherein the RNA directly binds a small organic molecule. This binding process changes the conformation of the mRNA, which causes a change in gene expression by a variety of different mechanisms. Modified versions of these natural “riboswitches” (created by using various nucleic acid engineering strategies) can be employed as designer genetic switches that are controlled by specific effector compounds. Such effector compounds that activate a riboswitch are referred to herein as trigger molecules. The natural switches are targets for antibiotics and other small molecule therapies. In addition, the architecture of riboswitches allows actual pieces of the natural switches to be used to construct new non-immunogenic genetic control elements, for example the aptamer (molecular recognition) domain can be swapped with other non-natural aptamers (or otherwise modified) such that the new recognition domain causes genetic modulation with user-defined effector compounds. The changed switches become part of a therapy regimen—turning on, or off, or regulating protein synthesis. Newly constructed genetic regulation networks can be applied in such areas as living biosensors, metabolic engineering of organisms, and in advanced forms of gene therapy treatments. ...

Agent: Yale University
Inventors: Ronald R. Breaker, Ali Nahvi, Narasimhan Sudarsan, Margaret S. Ebert, Wade Winkler, Jeffrey E. Barrick, John K. Wickiser
USPTO Application #: 20110151471 - Class: 435 613 (USPTO) - 06/23/11 - Class 435
Related Terms: Antibiotics   Designer   Domain   Engineering   Gene Expression   Gene Therapy   Protein   


The Patent Description & Claims data below is from USPTO Patent Application 20110151471, Riboswitches, methods for their use, and compositions for use with riboswitches.


CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Divisional Application of U.S. application Ser. No. 12/492,866, filed Jun. 26, 2009, which is a Divisional Application of U.S. application Ser. No. 10/669,162, filed Sep. 22, 2003, which claims benefit of U.S. Provisional Application No. 60/412,468, filed Sep. 20, 2002. U.S. application Ser. No. 12/492,866, filed Jun. 26, 2009, U.S. application Ser. No. 10/669,162, filed Sep. 22, 2003, and U.S. Provisional Application No. 60/412,468, filed Sep. 20, 2002, are hereby incorporated herein by reference in their entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grants NIH GM48858 and NIH GM559343 awarded by the National Institutes of Health, and Grant NSF EIA-0129939 awarded by the National Science Foundation. The government has certain rights in the invention.

REFERENCE TO SEQUENCE LISTING

The Sequence Listing submitted Feb. 23, 2011 as a text file named “YU_6_8406_AMD_AFD_Sequence_Listing.txt,” created on Feb. 17, 2011, and having a size of 234,978 bytes is hereby incorporated by reference pursuant to 37 C.F.R. §1.52(e)(5).

FIELD OF THE INVENTION

The disclosed invention is generally in the field of gene expression and specifically in the area of regulation of gene expression.

BACKGROUND OF THE INVENTION

Precision genetic control is an essential feature of living systems, as cells must respond to a multitude of biochemical signals and environmental cues by varying genetic expression patterns. Most known mechanisms of genetic control involve the use of protein factors that sense chemical or physical stimuli and then modulate gene expression by selectively interacting with the relevant DNA or messenger RNA sequence. Proteins can adopt complex shapes and carry out a variety of functions that permit living systems to sense accurately their chemical and physical environments. Protein factors that respond to metabolites typically act by binding DNA to modulate transcription initiation (e.g. the lac repressor protein; Matthews, K. S., and Nichols, J. C., 1998, Prog. Nucleic Acids Res. Mol. Biol. 58, 127-164) or by binding RNA to control either transcription termination (e.g. the PyrR protein; Switzer, R. L., et al., 1999, Prog. Nucleic Acids Res. Mol. Biol. 62, 329-367) or translation (e.g. the TRAP protein; Babitzke, P., and Gollnick, P., 2001, J. Bacteriol. 183, 5795-5802). Protein factors respond to environmental stimuli by various mechanisms such as allosteric modulation or post-translational modification, and are adept at exploiting these mechanisms to serve as highly responsive genetic switches (e.g. see Ptashne, M., and Gann, A. (2002). Genes and Signals. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

In addition to the widespread participation of protein factors in genetic control, it is also known that RNA can take an active role in genetic regulation. Recent studies have begun to reveal the substantial role that small non-coding RNAs play in selectively targeting mRNAs for destruction, which results in down-regulation of gene expression (e.g. see Hannon, G. J. 2002, Nature 418, 244-251 and references therein). This process of RNA interference takes advantage of the ability of short RNAs to recognize the intended mRNA target selectively via Watson-Crick base complementation, after which the bound mRNAs are destroyed by the action of proteins. RNAs are ideal agents for molecular recognition in this system because it is far easier to generate new target-specific RNA factors through evolutionary processes than it would be to generate protein factors with novel but highly specific RNA binding sites.

Although proteins fulfill most requirements that biology has for enzyme, receptor and structural functions, RNA also can serve in these capacities. For example, RNA has sufficient structural plasticity to form numerous ribozyme domains (Cech & Golden, Building a catalytic active site using only RNA. In: The RNA World R. F. Gesteland, T. R. Cech, J. F. Atkins, eds., pp. 321-350 (1998); Breaker, In vitro selection of catalytic polynucleotides. Chem. Rev. 97, 371-390 (1997)) and receptor domains (Osborne & Ellington, Nucleic acid selection and the challenge of combinatorial chemistry. Chem. Rev. 97, 349-370 (1997); Hermann & Patel, Adaptive recognition by nucleic acid aptamers. Science 287, 820-825 (2000)) that exhibit considerable enzymatic power and precise molecular recognition. Furthermore, these activities can be combined to create allosteric ribozymes (Soukup & Breaker, Engineering precision RNA molecular switches. Proc. Natl. Acad. Sci. USA 96, 3584-3589 (1999); Seetharaman et al., Immobilized riboswitches for the analysis of complex chemical and biological mixtures. Nature Biotechnol. 19, 336-341 (2001)) that are selectively modulated by effector molecules.

These properties of RNA are consistent with speculation (Gold et al., From oligonucleotide shapes to genomic SELEX: novel biological regulatory loops. Proc. Natl. Acad. Sci. USA 94, 59-64 (1997); Gold et al., SELEX and the evolution of genomes. Curr. Opin. Gen. Dev. 7, 848-851 (1997); Nou & Kadner, Adenosylcobalamin inhibits ribosome binding to btuB RNA. Proc. Natl. Acad. Sci. USA 97, 7190-7195 (2000); Gelfand et al., A conserved RNA structure element involved in the regulation of bacterial riboflavin synthesis genes. Trends Gen. 15, 439-442 (1999); Miranda-Rios et al., A conserved RNA structure (thi box) is involved in regulation of thiamin biosynthetic gene expression in bacteria. Proc. Natl. Acad. Sci. USA 98, 9736-9741 (2001); Stormo & Ji, Do mRNAs act as direct sensors of small molecules to control their expression? Proc. Natl. Acad. Sci. USA 98, 9465-9467 (2001)) that certain mRNAs might employ allosteric mechanisms to provide genetic regulatory responses to the presence of specific metabolites. Although a thiamine pyrophosphate (TPP)-dependent sensor/regulatory protein had been proposed to participate in the control of thiamine biosynthetic genes (Webb & Downs, Characterization of thiL, encoding thiamin-monophosphate kinase, in Salmonella typhimurium. J. Biol. Chem. 272, 15702-15707 (1997)), no such protein factor has been shown to exist.

Transcription of the lysC gene of B. subtilis is repressed by high concentrations of lysine (Kochhar, S., and Paulus, H. 1996, Microbiol. 142:1635-1639; Mäder, U., et al., 2002, J. Bacteriol. 184:4288-4295; Patte, J. C. 1996. Biosynthesis of lysine and threonine. In: Escherichia coli and Salmonella: Cellular and Molecular Biology, F. C. Neidhardt, et al., eds., Vol. 1, pp. 528-541. ASM Press, Washington, D.C.; Patte, J.-C., et al., 1998, FEMS Microbiol. Lett. 169:165-170), but no protein factor had been identified that served as the genetic regulator (Liao, H.-H., and Hseu, T.-H. 1998, FEMS Microbiol. Lett. 168:31-36). The lysC gene encodes aspartokinase II, which catalyzes the first step in the metabolic pathway that converts L-aspartic acid into L-lysine (Belitsky, B. R. 2002. Biosynthesis of amino acids of the glutamate and aspartate families, alanine, and polyamines. In: Bacillus subtilis and its Closest Relatives: from Genes to Cells. A. L. Sonenshein, J. A. Hoch, and R. Losick, eds., ASM Press, Washington, D.C.).

BRIEF SUMMARY OF THE INVENTION

It has been discovered that certain natural mRNAs serve as metabolite-sensitive genetic switches wherein the RNA directly binds a small organic molecule. This binding process changes the conformation of the mRNA, which causes a change in gene expression by a variety of different mechanisms. Modified versions of these natural “riboswitches” (created by using various nucleic acid engineering strategies) can be employed as designer genetic switches that are controlled by specific effector compounds. Such effector compounds that activate a riboswitch are referred to herein as trigger molecules. The natural switches are targets for antibiotics and other small molecule therapies. In addition, the architecture of riboswitches allows actual pieces of the natural switches to be used to construct new non-immunogenic genetic control elements, for example the aptamer (molecular recognition) domain can be swapped with other non-natural aptamers (or otherwise modified) such that the new recognition domain causes genetic modulation with user-defined effector compounds. The changed switches become part of a therapy regimen—turning on, or off, or regulating protein synthesis. Newly constructed genetic regulation networks can be applied in such areas as living biosensors, metabolic engineering of organisms, and in advanced forms of gene therapy treatments.

Disclosed are isolated and recombinant riboswitches, recombinant constructs containing such riboswitches, heterologous sequences operably linked to such riboswitches, and cells and transgenic organisms harboring such riboswitches, riboswitch recombinant constructs, and riboswitches operably linked to heterologous sequences. The heterologous sequences can be, for example, sequences encoding proteins or peptides of interest, including reporter proteins or peptides. Preferred riboswitches are, or are derived from, naturally occurring riboswitches.

Also disclosed are chimeric riboswitches containing heterologous aptamer domains and expression platform domains. That is, chimeric riboswitches are made up of an aptamer domain from one source and an expression platform domain from another source. The heterologous sources can be from, for example, different specific riboswitches or different classes of riboswitches. The heterologous aptamers can also come from non-riboswitch aptamers. The heterologous expression platform domains can also come from non-riboswitch sources.

Also disclosed are compositions and methods for selecting and identifying compounds that can activate, deactivate or block a riboswitch. Activation of a riboswitch refers to the change in state of the riboswitch upon binding of a trigger molecule. A riboswitch can be activated by compounds other than the trigger molecule and in ways other than binding of a trigger molecule. The term trigger molecule is used herein to refer to molecules and compounds that can activate a riboswitch. This includes the natural or normal trigger molecule for the riboswitch and other compounds that can activate the riboswitch. Natural or normal trigger molecules are the trigger molecule for a given riboswitch in nature or, in the case of some non-natural riboswitches, the trigger molecule for which the riboswitch was designed or with which the riboswitch was selected (as in, for example, in vitro selection or in vitro evolution techniques). Trigger molecules other than the natural or normal trigger molecule can be referred to as non-natural trigger molecules.

Deactivation of a riboswitch refers to the change in state of the riboswitch when the trigger molecule is not bound. A riboswitch can be deactivated by binding of compounds other than the trigger molecule and in ways other than removal of the trigger molecule. Blocking of a riboswitch refers to a condition or state of the riboswitch where the presence of the trigger molecule does not activate the riboswitch.

Also disclosed are compounds, and compositions containing such compounds, that can activate, deactivate or block a riboswitch. Also disclosed are compositions and methods for activating, deactivating or blocking a riboswitch. Riboswitches function to control gene expression through the binding or removal of a trigger molecule. Compounds can be used to activate, deactivate or block a riboswitch. The trigger molecule for a riboswitch (as well as other activating compounds) can be used to activate a riboswitch. Compounds other than the trigger molecule generally can be used to deactivate or block a riboswitch. Riboswitches can also be deactivated by, for example, removing trigger molecules from the presence of the riboswitch. A riboswitch can be blocked by, for example, binding of an analog of the trigger molecule that does not activate the riboswitch.

Also disclosed are compositions and methods for altering expression of an RNA molecule, or of a gene encoding an RNA molecule, where the RNA molecule includes a riboswitch, by bringing a compound into contact with the RNA molecule. Riboswitches function to control gene expression through the binding or removal of a trigger molecule. Thus, subjecting an RNA molecule of interest that includes a riboswitch to conditions that activate, deactivate or block the riboswitch can be used to alter expression of the RNA. Expression can be altered as a result of, for example, termination of transcription or blocking of ribosome binding to the RNA. Binding of a trigger molecule can, depending on the nature of the riboswitch, reduce or prevent expression of the RNA molecule or promote or increase expression of the RNA molecule.

Also disclosed are compositions and methods for regulating expression of an RNA molecule, or of a gene encoding an RNA molecule, by operably linking a riboswitch to the RNA molecule. A riboswitch can be operably linked to an RNA molecule in any suitable manner, including, for example, by physically joining the riboswitch to the RNA molecule or by engineering nucleic acid encoding the RNA molecule to include and encode the riboswitch such that the RNA produced from the engineered nucleic acid has the riboswitch operably linked to the RNA molecule. Subjecting a riboswitch operably linked to an RNA molecule of interest to conditions that activate, deactivate or block the riboswitch can be used to alter expression of the RNA.

Also disclosed are compositions and methods for regulating expression of a naturally occurring gene or RNA that contains a riboswitch by activating, deactivating or blocking the riboswitch. If the gene is essential for survival of a cell or organism that harbors it, activating, deactivating or blocking the riboswitch can result in death, stasis or debilitation of the cell or organism. For example, activating a naturally occurring riboswitch in a naturally occurring gene that is essential to survival of a microorganism can result in death of the microorganism (if activation of the riboswitch turns off or represses expression). This is one basis for the use of the disclosed compounds and methods for antimicrobial and antibiotic effects.

Also disclosed are compositions and methods for regulating expression of an isolated, engineered or recombinant gene or RNA that contains a riboswitch by activating, deactivating or blocking the riboswitch. The gene or RNA can be engineered or can be recombinant in any manner. For example, the riboswitch and coding region of the RNA can be heterologous, the riboswitch can be recombinant or chimeric, or both. If the gene encodes a desired expression product, activating or deactivating the riboswitch can be used to induce expression of the gene and thus result in production of the expression product. If the gene encodes an inducer or repressor of gene expression or of another cellular process, activation, deactivation or blocking of the riboswitch can result in induction, repression, or de-repression of other, regulated genes or cellular processes. Many such secondary regulatory effects are known and can be adapted for use with riboswitches. An advantage of riboswitches as the primary control for such regulation is that riboswitch trigger molecules can be small, non-antigenic molecules.

Also disclosed are compositions and methods for altering the regulation of a riboswitch by operably linking an aptamer domain to the expression platform domain of the riboswitch (which is a chimeric riboswitch). The aptamer domain can then mediate regulation of the riboswitch through the action of, for example, a trigger molecule for the aptamer domain. Aptamer domains can be operably linked to expression platform domains of riboswitches in any suitable manner, including, for example, by replacing the normal or natural aptamer domain of the riboswitch with the new aptamer domain. Generally, any compound or condition that can activate, deactivate or block the riboswitch from which the aptamer domain is derived can be used to activate, deactivate or block the chimeric riboswitch.

Also disclosed are compositions and methods for inactivating a riboswitch by covalently altering the riboswitch (by, for example, crosslinking parts of the riboswitch or coupling a compound to the riboswitch). Inactivation of a riboswitch in this manner can result from, for example, an alteration that prevents the trigger molecule for the riboswitch from binding, that prevents the change in state of the riboswitch upon binding of the trigger molecule, or that prevents the expression platform domain of the riboswitch from affecting expression upon binding of the trigger molecule.

Also disclosed are methods of identifying compounds that activate, deactivate or block a riboswitch. For example, compounds that activate a riboswitch can be identified by bringing into contact a test compound and a riboswitch and assessing activation of the riboswitch. If the riboswitch is activated, the test compound is identified as a compound that activates the riboswitch. Activation of a riboswitch can be assessed in any suitable manner. For example, the riboswitch can be linked to a reporter RNA and expression, expression level, or change in expression level of the reporter RNA can be measured in the presence and absence of the test compound. As another example, the riboswitch can include a conformation dependent label, the signal from which changes depending on the activation state of the riboswitch. Such a riboswitch preferably uses an aptamer domain from or derived from a naturally occurring riboswitch. As can be seen, assessment of activation of a riboswitch can be performed with the use of a control assay or measurement or without the use of a control assay or measurement. Methods for identifying compounds that deactivate a riboswitch can be performed in analogous ways.

Identification of compounds that block a riboswitch can be accomplished in any suitable manner. For example, an assay can be performed for assessing activation or deactivation of a riboswitch in the presence of a compound known to activate or deactivate the riboswitch and in the presence of a test compound. If activation or deactivation is not observed as would be observed in the absence of the test compound, then the test compound is identified as a compound that blocks activation or deactivation of the riboswitch.
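The decision logic of the activation and blocking assays described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name, the four reporter readings it takes, and the 2-fold cutoff are hypothetical and are not part of the disclosure. The direction of a genuine response (increase or decrease of reporter signal) is taken from the compound already known to activate the riboswitch.

```python
import math


def classify_test_compound(baseline, with_trigger, with_test, with_both,
                           min_fold=2.0):
    """Classify a test compound from four reporter readings (all > 0).

    baseline     : reporter signal with no compound added
    with_trigger : signal with a compound known to activate the riboswitch
    with_test    : signal with the test compound alone
    with_both    : signal with the known activator plus the test compound
    min_fold     : hypothetical cutoff for calling a response (assumption)
    """
    cutoff = math.log2(min_fold)
    trigger_response = math.log2(with_trigger / baseline)
    if abs(trigger_response) < cutoff:
        # The known activator gave no clear response, so nothing can be said.
        return "inconclusive"
    # Orient every response so that "like the known activator" is positive.
    direction = 1 if trigger_response > 0 else -1
    test_response = direction * math.log2(with_test / baseline)
    both_response = direction * math.log2(with_both / baseline)
    if test_response >= cutoff:
        return "activates"
    if both_response < cutoff:
        # Activation expected from the known activator was not observed.
        return "blocks"
    return "no effect"


# Hypothetical readings for a riboswitch whose activation represses the reporter.
print(classify_test_compound(100.0, 10.0, 95.0, 90.0))
```

A compound that neither activates on its own nor prevents the known activator from working is reported as having no effect; an analogous function could be written for deactivation.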

Also disclosed are biosensor riboswitches. Biosensor riboswitches are engineered riboswitches that produce a detectable signal in the presence of their cognate trigger molecule. Useful biosensor riboswitches can be triggered at or above threshold levels of the trigger molecules. Biosensor riboswitches can be designed for use in vivo or in vitro. For example, biosensor riboswitches operably linked to a reporter RNA that encodes a protein that serves as or is involved in producing a signal can be used in vivo by engineering a cell or organism to harbor a nucleic acid construct encoding the riboswitch/reporter RNA. An example of a biosensor riboswitch for use in vitro is a riboswitch that includes a conformation dependent label, the signal from which changes depending on the activation state of the riboswitch. Such a biosensor riboswitch preferably uses an aptamer domain from or derived from a naturally occurring riboswitch. Also disclosed are methods of detecting compounds using biosensor riboswitches. The method can include bringing into contact a test sample and a biosensor riboswitch and assessing the activation of the biosensor riboswitch. Activation of the biosensor riboswitch indicates the presence of the trigger molecule for the biosensor riboswitch in the test sample.

Also disclosed are compounds made by identifying a compound that activates, deactivates or blocks a riboswitch and manufacturing the identified compound. This can be accomplished by, for example, combining compound identification methods as disclosed elsewhere herein with methods for manufacturing the identified compounds. For example, compounds can be made by bringing into contact a test compound and a riboswitch, assessing activation of the riboswitch, and, if the riboswitch is activated by the test compound, manufacturing the test compound that activates the riboswitch as the compound.

Also disclosed are compounds made by checking activation, deactivation or blocking of a riboswitch by a compound and manufacturing the checked compound. This can be accomplished by, for example, combining compound activation, deactivation or blocking assessment methods as disclosed elsewhere herein with methods for manufacturing the checked compounds. For example, compounds can be made by bringing into contact a test compound and a riboswitch, assessing activation of the riboswitch, and, if the riboswitch is activated by the test compound, manufacturing the test compound that activates the riboswitch as the compound. Checking compounds for their ability to activate, deactivate or block a riboswitch refers to both identification of compounds previously unknown to activate, deactivate or block a riboswitch and to assessing the ability of a compound to activate, deactivate or block a riboswitch where the compound was already known to activate, deactivate or block the riboswitch.

Also disclosed are methods for selecting, designing or deriving new riboswitches and/or new aptamers that recognize new trigger molecules. Such methods can involve production of a set of aptamer variants in a riboswitch, assessing the activation of the variant riboswitches in the presence of a compound of interest, selecting variant riboswitches that were activated (or, for example, the riboswitches that were the most highly or the most selectively activated), and repeating these steps until a variant riboswitch of a desired activity, specificity, combination of activity and specificity, or other combination of properties results. Also disclosed are riboswitches and aptamer domains produced by these methods.
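The iterative scheme just described (produce variants, assess activation, select, repeat) can be sketched in Python as a simple selection loop. This is a toy model under stated assumptions, not the disclosed laboratory procedure: `score` stands in for an experimental measurement of activation by the compound of interest, and the names `mutate`, `select_variants` and `toy_score`, the pool sizes, and the mutation rate are all hypothetical.

```python
import random

BASES = "ACGU"


def mutate(seq, rate, rng):
    """Return a copy of seq in which each position is redrawn with probability rate."""
    return "".join(rng.choice(BASES) if rng.random() < rate else base
                   for base in seq)


def select_variants(parent, score, target, pool_size=50, keep=5,
                    rate=0.05, max_rounds=100, seed=0):
    """Repeat variant production and selection until a variant reaches target.

    score  : callable giving an activation score for a sequence (assumption:
             in practice this would be an assay readout, not a function)
    target : score at which the search stops
    Returns the best variant found and the number of rounds performed.
    """
    rng = random.Random(seed)
    survivors = [parent]
    for round_no in range(1, max_rounds + 1):
        # Step 1: produce a set of variants from the current survivors.
        variants = [mutate(rng.choice(survivors), rate, rng)
                    for _ in range(pool_size)]
        # Steps 2 and 3: assess activation and keep the most activated.
        pool = sorted(survivors + variants, key=score, reverse=True)
        survivors = pool[:keep]
        if score(survivors[0]) >= target:
            return survivors[0], round_no
    return survivors[0], max_rounds


def toy_score(seq, motif="GGAUCC"):
    """Hypothetical stand-in for an assay: fraction of positions matching motif."""
    return sum(a == b for a, b in zip(seq, motif)) / len(motif)


best, rounds = select_variants("AAAAAA", toy_score, target=1.0)
print(best, rounds, toy_score(best))
```

Because the previous survivors are carried into each new pool, the best score never decreases from one round to the next. Selecting for specificity as well as activity would require a score that also penalizes activation by unwanted compounds.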

The disclosed riboswitches, including the derivatives and recombinant forms thereof, generally can be from any source, including naturally occurring riboswitches and riboswitches designed de novo. Any such riboswitches can be used in or with the disclosed methods. However, different types of riboswitches can be defined and some such sub-types can be useful in or with particular methods (generally as described elsewhere herein). Types of riboswitches include, for example, naturally occurring riboswitches, derivatives and modified forms of naturally occurring riboswitches, chimeric riboswitches, and recombinant riboswitches. A naturally occurring riboswitch is a riboswitch having the sequence of a riboswitch as found in nature. Such a naturally occurring riboswitch can be an isolated or recombinant form of the naturally occurring riboswitch as it occurs in nature. That is, the riboswitch has the same primary structure but has been isolated or engineered in a new genetic or nucleic acid context. Chimeric riboswitches can be made up of, for example, part of a riboswitch of any or of a particular class or type of riboswitch and part of a different riboswitch of the same or of any different class or type of riboswitch; or part of a riboswitch of any or of a particular class or type of riboswitch and any non-riboswitch sequence or component. Recombinant riboswitches are riboswitches that have been isolated or engineered in a new genetic or nucleic acid context.

Different classes of riboswitches refer to riboswitches that have the same or similar trigger molecules or riboswitches that have the same or similar overall structure (predicted, determined, or a combination). Riboswitches of the same class generally have, but need not have, both the same or similar trigger molecules and the same or similar overall structure.

Additional advantages of the disclosed method and compositions will be set forth in part in the description which follows, and in part will be understood from the description, or can be learned by practice of the disclosed method and compositions. The advantages of the disclosed method and compositions will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments of the disclosed method and compositions and together with the description, serve to explain the principles of the disclosed method and compositions.

FIGS. 1A and 1B show metabolite-dependent conformational changes in the 202-nucleotide leader sequence of the btuB mRNA. FIG. 1A shows separation of spontaneous RNA-cleavage products of the btuB leader using denaturing 10% polyacrylamide gel electrophoresis (PAGE). 5′-32P-labeled mRNA leader molecules (arrow) were incubated for 41 hr at 25° C. in 20 mM MgCl2, 50 mM Tris-HCl (pH 8.3 at 25° C.) in the presence (+) or absence (−) of 20 μM of AdoCbl. Lanes containing RNAs that have undergone no reaction, partial digest with alkali, and partial digest with RNase T1 (G-specific cleavage) are identified by NR, −OH, and T1, respectively. The location of product bands corresponding to cleavage after selected guanosine residues are identified by filled arrowheads. Arrowheads labeled 1 through 8 identify eight of the nine locations that exhibit effector-induced structure modulation, which experience an increase or decrease in the rate of spontaneous RNA cleavage. The image was generated using a phosphorimager (Molecular Dynamics), and cleavage yields were quantitated by using ImageQuant software. FIG. 1B shows the sequence and secondary-structure model for the 202-nucleotide leader sequence of btuB mRNA (SEQ ID NO:1) in the presence of AdoCbl. Putative base-paired elements are designated P1 through P9. Complementary nucleotides in the loops of P4 and P9 that have the potential to form a pseudoknot are juxtaposed. Nine specific sites of structure modulation are identified by arrowheads. The asterisks demark the boundaries of the B12 box (nucleotides 141-162). The coding region and the 38 nucleotides that reside immediately 5′ of the start codon (nucleotides 241-243) were not included in the 202-nucleotide fragment. The 315-nucleotide fragment includes the 202-nucleotide fragment, the remaining 38 nucleotides of the leader sequence, and the first 75 nucleotides of the coding region.

FIGS. 2A and 2B show that the btuB mRNA leader forms a saturable binding site for AdoCbl. FIG. 2A shows the dependence of spontaneous cleavage of btuB mRNA leader on the concentration of AdoCbl effector as represented by site 1 (G23) and site 2 (U68). 5′-32P-labeled mRNA leader molecules were incubated, separated, and analyzed as described in the brief description of FIG. 1, and include identical control and marker lanes as indicated. Incubations contained concentrations of AdoCbl ranging from 10 nM to 100 μM (lanes 1 through 8) or did not include AdoCbl (−). FIG. 2B shows a composite plot of the fraction of RNA cleaved at six locations along the mRNA leader versus the logarithm of the concentration (c) of AdoCbl. Fraction cleaved values were normalized relative to the highest and lowest cleavage values measured for each location, including the values obtained upon incubation in the absence of AdoCbl. The inset defines the symbols used for each of six sites, while the remaining three sites were excluded from the analysis due to weak or obscured cleavage bands. Filled and open symbols represent increasing and decreasing cleavage yields, respectively, upon increasing the concentration of AdoCbl. The dashed line reflects a KD of ~300 nM, as predicted by the concentration needed to generate half-maximal structural modulation. Data plotted were derived from a single PAGE analysis, of which two representative sections are depicted in FIG. 1A.
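The normalization and the half-maximal reading described for FIG. 2B can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the raw values are hypothetical, and the one-site saturation form c/(c + KD) is an assumption about the shape of the dashed curve, chosen because it equals one half when c equals KD.

```python
def normalize_cleavage(fractions):
    """Rescale fraction-cleaved values for one site to the 0-1 range,
    using the lowest and highest values measured at that site."""
    lowest, highest = min(fractions), max(fractions)
    if highest == lowest:
        raise ValueError("no modulation observed at this site")
    return [(f - lowest) / (highest - lowest) for f in fractions]


def one_site_modulation(conc_nM, kd_nM=300.0):
    """Assumed one-site saturation curve: fraction of maximal structural
    modulation at a given AdoCbl concentration (both in nM)."""
    return conc_nM / (conc_nM + kd_nM)


# Hypothetical raw values for a site whose cleavage rises with AdoCbl.
raw = [0.02, 0.05, 0.08]
print(normalize_cleavage(raw))

# Under the assumed curve, modulation is half-maximal when c equals KD.
print(one_site_modulation(300.0))
```

Under this assumption, concentrations of 10 nM and 100 μM (the ends of the range used in FIG. 2A) lie well below and well above the half-maximal point, respectively.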

FIG. 3 shows the 202-nucleotide mRNA leader causes an unequal distribution of AdoCbl in an equilibrium dialysis apparatus. I: Equilibration of tritiated effector was conducted in the absence of RNA. II: (step 1) Equilibration was conducted as in I, but with 200 pmoles of mRNA leader added to chamber b; (step 2) 5,000 pmoles of unlabeled AdoCbl was added to chamber b. III: Equilibrations were conducted as described in II, but wherein 5,000 pmoles of cyanocobalamin was added to chamber b. IV: (step 1) Equilibration was initiated as described in step 1 of II; (steps 2 and 3) the solution in chamber a was replaced with 25 μL of fresh equilibration buffer; (step 4) 5,000 pmoles of unlabeled AdoCbl was added to chamber b. The cpm ratio is the ratio of counts detected in chamber b relative to that of a. The dashed line represents a cpm ratio of 1, which is expected if equal distribution of tritium is established.
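The cpm ratio used in FIG. 3 is simple arithmetic, sketched below in Python. The function names and the example counts are hypothetical. The second function rests on an added assumption that is not stated in the text: that free effector reaches the same level in both chambers at equilibrium, so that any excess counts in chamber b reflect RNA-bound effector.

```python
def cpm_ratio(cpm_b, cpm_a):
    """Counts detected in chamber b (with RNA) relative to chamber a."""
    if cpm_a <= 0:
        raise ValueError("chamber a counts must be positive")
    return cpm_b / cpm_a


def bound_fraction_in_b(cpm_b, cpm_a):
    """Assumed estimate of the fraction of chamber b counts that is
    RNA-bound, taking free effector to be equal in both chambers."""
    if cpm_b <= 0:
        raise ValueError("chamber b counts must be positive")
    return max(0.0, 1.0 - cpm_a / cpm_b)


# Hypothetical counts: equal distribution versus accumulation in chamber b.
print(cpm_ratio(1000.0, 1000.0))  # a ratio of 1 means equal distribution
print(cpm_ratio(3000.0, 1000.0), bound_fraction_in_b(3000.0, 1000.0))
```

A ratio that falls back toward 1 after excess unlabeled AdoCbl is added to chamber b would be consistent with displacement of the tritiated effector from a saturable site.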

FIGS. 4A and 4B show selective molecular recognition of effectors by the btuB mRNA leader. FIG. 4A shows a chemical structure of AdoCbl (1) and various effector analogs (2 through 11, ref 30). FIG. 4B shows a determination of analog binding by monitoring modulation of spontaneous cleavage of the 202-nucleotide btuB RNA leader. 5′-32P-labeled mRNA leader molecules were incubated, separated, and analyzed as described in the legend to FIG. 1A, and include identical control and marker lanes as indicated. The sections of three PAGE analyses encompassing site 2 (U68) are depicted. Below each image is plotted the amount of RNA cleaved (normalized with relation to the lowest and highest levels of cleavage at U68 in each gel) for each effector as indicated, or for no effector (−). The compound 11 (13-epi-AdoCbl) is an epimer of AdoCbl wherein the configuration at C13 is inverted, so that the e propionamide side chain is above the plane of the corrin ring; see Brown et al., Conformational studies of 5′-deoxyadenosyl-13-epicobalamin, a coenzymatically active structural analog of coenzyme B12. Polyhedron 17, 2213 (1998).

FIGS. 5A, 5B, 5C, 5D, 5E and 5F show mutations in the mRNA leader and their effects on AdoCbl binding and genetic control. FIG. 5A shows the sequence of the putative P5 element of the wild-type 202-nucleotide btuB leader, which exhibits AdoCbl-dependent modulation of structure as indicated by the observed increase in spontaneous RNA cleavage at position U68 (10% denaturing PAGE gel). Assays were conducted in the absence (−) or presence (+) of 5 μM AdoCbl. The remaining lanes are as described in the legend to FIG. 1A. The composite bar graph reflects the ability of the RNA to shift the equilibrium of AdoCbl in an equilibrium dialysis apparatus and the ability of a reporter gene (see Experimental Procedures) to be regulated by AdoCbl addition to a bacterial culture. (Left) Plotted is the cpm ratio derived by equilibrium dialysis, wherein chamber b contains the RNA. Details of the equilibrium dialysis experiments are described in the brief description of FIG. 3. (Right) Plotted are the expression levels of β-galactosidase as determined from cells grown in the absence (−) or presence (+) of 5 μM AdoCbl. Boxed numbers on the left and right, respectively, reflect the approximate KD and the fold repression of β-galactosidase activity in the presence of AdoCbl. N.D. designates not determined. FIGS. 5B-5F show sequences and performance characteristics of various mutant leader sequences as indicated. Constructs were created as described in the Experimental Procedures section.

FIGS. 6A, 6B, 6C and 6D show metabolite binding by mRNAs. FIG. 6A shows TPP-dependent modulation of the spontaneous cleavage of 165 thiM RNA was visualized by polyacrylamide gel electrophoresis (PAGE). 5′ 32P-labeled RNAs (arrow, 20 nM) were incubated for approximately 40 hr at 25° C. in 20 mM MgCl2, 50 mM Tris-HCl (pH 8.3 at 25° C.) in the presence (+) or absence (−) of 100 μM TPP. NR, −OH and T1 represent RNAs subjected to no reaction, partial digestion with alkali, or partial digestion with RNase T1 (G-specific cleavage), respectively. Product bands representing cleavage after selected G residues are numbered and identified by filled arrowheads. The asterisk identifies modulation of RNA structure involving the Shine-Dalgarno (SD) sequence. Gel separations were analyzed using a phosphorimager (Molecular Dynamics) and quantitated using ImageQuant software. FIG. 6B shows a secondary-structure model of 165 thiM (SEQ ID NO:2) as predicted by computer modeling (Zuker et al., Algorithms and thermodynamics for RNA secondary structure prediction: a practical guide. In RNA Biochemistry and Biotechnology (eds. Barciszewski J. & Clark, B. F. C.) 11-43 (NATO ASI Series, Kluwer Academic Publishers, 1999); Mathews et al., Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288, 911-940 (1999)) and by the structure probing data depicted in FIG. 6A. Spontaneous cleavage characteristics are as noted in the inset. Unmarked nucleotides exhibit a constant but low level of degradation. The truncated 91 thiM RNA (residues 1-91 of SEQ ID NO:2) is boxed and the thi box element (Miranda-Rios et al., A conserved RNA structure (thi box) is involved in regulation of thiamin biosynthetic gene expression in bacteria. Proc. Natl. Acad. Sci. USA 98, 9736-9741 (2001)) is shaded. Nucleotides enclosed in boxes identify an alternative pairing, designated P8*. 
The RNA carries two mutations (G156A and U157C) relative to wild type that were introduced in a non-essential portion of the construct to form a restriction site for cloning, while all RNAs carry two 5′-terminal G residues to facilitate in vitro transcription. FIG. 6C shows TPP-dependent modulation of the spontaneous cleavage of 240 thiC RNA. Reactions were conducted and analyzed as described above for FIG. 6A. FIG. 6D shows a secondary-structure model of 240 thiC (SEQ ID NO:3). Base-paired elements that are similar to those of thiM are labeled P1 through P5. The truncated RNA 111 thiC (residues 1-111 of SEQ ID NO:3) is boxed. Nucleotides enclosed in boxes identify an alternative pairing.

FIGS. 7A, 7B and 7C show the thiM and thiC mRNA leaders serve as high-affinity metabolite receptors. FIG. 7A shows the extent of spontaneous modulation of RNA cleavage at several sites within 165 thiM (left) and 240 thiC (right) plotted for different concentrations (c) of TPP. Arrows reflect the estimated concentration of TPP needed to attain half maximal modulation of RNA (apparent KD). FIG. 7B shows the logarithm of the apparent KD values plotted for both RNAs with TPP, TP and thiamine as indicated. The boxed data was generated using TPP with the truncated RNAs 91 thiM and 111 thiC. FIG. 7C shows that patterns of spontaneous cleavage of 165 thiM differ between thiamine and TPP ligands as depicted by PAGE analysis (left) and as reflected by graphs (right) representing the relative phosphorimager counts for the three lanes as indicated. Details for the RNA probing analysis are similar to those described above in connection with FIG. 6A. The graphs were generated by ImageQuant software.

FIGS. 8A, 8B, 8C and 8D show high sensitivity and selectivity of mRNA leaders for metabolite binding. FIG. 8A shows chemical structures of several analogues of thiamine. TD is thiamine disulfide and THZ is 4-methyl-5-β-hydroxyethylthiazole. FIG. 8B shows PAGE analysis of 165 thiM RNA structure probing using TPP and various chemical analogues (40 μM each) as indicated. Locations of significant structural modulation within the RNA spanning nucleotides ˜113 to ˜150 are indicated by open arrowheads. The asterisk identifies the site (C144) used to compare the normalized fraction of RNA that is cleaved (bottom) in the presence of specific compounds. Details for the RNA probing analysis are similar to those described above in connection with FIG. 6A. FIG. 8C shows a summary of the features of TPP that are critical for molecular recognition. FIG. 8D shows equilibrium dialysis using 3H-thiamine as a tracer. Plotted are the ratios for tritium distribution in a two-chamber system (a and b) that were established upon equilibration in the presence of the RNA constructs in chamber b as indicated (see below for a description of the non-TPP-binding mutant M3). 100 μM TPP or oxythiamine was added to chamber a, as denoted, upon the start of equilibration.

FIGS. 9A, 9B, 9C and 9D show mutational analysis of the structure and function of the thiM riboswitch. FIG. 9A shows mutations present in constructs M1 through M8 relative to the 165 thiM RNA (SEQ ID NO:4). P8* is a putative base-paired element between portions (encircled) of the P1 and P8 stems. FIGS. 9B and 9C show in vitro ligand-binding and genetic control functions of the wild-type (WT), M1 and M2 RNAs as reflected by PAGE analysis of in-line probing experiments (10 μM TPP) and by β-galactosidase expression assays. Labels on PAGE gels are as described above in connection with FIG. 6A. Bars represent the levels of gene expression in the presence (+) and the absence (−) of TPP in the culture medium. FIG. 9D presents, in table form, a summary of similar analyses of WT through M9. The SD status “n.d.” (not determined) indicates either that the level of spontaneous cleavage detected in the absence and presence of TPP is near the limit of detection (M6, M7 and M8) or that the region adopts an atypical structure (M9) compared to WT.

FIG. 10 shows a construct for the selection of SAM-responsive ribozymes (SEQ ID NO:5). The hammerhead self-cleaving ribozyme and the SAM aptamer both require proper formation of the bridge domain to exhibit function. Therefore, the selection is expected to permit ribozyme function only when SAM or another binding-competent analog is present.

FIGS. 11A (SEQ ID NO:6 and SEQ ID NOs:378-382), 11B (SEQ ID NO:7 and SEQ ID NOs:383-385), 11C (SEQ ID NO:8 and SEQ ID NOs:386-387), 11D (SEQ ID NO:9 and SEQ ID NOs:388-389), 11E (SEQ ID NO:10), 11F (SEQ ID NO:11) and 11G (SEQ ID NO:12 and SEQ ID NOs:390-397) show consensus sequences and putative secondary structures that were derived by phylogenetic and biochemical analyses as described for each riboswitch (see references). Nucleotides identified by a lower case a, c, t, or g are conserved in greater than 90% of the representative sequences, open circles identify nucleotide positions of variable sequence, and lines identify elements that are variable in sequence and length. Models are described as follows: 11A) coenzyme B12 aptamer (Example 1); 11B) TPP aptamer (Example 2); 11C) FMN aptamer (Example 3); 11D) SAM aptamer (Example 7); 11E) guanine aptamer (Example 6); 11F) adenine aptamer (Example 8); and 11G) lysine aptamer (Example 5). Letters R and Y represent purine and pyrimidine bases, respectively; K designates G or U; W designates A or U; H designates A, C, or U; D designates G, A, or U; N represents any of the four bases.

FIGS. 12A (SEQ ID NO:13), 12B and 12C show the regulation of the B. subtilis ribD mRNA by FMN. FIG. 12A shows the results of in-line probing assays. Internucleotide linkages identified with squares exhibit decreased amounts of spontaneous cleavage when ribD is incubated in the presence of FMN (indicating an increase in order for these nucleotides) relative to incubation in the absence of FMN. Circles identify linkages that exhibit consistently high levels of scission, which indicates they are not modulated by the presence of FMN. FIG. 12B shows a model for the mechanism of ribD regulation. The ribD mRNA adopts an anti-termination conformation in the absence of FMN. Increased levels of FMN stabilize an RFN-FMN complex that permits formation of the terminator structure. FIG. 12C shows the chemical structure and apparent dissociation constants for riboflavin and FMN.

FIGS. 13A (residues 1-91 of SEQ ID NO:2), 13B and 13C show the regulation of the E. coli thiM mRNA by TPP. FIG. 13A shows results of in-line probing assays. Internucleotide linkages identified with squares exhibit decreased amounts of spontaneous cleavage when thiM is incubated in the presence of TPP compared to incubation in the absence of ligand. In contrast, linkages identified with hexagons exhibit increased amounts of cleavage when thiM is incubated with TPP compared to incubation in the absence of ligand. The boxed nucleotides indicate the pyrophosphate-recognition region (as described in text). FIG. 13B shows a model for the mechanism of thiM regulation. In the absence of TPP, the anti-SD sequence interacts with part of the aptamer domain to form an anti-anti-SD. As TPP is increased, aptamer-TPP complexes are formed and the anti-SD favors pairing with the SD. FIG. 13C shows the chemical structure and apparent dissociation constants for thiamine and TPP.

FIGS. 14A, 14B and 14C show putative eukaryote riboswitches. FIG. 14A shows the consensus TPP binding domain based on 100 bacteria and archaea RNAs (SEQ ID NO:18 and SEQ ID NOs:398-399). Nucleotides shown as lower case letters are most conserved (>90%). Open circles represent nucleotide positions, and domains that vary in sequence and length are designated var. The consensus model is similar to that reported recently (Rodionov et al., 2002). FIG. 14B shows the TPP-binding domain of A. thaliana (SEQ ID NO:14). Variations in O. sativa (nucleotides enclosed in a circle) (SEQ ID NO:15) and P. secunda (nucleotides enclosed in a hexagon) (SEQ ID NO:16) are shown. FIG. 14C shows a putative TPP-binding domain in the intron of N. crassa (SEQ ID NO:17).

FIG. 15 shows sequence alignments of eukaryotic domains related to bacterial TPP-dependent riboswitches, Eco1, Eco2, Cac1, Ncr1, Aor1, Fox1, Fso1, Ath1, Pse1, Osa1, which are represented by SEQ ID NOs:19-28, respectively. Base-paired stems are shaded in black and labeled as defined in Example 2. The P3 sequences, which in eukaryotes are significantly expanded in length and number of base pairs, are represented as a stem-loop structure. The highly conserved nucleotide positions in bacteria that were used to search for eukaryotic domains are enclosed in a box. For each identified (ID) sequence, the position of the conserved CUGAGA sequence within the given Genbank entry is given along with the accession identification, sequence name, and gene identification. Additional protein annotations based on sequence similarity are shown in brackets. Methods: Riboswitch-like domains were initially identified by sequence similarity to bacterial sequences (Eco2 and Cac) by a blastn search of Genbank using default parameters. These hits were verified and expanded by searching for degenerate matches to the pattern (CTGAGA [200] ACYTGA [5] <<<GNTNNNNC>>>[5] CGNRGGRA) (SEQ ID NO:375). Angle brackets indicate base pairing and bracketed numbers are variable gaps with constrained maximum lengths. All of the eukaryotic sequences have one or zero mismatches to this pattern except for one (Aor) that initially had three mismatches due to a single A insertion in the final search element. This mutation was removed to simplify the alignment. Comparison of mRNA (M33643.1) and genomic (AB033416.1) sequences demonstrated that the F. oxysporum element is in an intron in the 5′ UTR of the sti35 gene. Other fungal sequences (Ncr, Aor, and Fso) are flanked by consensus splicing sequences.
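The degenerate pattern search described for FIG. 15 can be illustrated with a short Python sketch. It is not the search tool used for the figure, which is not identified in the text. The sketch assumes that R, Y and N carry their usual IUPAC meanings (as defined for FIG. 11) and that each bracketed number is a gap of zero up to that many nucleotides. It deliberately omits two features of the described search: the base-pairing constraint marked by the angle brackets, and tolerance of mismatches. The names `IUPAC`, `build_pattern`, `PARTS` and `find_candidates` are hypothetical.

```python
import re

# Assumed IUPAC meanings for the degenerate codes that appear in the pattern.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def build_pattern(parts):
    """Turn sequence elements (str) and maximum gap lengths (int) into a regex."""
    pieces = []
    for part in parts:
        if isinstance(part, int):
            # Variable gap of zero up to `part` nucleotides.
            pieces.append("[ACGT]{0,%d}" % part)
        else:
            pieces.append("".join(IUPAC[base] for base in part))
    return re.compile("".join(pieces))


# Elements and gaps transcribed from the pattern quoted in the description.
# The <<<...>>> base-pairing requirement is NOT enforced in this sketch.
PARTS = ["CTGAGA", 200, "ACYTGA", 5, "GNTNNNNC", 5, "CGNRGGRA"]
PATTERN = build_pattern(PARTS)


def find_candidates(sequence):
    """Return (start, end) offsets of exact, non-overlapping pattern matches."""
    dna = sequence.upper().replace("U", "T")  # accept RNA or DNA input
    return [(m.start(), m.end()) for m in PATTERN.finditer(dna)]


# Made-up test sequence, for illustration only.
example = "TTCTGAGAAAAAACCTGAAAGATAAAACACGAGGGAATT"
print(find_candidates(example))
```

A fuller implementation would additionally check that the two halves of the GNTNNNNC element can pair and would allow a bounded number of mismatches, as the description indicates.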

FIGS. 16A and 16B show the structural probing of the putative TPP-riboswitch from Arabidopsis. FIG. 16A shows the fragmentation pattern of the 128-nucleotide RNA (arrow) of A. thaliana (FIG. 14B) which was generated by incubation in the absence (−) or presence (+) of 100 μM TPP. T1, −OH and NR identify RNAs that were partially digested with RNase T1 (cleaves 3′ to G residues), alkali, or were not reacted, respectively. Reactions were conducted as described in Example 2. FIG. 16B shows the apparent KD for TPP binding by the A. thaliana RNA. Fraction bound was determined by in-line probing as described in Examples 1-3.

FIG. 17 shows genetic structures of thiamine biosynthetic genes and possible mechanisms of riboswitch control. The location and mechanism of the E. coli and B. subtilis riboswitches are detailed in Examples 2 and 6. The putative TPP riboswitch from P. secunda resides immediately upstream from the polyA tail in the cDNA clone of the THIC gene. The putative TPP riboswitch domain in F. oxysporum is located in a 5′-UTR intron of the STI35 gene according to the genomic sequence but is absent in the cDNA clone.

FIGS. 18A and 18B show the L box, a highly conserved sequence and structural domain present in the 5′-UTRs of Gram-positive and Gram-negative bacterial mRNAs that are related to lysine metabolism. Conserved portions of the L box sequence and secondary structure were identified by alignment of 28 representative mRNAs as noted. Base-pairing potentials representing P1 through P5 are enumerated and set off by boxes. Nucleotides shown as lower case letters are conserved in greater than 80% of the examples. The asterisk identifies the representative (B. subtilis lysC 5′-UTR) that was examined in this study. Gene names are as annotated in GenBank or were derived by protein sequence similarity. Organism abbreviations are as follows: Bacillus anthracis (BA), Bacillus halodurans (BH), Bacillus subtilis (BS), Clostridium acetobutylicum (CA), Clostridium perfringens (CP), Escherichia coli (EC), Haemophilus influenzae (HI), Oceanobacillus iheyensis (OI), Pasteurella multocida (PM), Staphylococcus aureus (SA), Staphylococcus epidermidis (SE), Shigella flexneri (SF), Shewanella oneidensis (SO), Thermotoga maritima (TM), Thermoanaerobacter tengcongensis (TT), Vibrio cholerae (VC), Vibrio vulnificus (VV), Thermoanaerobacter tengcongensis (TE).

FIGS. 19A (SEQ ID NO:60 and SEQ ID NOs:400-408), 19B and 19C (SEQ ID NO:61) show that the consensus L box motif from the lysC 5′-UTR of B. subtilis undergoes allosteric rearrangement in the presence of L-lysine. FIG. 19A shows the consensus sequence and structure of the L box domain as derived using a phylogeny of 31 representative sequences from prokaryotic and archaeal organisms (FIG. 18): BA 0845, BA lysA, BA lysP, BH dapA, BH lysC, BH nhaC, BS lysC, BX lysC, CA lysA, CP lysA, CP lysP, EC lysC, HI nhaC, OI dapA, OI nhaC, PM nhaC, SA lysC, SA lysP, SE lysC, SE lysP, SF lysC, SO lysC, SO nhaC, TM asd, TT lysA, TT pspF, VC lysC, VC nhaC, VC nhaC, VV lysC, VV nhaC, which are represented by SEQ ID NOs:29-59, respectively. Nucleotides depicted as lower case a, c, t, or g are present in at least 80% of the representatives, open circles identify nucleotide positions of variable identity, and dashed lines denote variable nucleotide identity and chain length. FIG. 19B shows sequence, secondary structure model, and lysine-induced structural modulation of the lysC 5′-UTR of B. subtilis. An additional 94 nucleotides (not depicted) reside between nucleotide 237 and the AUG start codon. Structural modulation sites (nucleotides enclosed in squares) were established using 237 lysC RNA by monitoring spontaneous RNA cleavage as depicted in FIG. 19C. FIG. 19C shows that in-line probing of the 237 lysC RNA reveals lysine-induced modulation of RNA structure. Patterns of spontaneous cleavage, revealed by product separation using denaturing 10% polyacrylamide gel electrophoresis (PAGE), are altered at four major sites (denoted 1 through 4) in the presence (+) of 10 μM L-lysine (L) relative to that observed in the absence (−) of lysine. T1, −OH and NR represent partial digest with RNase T1, partial digest with alkali, and no reaction, respectively. Selected bands in the T1 lane (G-specific cleavage) are identified by nucleotide position. See Methods for experimental details.

FIGS. 20A, 20B, 20C, 20D and 20E show the molecular recognition characteristics of the lysine aptamer and the use of caged lysine. FIG. 20A shows the chemical structures of L-lysine, D-lysine and nine closely-related analogs. Small circles represent chiral carbon centers wherein the enantiomeric configuration is defined for each compound. Encircled atoms identify chemical differences between L-lysine and the analog depicted. FIG. 20B shows in-line probing analysis of the 179 lysC RNA in the absence (−) of ligand, or in the presence of 10 μM L-lysine or 100 μM of various analogs as indicated for each lane. For each lane, the relative extent of spontaneous cleavage at site 3 is compared to that of the zone of constant cleavage immediately below this site, where a cleavage ratio significantly below ˜1.5 reflects modulation. FIG. 20C shows a schematic representation of dipeptide digestion by hydrochloric acid. All dipeptide forms are expected to be incapable of binding the lysine aptamer (inactive), while lysine-containing dipeptides should induce conformational changes in the aptamer (active) upon acid digestion. FIG. 20D shows in-line probing analysis of the 179 lysC RNA in the absence of lysine (−) or in the presence of various amino acids and dipeptides. Underlined lanes carry dipeptide preparations that were pretreated with HCl as depicted in FIG. 20C. FIG. 20E shows the fraction of spontaneous cleavage at site 3 in FIG. 20D, plotted after normalization to the extent of processing in the absence of added ligand.

FIGS. 21A, 21B, 21C and 21D show determination of the dissociation constant and stoichiometry for L-lysine binding to the 179 lysC RNA. FIG. 21A shows in-line probing with increasing concentrations of L-lysine ranging from 3 nM to 3 mM. Details are as defined for FIG. 19C. FIG. 21B shows a plot depicting the normalized fraction of RNA undergoing spontaneous cleavage versus the concentration of amino acid for sites 1 through 3. The dashed line identifies the concentration of L-lysine required to bring about half-maximal structural modulation, which indicates the apparent KD for ligand binding. FIG. 21C shows that the 179 lysC RNA (10 μM) shifts the equilibrium of tritiated L-lysine (50 nM) in an equilibrium dialysis chamber. To investigate competitive binding, unlabeled L-lysine (L), D-lysine (D), or L-ornithine (5) was added to a final concentration of 50 μM each to one chamber of a pre-equilibrated assay as indicated. FIG. 21D shows a Scatchard analysis of L-lysine binding by the 179 lysC RNA. The variable r represents the ratio of bound ligand concentration versus the total RNA concentration and the variable [LF] represents the concentration of free ligand.
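For orientation, the Scatchard analysis named for FIG. 21D can be sketched in Python under the standard single-class binding model, r/[LF] = (n − r)/KD, where n is the number of binding sites per RNA. A straight-line fit of r/[LF] against r then gives a slope of −1/KD and an x-intercept of n. The function name `scatchard_fit` and the data values are hypothetical; they are not taken from the figure.

```python
def scatchard_fit(r_values, free_ligand):
    """Least-squares line through the points (r, r/[LF]); returns (KD, n).

    Assumes the standard one-site-class model r/[LF] = (n - r)/KD,
    so slope = -1/KD and intercept = n/KD.
    """
    xs = list(r_values)
    ys = [r / lf for r, lf in zip(r_values, free_ligand)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope             # dissociation constant, same units as [LF]
    n_sites = -intercept / slope  # x-intercept of the fitted line
    return kd, n_sites


# Synthetic, noise-free data generated from an assumed KD of 1 uM and n = 1.
true_kd, true_n = 1e-6, 1.0
free = [1e-7, 3e-7, 1e-6, 3e-6, 1e-5]            # free ligand, molar
r = [true_n * lf / (true_kd + lf) for lf in free]
kd, n_sites = scatchard_fit(r, free)
print(kd, n_sites)
```

With real data the points scatter around the line, and a value of n close to 1 would be read as one ligand bound per RNA.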

FIGS. 22A, 22B and 22C show the B. subtilis lysC riboswitch and its mechanism for metabolite-induced transcription termination. FIG. 22A shows a sequence and repressed-state model for the lysC riboswitch secondary structure (SEQ ID NO:62). The encircled nucleotides identify the putative anti-terminator interaction that could form in the absence of L-lysine. Boxed nucleotides identify sites of disruption (M1) and compensatory mutations for the terminator stem (M2) and for the terminator and anti-terminator stems (M3). Nucleotides enclosed in squares identify some of the positions where mutations exhibit lysC derepression that were reported previously (Vold et al. 1975; Lu et al. 1992). FIG. 22B shows in vitro transcription assays conducted in the absence (−) or presence (+) of 10 mM L-lysine or other analogs as indicated. FL and T identify the full-length and terminated transcripts, respectively. The percent of the terminated RNAs relative to the total terminated and full-length transcripts is provided for each lane (% term.). FIG. 22C shows in vivo expression of a β-galactosidase reporter gene fused to wild-type (WT), G39A and G40A mutant lysC 5′-UTR fragments. Media conditions are as follows: I, normal medium (0.27 mM lysine); II, minimal medium (0.012 mM); III, lysine-supplemented minimal medium (1 mM); IV, lysine hydroxamate-supplemented (medium II plus 1 mM lysine hydroxamate) minimal medium; V, thiosine-supplemented (medium II plus 1 mM thiosine) minimal medium.

FIG. 23 shows that a highly conserved domain is present in the 5′-UTR of certain gram-positive and gram-negative bacterial mRNAs. Depicted is an alignment of 32 representative mRNA domains from bacteria that conform to the G box consensus sequence: BH1-guaA, BH2-[pbuG], BH3-purE, BH4-ssnA, BH5-[xpt], BS1-[pbuG], BS2-purE, BS3-xpt, BS4-yxjA, BS5-ydhL, CA1-uraA, CA2-[pbuG], CA3-guaB, CP1-xpt, CP2-uapC, CP3-guaB, CP4-add, FN1-purQ, LL1-xpt, LM1-[pbuG], LM2-[xpt], OI1-guaA, OI2-[pbuG], OI3-purE, OI4-[xpt], SA1-xpr, TSE1-[xpt], STA1-xpt, STPY1-xpt, STPN-xpt, TE1-[pbuG], VV1-add, which are represented by SEQ ID NOs:63-94, respectively. Enclosed and enumerated regions identify base-pairing potential of stems P1, P2, and P3, respectively. Nucleotides shown as lower case letters are conserved in greater than 90% of the examples. The asterisk identifies the representative (xpt-pbuX 5′-UTR) that was examined in this study. It is important to note that three representatives (BS5, CP4 and VV1) that carry a C to U mutation in the conserved core (in the P3-P1 junction) appear to be adenine-specific riboswitches (unpublished observations). Gene names are as annotated in GenBank, the SubtiList database, or based on protein similarity searches (brackets). Organism abbreviations are as follows: Bacillus halodurans (BH), Bacillus subtilis (BS), Clostridium acetobutylicum (CA), Clostridium perfringens (CP), Fusobacterium nucleatum (FN), Lactococcus lactis (LL), Listeria monocytogenes (LM), Oceanobacillus iheyensis (OI), Staphylococcus aureus (SA), Staphylococcus epidermidis (SE), Streptococcus agalactiae (STA), Streptococcus pyogenes (STPY), Streptococcus pneumoniae (STPN), Thermoanaerobacter tengcongensis (TE), and Vibrio vulnificus (VV).

FIGS. 24A, 24B and 24C show that the G box RNA of the xpt-pbuX mRNA in B. subtilis responds allosterically to guanine. FIG. 24A shows the consensus sequence and secondary-structure model for the G box RNA domain that resides in the 5′ UTR of genes that are largely involved in purine metabolism (SEQ ID NO:95). Phylogenetic analysis is consistent with the formation of a three-stem (P1 through P3) junction. Nucleotides shown as lower case letters and capitals are present in greater than 90% and 80% of the representatives examined, respectively (FIG. 23). Encircled nucleotides exhibit base complementation, which might indicate the formation of a pseudoknot. FIG. 24B shows sequence and ligand-induced structural alterations of the 5′-UTR of the xpt-pbuX transcriptional unit (SEQ ID NO:96). The putative anti-terminator interaction is represented by the boxes. Nucleotides that undergo structural alteration as determined by in-line probing (from FIG. 24C) are identified with squares. The 93 xpt fragment (boxed) of the 201 xpt RNA retains guanine-binding function. Asterisks denote alterations to the RNA sequence that facilitate in vitro transcription (5′ terminus) or that generate a restriction site (3′ terminus). Nucleotide numbers begin at the first nucleotide of the natural transcription start site. The translation start codon begins at position 186. FIG. 24C shows that guanine and related purines selectively induce structural modulation of the 93 xpt mRNA fragment. Precursor RNAs (Pre; 5′ 32P-labeled) were subjected to in-line probing by incubation for 40 hr in the absence (−) or presence of guanine, hypoxanthine, xanthine and adenine as indicated by G, H, X and A, respectively. Lanes designated NR, T1 and −OH contain RNA that was not reacted, subjected to partial digestion with RNase T1 (G-specific cleavage), or subjected to partial alkaline digestion, respectively. Selected bands corresponding to G-specific cleavage are identified. Regions 1 through 4 identify major sites of ligand-induced modulation of spontaneous RNA cleavage.

FIGS. 25A and 25B show that the 201 xpt mRNA leader binds guanine with high affinity. FIG. 25A shows that spontaneous RNA cleavage of the 201 xpt RNA at four regions decreases with increasing guanine concentrations, as revealed by in-line probing. Only those locations of the PAGE image corresponding to the four regions of modulation as indicated in FIG. 24C are depicted. Other details and notations are as described in the legend to FIG. 24C. FIG. 25B shows a plot depicting the normalized fraction of RNA that experienced spontaneous cleavage versus the concentration of guanine for modulated regions 1 through 4 in FIG. 25A. Fraction cleaved values were normalized to the maximum cleavage measured in the absence of guanine and to the minimum cleavage measured in the presence of 10 μM guanine. The apparent KD value (less than or equal to 5 nM) reflects the limits of detection for these assay conditions.
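The normalization described for FIG. 25B, together with the half-maximal convention used for apparent KD in these descriptions, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the analysis actually performed: the function names `normalize` and `apparent_kd` are hypothetical, the data values are invented, and the half-maximal point is located by simple linear interpolation on a logarithmic concentration axis rather than by curve fitting.

```python
import math


def normalize(fractions, f_max, f_min):
    """Rescale fraction-cleaved values so f_max maps to 1 and f_min maps to 0."""
    return [(f - f_min) / (f_max - f_min) for f in fractions]


def apparent_kd(concentrations, normalized):
    """Ligand concentration at which the normalized value crosses 0.5.

    Interpolates linearly in log10(concentration) between the two bracketing
    points. Concentrations must be positive; returns None if 0.5 is never crossed.
    """
    points = sorted(zip(concentrations, normalized))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo != y_hi and (y_lo - 0.5) * (y_hi - 0.5) <= 0:
            t = (0.5 - y_lo) / (y_hi - y_lo)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Invented example: raw fraction cleaved at one region over a ligand titration.
conc = [1e-10, 1e-9, 1e-8, 1e-7, 1e-6]   # molar
raw = [0.30, 0.28, 0.20, 0.12, 0.10]     # fraction cleaved
norm = normalize(raw, f_max=0.30, f_min=0.10)
print(apparent_kd(conc, norm))
```

Because cleavage at these regions decreases as ligand is added, the normalized curve falls from 1 toward 0, and the crossing at 0.5 marks the half-maximal modulation.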

FIGS. 26A, 26B and 26C show molecular discrimination by the guanine-binding aptamer of the xpt-pbuX mRNA. FIG. 26A shows the chemical structures and apparent KD values for guanine, hypoxanthine and xanthine (active natural regulators of xpt-pbuX genetic expression in B. subtilis) versus that of adenine (inactive). Differences in chemical structure relative to guanine are encircled. KD values were established as shown in FIG. 25 with the 201 xpt RNA. Numbers on guanine represent the positions of the ring nitrogen atoms. FIG. 26B shows chemical structures and KD values for various analogs of guanine, which reveal that all alterations of this purine cause a loss of binding affinity. Open circles identify KD values that most likely are significantly higher than indicated, as concentrations of analog above 500 μM were not examined in this analysis. The apparent KD values of G, H, X and A as indicated are plotted as triangles for comparison. FIG. 26C shows a schematic representation of the molecular recognition features of the guanine aptamer in 201 xpt. Hydrogen bond formation at position 9 of guanine is expected because guanosine (KD>100 μM) and inosine (KD>100 μM), which are 9-ribosyl derivatives of guanine and hypoxanthine, respectively, do not exhibit measurable binding (see FIG. 27).

FIGS. 27A and 27B show confirmation of guanine binding specificity by equilibrium dialysis. FIG. 27A shows the equilibrium dialysis strategy that was used to confirm that in vitro-transcribed 93 xpt RNAs bind to guanine and can discriminate against various analogs. Each data point was generated by adding 3H-guanine to chamber a, which is separated from RNA and other analogs by a dialysis membrane with a molecular weight cut-off (MWCO) of 5,000 daltons. Left: If no guanine binding sites are present in chamber b, or if an excess of unlabeled competitor is present, then no shift in the distribution of tritium is expected. Right: If an excess of guanine-binding RNAs is present in chamber b, and if no competitor is present, then a substantial shift in the distribution of tritium towards chamber b is expected. FIG. 27B shows that the 93 xpt RNA can shift the distribution of 3H-guanine in an equilibrium dialysis apparatus, while analogs of guanine are poor competitors. The plot depicts the fraction of counts per minute (cpm) of tritium in chamber b relative to the total amount of cpm counted from both chambers. A value of ˜0.5 is expected if no shift occurs, as is the case when RNA is absent (none), or in the presence of excess unlabeled competitor (G). A value approaching 1 is expected if the majority of 3H-guanine is bound by the RNA in chamber b in the absence (−) of unlabeled analog, or in the presence of unlabeled analogs that do not serve as effective competitors under the assay conditions (100 nM 3H-guanine, 300 nM RNA, 500 nM analog). Ino and Gua represent inosine and guanosine, respectively.
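The two equilibrium dialysis readouts used in these descriptions, the fraction of counts in chamber b (FIG. 27B) and the cpm ratio of chamber b to chamber a (FIG. 3), follow directly from the counts in each chamber. A small Python sketch with hypothetical function names and invented counts:

```python
def fraction_in_b(cpm_a, cpm_b):
    """Fraction of total tritium counts found in chamber b (about 0.5 if no shift)."""
    return cpm_b / (cpm_a + cpm_b)


def cpm_ratio(cpm_a, cpm_b):
    """Counts in chamber b relative to chamber a (1 if tritium is equally distributed)."""
    return cpm_b / cpm_a


# Invented counts: equal distribution, then a shift toward the RNA-containing chamber b.
print(fraction_in_b(3500, 3500), cpm_ratio(3500, 3500))
print(fraction_in_b(1000, 3000), cpm_ratio(1000, 3000))
```

The two measures carry the same information: a fraction f in chamber b corresponds to a cpm ratio of f/(1 − f).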

FIGS. 28A, 28B, 28C and 28D show the binding and genetic control functions of variant guanine riboswitches. FIG. 28A shows mutations used to examine the importance of various structural features of the guanine aptamer domain (SEQ ID NO:97). FIG. 28B shows examination of the binding function of aptamer variants by equilibrium dialysis. WT designates the wild-type 93 xpt construct. Details are as described for FIG. 27. FIG. 28C shows genetic modulation of a β-galactosidase reporter gene upon the introduction of various purines as indicated. FIG. 28D shows regulation of β-galactosidase reporter gene expression by WT and mutants M1 through M7. Open and filled bars represent enzyme activity generated when growing cells in the absence and presence of guanine, respectively.

FIGS. 29A, 29B and 29C show that riboswitches participate in fundamental genetic control. FIGS. 29A and 29B are schematic representations of the seven known riboswitches and the metabolites they sense. The secondary structure models were obtained as follows: coenzyme B12 (see Example 1); TPP (see Example 2); FMN (see Example 3), SAM (see Example 7); guanine (see Example 6); lysine (see Example 5); adenine (see Example 8). Coenzyme B12 is depicted in exploded form wherein a, b and c designate covalent attachment sites between fragments. FIG. 29C shows a genetic map of B. subtilis riboswitch regulons and their positions on the bacterial chromosome. Genes are controlled by riboswitches as identified by matching numbers. All nomenclature is derived from the SubtiList database release R16.1 (Moszer, I., et al., 1995, Microbiol. 141, 261-268) except for metI and metC, which are recent designations (Auger, S., et al., 2002, Microbiol. 148, 507-518).

FIGS. 30A, 30B and 30C show that the S box is a structured RNA domain that binds SAM. FIG. 30A shows the consensus sequence and secondary-structure model of the S box domain derived from 107 bacterial representatives (SEQ ID NO:98 and SEQ ID NOs:409-410). Lower case letter and capital letter positions identify nucleotides whose identity as depicted is conserved in greater than 90% or 80% of the representative S box RNAs, respectively. R, Y, and N represent purine, pyrimidine, and any nucleotide, respectively. P1 through P4 identify conserved base pairing. Enclosed nucleotides identify a putative pseudoknot interaction. FIG. 30B shows a sequence and secondary structure model for the 251 yitJ mRNA fragment (SEQ ID NO:99). Sites of structural modulation upon introduction of SAM are depicted as described. Nucleotide 1 corresponds to the putative transcriptional start site. Asterisks identify nucleotides that were added to the construct to permit efficient transcription in vitro. The first nucleotide of the AUG start codon is 212 (not shown). Other notations are as described for FIG. 30A. FIG. 30C shows the spontaneous cleavage patterns of 251 yitJ (˜1 nM 5′ 32P-labeled) RNA incubated for ˜40 hr at 25° C. in 50 mM Tris-HCl (pH 8.3 at 25° C.), 20 mM MgCl2, 100 mM KCl, and without (−) or with methionine or SAM as indicated for each lane. NR, T1 and −OH represent no reaction, partial digest with RNase T1, and partial digest with alkali, respectively. Certain fragment bands corresponding to T1 digestion (cleaves after G residues) are depicted. Arrowheads identify positions of significant modulation of spontaneous cleavage, and the numbered sites were used for quantitation (see FIG. 31B). Experimental procedures are similar to those described in Examples 1-3.

FIGS. 31A, 31B and 31C show the binding affinity and molecular discrimination by a SAM-binding RNA. FIG. 31A shows the chemical structures of various compounds used to probe the binding characteristics of the SAM yitJ riboswitch. Other than methionine, each compound as depicted is coupled to an adenosyl moiety ([A]; inset) coupled via the 5′ carbon (as signified by R). FIG. 31B Left: The KD of 251 yitJ for SAM was determined by plotting the normalized fraction of RNA cleaved at regions 1 through 6 (see FIG. 30C) versus the logarithm of the concentration of SAM in molar units. The dashed line indicates the concentration needed to induce half maximal modulation of cleavage activity. Right: KD values for SAM and various analogs as determined by this method. FIG. 31C shows molecular discrimination determined by equilibrium dialysis. Assays employed 100 nM of S-adenosyl-L-methionine-methyl-3H (3H-SAM; 14.5 μCi mmol−1; 7,000 cpm) added to side A of an equilibrium dialysis chamber (1, 2), and were conducted in the absence (none) or the presence of 3 μM RNA on the B side of the chamber as indicated. Equilibrations were carried out for ˜10 hr in the absence (−) of unlabeled analogs, and then were subsequently incubated in the presence of 25 μM unlabeled compounds (added to side B) as indicated. M1 is a variant of 124 yitJ that carries disruptive mutations in the junction between stems P1 and P2 (FIG. 32a). Line at a cpm ratio of 1 identifies the bar height expected if a shift in 3H-SAM has not occurred. Additional experimental details are similar to those described in Examples 1 and 2.

FIGS. 32A, 32B and 32C show the effects of RNA mutations on SAM binding and genetic control. FIG. 32A shows the sequence and secondary structure model for the 124 yitJ RNA (SEQ ID NO:100). Mutations M1 through M9 were generated in plasmids containing fusions of the yitJ 5′-UTR upstream from a lacZ reporter gene. Templates for preparation of mutant RNAs for in vitro studies were then created by PCR, and the mutant DNA constructs were integrated into the chromosome for in vivo studies. See Methods for experimental details. FIG. 32B shows the analysis of SAM-binding function by equilibrium dialysis in the presence of wild-type (WT) and mutant RNAs as denoted. Details are described in the legend to FIG. 31C, except that 300 nM RNA was used and all assays were conducted without the addition of unlabeled analogs. FIG. 32C shows in vivo control of β-galactosidase expression in B. subtilis cells transformed with various riboswitch constructs as indicated. β-galactosidase activities were measured as described in Example 2. Cells were grown in glucose minimal media in 0.75 μg mL−1 methionine (−) or 50 μg mL−1 methionine (+). M6 through M9 were not examined in vivo.

FIGS. 33A, 33B, 33C and 33D show metabolite-induced transcription termination of several mRNAs that carry a SAM riboswitch. FIG. 33A shows that in vitro transcription using T7 RNA polymerase results in increased termination of four mRNA leader sequences. Reactions were conducted in the absence (−) or presence (+) of 50 μM of the effector as indicated for each lane. For example, the metI template includes the 5′ UTR and coding sequences through mRNA position 242, while the termination site is expected to occur at position 189. Below each gel is indicated the percentage of transcription termination (T) at the expected location relative to the total amount of expected termination plus full length RNA (FL). FIGS. 33B-33D show sequence and structural model for the metI riboswitch in two structural states (SEQ ID NO:101). Residues shown in hexagons and squares correspond to the P1 (anti-anti-terminator) and the terminator stems, respectively. The encircled residues correspond to the anti-terminator stem. Sequences boxed in black define the location and identity of mutations used to examine the proposed mechanism of genetic control. Gel: Analysis of mutant metI riboswitches wherein disruptive (Ma, Mab and Mc) or the corresponding compensatory mutations (Mabc) have been inserted. The metI mutant templates and wild-type control template (WT) are identical to the templates used in A, except that the FL product is 220 nucleotides. Other notations are as described in A.
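The percentage of transcription termination reported below each gel in FIG. 33A is the terminated product (T) relative to the sum of terminated and full length (FL) product. A minimal sketch of that arithmetic follows; the function name and the band intensities in the example are hypothetical and are not taken from the figure.

```python
def percent_termination(terminated: float, full_length: float) -> float:
    """Percentage of terminated product relative to T + FL.

    Both arguments are band intensities in the same arbitrary units.
    """
    total = terminated + full_length
    if total <= 0:
        raise ValueError("T + FL must be positive")
    return 100.0 * terminated / total

# Hypothetical band intensities for one lane.
print(percent_termination(terminated=30.0, full_length=70.0))  # 30.0
```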

FIGS. 34A and 34B show that the Bacillus species subtilis and anthracis bind SAM with different affinities. FIG. 34A shows structural modulation of the B. subtilis cysH aptamer as determined by in-line probing (SEQ ID NO:102). Inset: Apparent KD values determined by monitoring structural modulation over a range of SAM or SAM analog concentrations. Two G residues (asterisks) were included at the 5′ terminus of the RNA construct to facilitate in vitro transcription. Nucleotide numbers are given relative to the putative transcription start site. In-line probing was conducted with an RNA extending to nucleotide 117, while the remainder of the RNA is shown to depict the putative transcription terminator stem. Experiments were similar to those described in FIG. 30B and FIG. 31B. See the legend for FIG. 30B for details. FIG. 34B shows structural modulation of the B. anthracis cysH aptamer as determined by in-line probing (SEQ ID NO:103). The transcription start point of the B. anthracis cysH mRNA has not been determined, and so numbering of nucleotides begins immediately after the two inserted G residues (asterisks). In-line probing was conducted with an RNA extending to nucleotide 112. See A for additional details.

FIGS. 35A, 35B and 35C show guanine- and adenine-specific riboswitches. FIG. 35A shows sequence and structural features of the two guanine-specific (purE and xpt) and three adenine-specific aptamer domains that are examined in this study: BS2-purE, BS3-xpt, BS5-ydhL, CP4-add, VV1-add, which are represented by SEQ ID NOs:104-108, respectively. P1 through P3 identify the three base-paired stems comprising the secondary structure of the aptamer domain. Lowercase nucleotides identify positions whose base identity is conserved in greater than 90% of representatives in the phylogeny. The arrow identifies a nucleotide within the conserved core of the aptamer that is a determinant of ligand specificity. BS, CP and VV designate B. subtilis, Clostridium perfringens and Vibrio vulnificus, respectively. FIG. 35B shows sequence and secondary structure of the xpt and ydhL aptamers (SEQ ID NO:109). Encircled nucleotides identify positions within the ydhL aptamer that differ from those in the xpt aptamer. The sequence disclosed in FIG. 35C is SEQ ID NO:110. Nucleotides in xpt are numbered as described in Example 6. Other notations are as described in A.

FIGS. 36A, 36B, 36C, 36D and 36E show the ligand specificity of five G box RNAs. (A through E) In-line probing assays for the conserved aptamer domains as labeled. NR, T1 and −OH identify marker lanes wherein precursor RNAs (Pre) were not incubated, or were partially digested with RNase T1 or alkali, respectively. Selected bands corresponding to RNase T1 digestion (cleavage 3′ relative to guanidyl residues) are labeled for each RNA. RNAs were incubated for 40 hr in the absence of ligand (−), or in the presence of 1 μM guanine (G) or adenine (A). Large arrowheads identify sites of substantial change in cleavage pattern that is due to the addition of a particular ligand. See Methods for additional details.

FIGS. 37A and 37B show the binding affinity of the ydhL aptamer for adenine. FIG. 37A shows the in-line probing assay for the 80 ydhL RNA at various concentrations of adenine. For each lane, sites 1 through 4 were quantitated and the fraction of RNA cleaved was used to determine the apparent KD. FIG. 37B shows a plot of the normalized fraction of RNA that has undergone spontaneous cleavage at sites 1 through 4 versus the concentration of adenine. See Example 8 for additional details.
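The apparent KD determination described for FIG. 37B (and for FIG. 31B) amounts to locating the ligand concentration at which the normalized fraction of RNA cleaved reaches half of its maximal modulation. The following is a minimal sketch of that idea, assuming a simple one-site binding model and interpolation on a logarithmic concentration axis. The function names and the synthetic data are illustrative assumptions and do not reproduce the data in the figures.

```python
import math

def one_site_fraction(ligand_conc: float, kd: float) -> float:
    """Normalized modulation expected for a one-site binding model (assumed)."""
    return ligand_conc / (kd + ligand_conc)

def apparent_kd(concs, fractions):
    """Estimate the concentration giving half-maximal modulation.

    concs     -- ligand concentrations (molar), any order
    fractions -- normalized fraction modulated (0 to 1) at each concentration
    Interpolates linearly in log10(concentration) between the two points
    that bracket a normalized value of 0.5.
    """
    points = sorted(zip(concs, fractions))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_kd = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_kd
    raise ValueError("data do not bracket half-maximal modulation")

# Synthetic titration generated from a hypothetical KD of 300 nM.
true_kd = 3e-7
concs = [10.0 ** e for e in range(-9, -3)]          # 1 nM to 100 uM
fractions = [one_site_fraction(c, true_kd) for c in concs]
print(f"estimated apparent KD: {apparent_kd(concs, fractions):.2e} M")
```

With real in-line probing data, the raw fraction cleaved at each site would first be normalized between its ligand-free and saturating values, and a least-squares fit would generally be preferable to interpolation.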

FIGS. 38A and 38B show the specificity of molecular recognition by the adenine aptamer from ydhL. FIG. 38A Top: Chemical structures of adenine, guanine and other purine analogs that exhibit measurable binding to the 80 ydhL RNA. Chemical changes relative to 2,6-DAP, which is the tightest-binding compound, are encircled. Bottom left: Plot of the apparent KD values for various purines. Bottom right: Model for the chemical features on adenine that serve as molecular recognition contacts for ydhL. Note that the importance of N7 and N9 has not been determined. Encircled arrow indicates that a contact could exist if a hydrogen bond donor is appended to C2. FIG. 38B shows chemical structures of various purines that are not bound by the 80 ydhL RNA (KD values poorer than 300 μM).

FIGS. 39A, 39B, 39C and 39D show interconversion of guanine- and adenine-specific aptamers. FIG. 39A Left: Plot of the normalized fraction of wild-type 93 xpt RNA cleavage product for a given site versus the logarithm of the concentration of ligand present during incubation in an in-line probing assay. Cleavage products monitored for modulation correspond to site 3 (FIG. 37A). Right: Plot of the fraction of the total counts per minute (cpm) present in chamber B relative to the total counts per minute from sides A and B of an equilibrium dialysis chamber. Values of ˜0.5 indicate an equal distribution of ligand (no binding) while values of ˜1 indicate that most of the ligand is bound to the RNA within side B of the chamber. (B, C, D) In-line probing plots and equilibrium dialysis plots for 93 xpt (C to U mutation), 80 ydhL, and 80 ydhL (U to C mutation), respectively. Details are described in A or in Example 8.
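The equilibrium dialysis readout described for FIG. 39A can be sketched as follows. The first function computes the plotted quantity from measured counts. The second gives the value expected under a simplified model that is an assumption of this sketch and is not stated in the figure legend: equal chamber volumes, RNA confined to side B and present in large excess over labeled ligand, and free ligand at equal concentration on both sides at equilibrium. All names and example numbers are hypothetical.

```python
def fraction_in_chamber_b(cpm_a: float, cpm_b: float) -> float:
    """Fraction of total counts per minute found in chamber B."""
    total = cpm_a + cpm_b
    if total <= 0:
        raise ValueError("total cpm must be positive")
    return cpm_b / total

def expected_fraction_in_b(rna_conc: float, kd: float) -> float:
    """Expected fraction in chamber B under the simplified model above.

    With free ligand F equal on both sides and bound ligand on side B
    equal to F * rna_conc / kd, the fraction in B is
    (1 + rna_conc/kd) / (2 + rna_conc/kd).
    rna_conc and kd must share the same units.
    """
    ratio = rna_conc / kd
    return (1.0 + ratio) / (2.0 + ratio)

print(fraction_in_chamber_b(3500, 3500))          # 0.5, no binding
print(expected_fraction_in_b(0.0, 1.0))           # 0.5, no RNA present
print(expected_fraction_in_b(3e-6, 3e-9) > 0.99)  # True, tight binding
```

The limits of the model agree with the legend: the fraction tends to 0.5 when no ligand is bound and approaches 1 when the RNA binds tightly.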

FIGS. 40A, 40B, 40C, 40D and 40E show a model for the genetic control of ydhL by an adenine riboswitch and its function as a gene-activating element. FIG. 40A sequence of the adenine riboswitch from B. subtilis ydhL and secondary structure models for the ‘ON’ and ‘OFF’ states for gene regulation (SEQ ID NO:111). FIG. 40B In vivo function of the wild-type ydhL riboswitch and of a variant form as determined by fusion to a β-galactosidase reporter gene.

FIGS. 41A-41BA show the sequence and types of riboswitches Bs01, Bs02, Bs03, Bs04, Bs05, Bs06, Bs07, Bs08, Bs09, Bs10, Bs11, Bh01, Bh02, Bh03, Bh04, Bh05, Oi01, Oi02, Oi03, Oi04, Oi05, Oi06, Oi07, Oi10, Oi08, Oi09, Oi10, Oi11, Oi12, Oi13, Ca01, Ca02, Ca03, Ca04, Ca05, Ca06, Ca07, Cp01, Cp02, Lm01, Lm02, Lm03, Lm04, Lm05, Lm06, Lm07, Li01, Li02, Li03, Li04, Li05, Li06, Li07, Sa01, Sa02, Sa03, Sa04, Sc01, Ct01, Tt01, Tt02, Tt03, Fn01, Fn02, Dr01, Dr02, Xa01, Xc01, Se01, Se02, Gs01, Gs02, Ba01, Ba02, Ba03, Ba04, Ba05, Ba06, Ba07, Ba08, Ba09, Ba10, Ba11, Ba12, Ba13, Ba14, Ba15, Ba16, Ba17, Bc01, Bc02, Bc03, Bc04, Bc05, Bc06, Bc07, Bc08, Bc09, Bc10, Bc11, Bc12, Bc13, Bc14, Bc15, Bc16, Bc17, Bc18, Atu01, Atu02, Atu03, Atu04, Atu05, Atu06, Bha01, Bha02, Bha03, Bha04, Bsu01, Bja01, Bja02, Bja03, Bja04, Bja05, Bme01, Bme02, Bme03, Bme04, Ccr01, Ccr02, Cte01, Cte02, Cte03, Cte04, Cte05, Cac01, Cac02, Cpe01, Cpe02, Cpe03, Cpe04, Eco01, Fnu01, Lig01, Lmo01, Mlo01, Mlo02, Mlo03, Mlo04, Mlo05, Mlo06, Mle01, Mtu01, Mtu02, Pae01, Pae02, Pae03, Pae04, Ppu01, Ppu02, Ppu03, Ppu04, Rso01, Sme01, Sme02, Sme03, Sme04, Sme05, Sco01, Sco02, Sco03, Sco04, Sco05, Sfl01, Son01, Son02, Sti01, Sti02, Tma01, Tte01, Tte02, Veh01, Vvu01, Xac01, Xax01, Ype01, Aca01, Avi01, Bfr01, Bmg01, Lma01, Pfr01, Rca01, Rca02, Rca03, Rsp01, Sbi01, Sgi01, Svi01, Zmo01, Zmo02, NC—002570.1/648448-648540, NC—002570.1/650317-650406, NC—002570.1/676483-676572, NC—002570.1/806882-806965, NC—002570.1/1593067-1592976, NC—000964.1/693955-694038, NC—000964.1/697886-697976, NC—000964.1/2319120-2319031, NC—000964.1/4004319-4004410, NC—003030.1/1002184-1002270, NC—003030.1/2904259-2904168, NC—003030.1/2824539-2824454, NC—003366.1/422828-422924, NC—003366.1/512410-512323, NC—003366.1/2617892-2617807, NC—003454.1/1645257-1645173, NC—002662.1/1159519-1159604, NC—003210.1/610773-610679, NC—003210.1/1958601-1958511, NC—004193.1/760480-760571, NC—004193.1/769695-769781, NC—004193.1/786775-786863, NC—004193.1/1103947-1104044, 
NC—002745.1/430771-430861, NC—004461.1/2432384-2432294, NC—004116.1/1093950-1093860, NC—002737.1/930757-930842, NC—003028.1/1754791-1754878, NC—003869.1/586372-586463, NC—000964.1/626134-626051, NC—003366.1/2870819-2870732, NC—004460.1/504378-504467, Bha_LysC, Bha_dapA, Bha_nhaC, Bsu_LysC, Cac_lysA, Cpe_nhaC, Cpe_lysA, Cpe_lysP, Eco_lysC, Hin_nhaC, Oih_dapA, Oih_nhaC, Pmu_nhaC, Sau_lysC, Sau_lysP, Sep_lysC, Sep_lysP, Sfl_lysC, Son_lysC, Son_nhaC, Tma_asd, Tte_lysA, Tte_pspF, Vch_lysC, Vch_nhaC, Vch_nhaC, 2Vvu_lysC, Vvu_nhaC, Cons, Cons and Consensus, which are represented by SEQ ID NOs:112-374, respectively.

DETAILED DESCRIPTION OF THE INVENTION

The disclosed methods and compositions can be understood more readily by reference to the following detailed description of particular embodiments and the Examples included therein and to the Figures and their previous and following description.

Certain natural mRNAs serve as metabolite-sensitive genetic switches wherein the RNA directly binds a small organic molecule. This binding process changes the conformation of the mRNA, which causes a change in gene expression by a variety of different mechanisms. Modified versions of these natural “riboswitches” (created by using various nucleic acid engineering strategies) can be employed as designer genetic switches that are controlled by specific effector compounds (referred to herein as trigger molecules). The natural switches are targets for antibiotics and other small molecule therapies. In addition, the architecture of riboswitches allows actual pieces of the natural switches to be used to construct new non-immunogenic genetic control elements, for example the aptamer (molecular recognition) domain can be swapped with other non-natural aptamers (or otherwise modified) such that the new recognition domain causes genetic modulation with user-defined effector compounds. The changed switches become part of a therapy regimen—turning on, or off, or regulating protein synthesis. Newly constructed genetic regulation networks can be applied in such areas as living biosensors, metabolic engineering of organisms, and in advanced forms of gene therapy treatments.

Messenger RNAs are typically thought of as passive carriers of genetic information that are acted upon by protein- or small RNA-regulatory factors and by ribosomes during the process of translation. It was discovered that certain mRNAs carry natural aptamer domains and that binding of specific metabolites directly to these RNA domains leads to modulation of gene expression. Natural riboswitches exhibit two surprising functions that are not typically associated with natural RNAs. First, the mRNA element can adopt distinct structural states wherein one structure serves as a precise binding pocket for its target metabolite. Second, the metabolite-induced allosteric interconversion between structural states causes a change in the level of gene expression by one of several distinct mechanisms. Riboswitches typically can be dissected into two separate domains: one that selectively binds the target (aptamer domain) and another that influences genetic control (expression platform). It is the dynamic interplay between these two domains that results in metabolite-dependent allosteric control of gene expression.

As disclosed herein, distinct classes of riboswitches have been identified and are shown to selectively recognize activating compounds (referred to herein as trigger molecules). For example, coenzyme B12, thiamine pyrophosphate (TPP), and flavin mononucleotide (FMN) activate riboswitches present in genes encoding key enzymes in metabolic or transport pathways of these compounds. The aptamer domain of each riboswitch class conforms to a highly conserved consensus sequence and structure. Thus, sequence homology searches can be used to identify related riboswitch domains. Riboswitch domains have been discovered in various organisms from bacteria, archaea, and eukarya.

One class of riboswitches that recognizes guanine and discriminates against most other purine analogs has been discovered. Representative RNAs that carry the consensus sequence and structural features of guanine riboswitches are located in the 5′-untranslated region (UTR) of numerous genes of prokaryotes, where they control expression of proteins involved in purine salvage and biosynthesis. Three representatives of this phylogenetic collection bind adenine with values for apparent dissociation constant (apparent KD) that are several orders of magnitude better than for guanine. The preference for adenine is due to a single nucleotide substitution in the core of the riboswitch, wherein each representative most likely recognizes its corresponding ligand by forming a Watson/Crick base pair. In addition, the adenine-specific riboswitch associated with the ydhL gene of Bacillus subtilis functions as a genetic ‘ON’ switch, wherein adenine binding causes a structural rearrangement that precludes formation of an intrinsic transcription terminator stem. Guanine-sensing riboswitches are a class of RNA genetic control elements that modulate gene expression in response to changing concentrations of this compound.

It was discovered that the 5′-untranslated sequence of the Escherichia coli btuB mRNA assumes a more proactive role in metabolic monitoring and genetic control. The mRNA serves as a metabolite-sensing genetic switch by selectively binding coenzyme B12 without the need for proteins. This binding event establishes a distinct RNA structure that is likely to be responsible for inhibition of ribosome binding and consequent reduction in synthesis of the cobalamin transport protein BtuB. This discovery, along with related observations described herein, supports the hypothesis that metabolic monitoring through RNA-metabolite interactions is a widespread mechanism of genetic control.

RNA structure probing data indicate that the thiamine pyrophosphate (TPP) riboswitch operates as an allosteric sensor of its target compound, wherein binding of TPP by the aptamer domain stabilizes a conformational state within the aptamer and within the neighboring expression platform that precludes translation. The diversity of expression platforms appears to be expansive. The thiM RNA uses a Shine-Dalgarno (SD)-blocking mechanism to control translation. In contrast, the thiC RNA controls gene expression both at transcription and translation, and therefore might make use of a somewhat more complex expression platform that converts the TPP binding event into a transcription termination event and into inhibition of translation of completed mRNAs.

A. General Organization of Riboswitch RNAs

Bacterial riboswitch RNAs are genetic control elements that are located primarily within the 5′-untranslated region (5′-UTR) of the main coding region of a particular mRNA. Structural probing studies (discussed further below) reveal that riboswitch elements are generally composed of two domains: a natural aptamer (T. Hermann, D. J. Patel, Science 2000, 287, 820; L. Gold, et al., Annual Review of Biochemistry 1995, 64, 763) that serves as the ligand-binding domain, and an ‘expression platform’ that interfaces with RNA elements that are involved in gene expression (e.g. Shine-Dalgarno (SD) elements; transcription terminator stems). These conclusions are drawn from the observation that aptamer domains synthesized in vitro bind the appropriate ligand in the absence of the expression platform (see Examples 2, 3 and 6). Moreover, structural probing investigations suggest that the aptamer domain of most riboswitches adopts a particular secondary- and tertiary-structure fold when examined independently, that is essentially identical to the aptamer structure when examined in the context of the entire 5′ leader RNA. This implies that, in many cases, the aptamer domain is a modular unit that folds independently of the expression platform (see Examples 2, 3 and 6).

Ultimately, the ligand-bound or unbound status of the aptamer domain is interpreted through the expression platform, which is responsible for exerting an influence upon gene expression. The view of a riboswitch as a modular element is further supported by the fact that aptamer domains are highly conserved amongst various organisms (and even between kingdoms as is observed for the TPP riboswitch), (N. Sudarsan, et al., RNA 2003, 9, 644) whereas the expression platform varies in sequence, structure, and in the mechanism by which expression of the appended open reading frame is controlled. For example, ligand binding to the TPP riboswitch of the tenA mRNA of B. subtilis causes transcription termination (A. S. Mironov, et al., Cell 2002, 111, 747). This expression platform is distinct in sequence and structure compared to the expression platform of the TPP riboswitch in the thiM mRNA from E. coli, wherein TPP binding causes inhibition of translation by a SD blocking mechanism (see Example 2). The TPP aptamer domain is easily recognizable and of near identical functional character between these two transcriptional units, but the genetic control mechanisms and the expression platforms that carry them out are very different.

Aptamer domains for riboswitch RNAs typically range from ˜70 to 170 nt in length (FIG. 11). This observation was somewhat unexpected given that in vitro evolution experiments identified a wide variety of small molecule-binding aptamers, which are considerably shorter in length and structural intricacy (T. Hermann, D. J. Patel, Science 2000, 287, 820; L. Gold, et al., Annual Review of Biochemistry 1995, 64, 763; M. Famulok, Current Opinion in Structural Biology 1999, 9, 324). Although the reasons for the substantial increase in complexity and information content of the natural aptamer sequences relative to artificial aptamers remain to be proven, this complexity is most likely required to form RNA receptors that function with high affinity and selectivity. Apparent KD values for the ligand-riboswitch complexes range from low nanomolar to low micromolar. It is also worth noting that some aptamer domains, when isolated from the appended expression platform, exhibit improved affinity for the target ligand over that of the intact riboswitch (˜10 to 100-fold) (see Example 2). Presumably, there is an energetic cost in sampling the multiple distinct RNA conformations required by a fully intact riboswitch RNA, which is reflected by a loss in ligand affinity. Since the aptamer domain must serve as a molecular switch, this might also add to the functional demands on natural aptamers, which might help rationalize their more sophisticated structures.
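The energetic cost implied by a ˜10 to 100-fold difference in affinity can be estimated from the standard thermodynamic relation ΔΔG = RT ln(fold change in KD). The sketch below is illustrative only; the temperature of 25° C. is an assumption chosen for the example, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal mol^-1 K^-1

def binding_energy_difference(fold_change: float, temp_kelvin: float = 298.15) -> float:
    """Free energy difference (kcal/mol) for a given fold change in KD.

    Uses delta-delta-G = R * T * ln(fold_change); 298.15 K (25 C) is an
    assumed temperature for this example.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return GAS_CONSTANT_KCAL * temp_kelvin * math.log(fold_change)

for fold in (10, 100):
    print(f"{fold}-fold: {binding_energy_difference(fold):.2f} kcal/mol")
```

Under these assumptions, a 10-fold to 100-fold difference in apparent KD corresponds to roughly 1.4 to 2.7 kcal/mol.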

B. Riboswitch Regulation of Transcription Termination in Bacteria

Bacteria primarily make use of two methods for termination of transcription. Certain genes incorporate a termination signal that is dependent upon the Rho protein (J. P. Richardson, Biochimica et Biophysica Acta 2002, 1577, 251), while others make use of Rho-independent terminators (intrinsic terminators) to destabilize the transcription elongation complex (I. Gusarov, E. Nudler, Molecular Cell 1999, 3, 495; E. Nudler, M. E. Gottesman, Genes to Cells 2002, 7, 755). The latter RNA elements are composed of a GC-rich stem-loop followed by a stretch of 6-9 uridyl residues. Intrinsic terminators are widespread throughout bacterial genomes (F. Lillo, et al., 2002, 18, 971), and are typically located at the 3′-termini of genes or operons. Interestingly, an increasing number of examples are being observed for intrinsic terminators located within 5′-UTRs.
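The architecture of an intrinsic terminator described above (a GC-rich stem-loop followed by a run of uridyl residues) can be illustrated with a naive scanner. This is a rough sketch and not a validated terminator predictor: the stem and loop size limits, the GC threshold, the allowance for G-U wobble pairs, the requirement for a perfect stem directly abutting the U-tract, and all function names are assumptions chosen for illustration.

```python
import re

# Base pairs accepted in the stem (Watson-Crick plus G-U wobble, an assumption).
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def find_hairpin_ending_at(rna, end, min_stem=5, max_stem=12,
                           min_loop=3, max_loop=8, min_gc=0.6):
    """Look for a perfect hairpin whose 3' arm ends just before index `end`.

    Returns (start_index, stem_length, loop_length) or None.
    """
    for stem in range(max_stem, min_stem - 1, -1):
        for loop in range(min_loop, max_loop + 1):
            start = end - (2 * stem + loop)
            if start < 0:
                continue
            five_arm = rna[start:start + stem]
            three_arm = rna[end - stem:end]
            paired = all((a, b) in PAIRS
                         for a, b in zip(five_arm, reversed(three_arm)))
            if not paired:
                continue
            gc = sum(base in "GC" for base in five_arm + three_arm) / (2 * stem)
            if gc >= min_gc:
                return start, stem, loop
    return None

def candidate_terminators(seq):
    """Return (start, end) spans of hairpin-plus-U-tract candidates."""
    rna = seq.upper().replace("T", "U")
    hits = []
    for match in re.finditer(r"U{6,}", rna):   # run of at least six U residues
        hairpin = find_hairpin_ending_at(rna, match.start())
        if hairpin is not None:
            hits.append((hairpin[0], match.end()))
    return hits

# Hypothetical sequence: 6-bp GC stem, UUCG loop, seven U residues.
toy = "AAAAGCCGCCUUCGGGCGGCUUUUUUUAA"
print(candidate_terminators(toy))  # [(4, 27)]
```

Real terminators often contain mismatches or bulges in the stem and spacer nucleotides before the U-tract, so dedicated tools that score folding energy would be needed for genome-scale searches.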

Amongst the wide variety of genetic regulatory strategies employed by bacteria there is a growing class of examples wherein RNA polymerase responds to a termination signal within the 5′-UTR in a regulated fashion (T. M. Henkin, Current Opinion in Microbiology 2000, 3, 149). During certain conditions the RNA polymerase complex is directed by external signals either to perceive or to ignore the termination signal. Although transcription initiation might occur without regulation, control over mRNA synthesis (and of gene expression) is ultimately dictated by regulation of the intrinsic terminator. Presumably, one of at least two mutually exclusive mRNA conformations results in the formation or disruption of the RNA structure that signals transcription termination. A trans-acting factor, which in some instances is an RNA (F. J. Grundy, et al., Proceedings of the National Academy of Sciences of the United States of America 2002, 99, 11121; T. M. Henkin, C. Yanofsky, Bioessays 2002, 24, 700) and in others is a protein (J. Stulke, Archives of Microbiology 2002, 177, 433), is generally required for receiving a particular intracellular signal and subsequently stabilizing one of the RNA conformations. Riboswitches offer a direct link between RNA structure modulation and the metabolite signals that are interpreted by the genetic control machinery. A brief overview of the FMN riboswitch from a B. subtilis mRNA is provided below to illustrate this mechanism.

It was discovered that certain mRNAs involved in thiamine biosynthesis bind to thiamine (vitamin B1) or its bioactive pyrophosphate derivative (TPP) without the participation of protein factors. The mRNA-effector complex adopts a distinct structure that sequesters the ribosome-binding site and leads to a reduction in gene expression. This metabolite-sensing mRNA system provides an example of a genetic “riboswitch” (referred to herein as a riboswitch) whose origin might predate the evolutionary emergence of proteins. It has been discovered that the mRNA leader sequence of the btuB gene of Escherichia coli can bind coenzyme B12 selectively, and that this binding event brings about a structural change in the RNA that is important for genetic control (see Example 1). It was also discovered that mRNAs that encode thiamine biosynthetic proteins also employ a riboswitch mechanism (see Example 2).

It was also discovered that the 5′-UTR of the lysC gene of Bacillus subtilis carries a conserved RNA element that serves as a lysine-responsive riboswitch. The ligand-binding domain of the riboswitch binds to L-lysine with an apparent dissociation constant (KD) of approximately 1 μM, and exhibits a high level of molecular discrimination against closely related analogs including D-lysine and ornithine. This widespread class of riboswitches serves as a target for the antimicrobial agent thiosine.

It was also discovered that the xpt-pbuX operon (Christiansen, L. C., et al., 1997, J. Bacteriol. 179, 2540-2550) is controlled by a riboswitch that exhibits high affinity and high selectivity for guanine. This class of riboswitches is present in the 5′-untranslated region (5′-UTR) of five transcriptional units in B. subtilis, including that of the 12-gene pur operon. Direct binding of guanine by mRNAs serves as a critical determinant of metabolic homeostasis for purine metabolism in certain bacteria. Furthermore, the discovered classes of riboswitches, which respond to seven distinct target molecules, control at least 68 genes in Bacillus subtilis that are of fundamental importance to central metabolic pathways.

It was discovered that a highly conserved RNA domain termed the S box serves as a selective aptamer for SAM. Allosteric modulation of secondary and tertiary structures is induced upon SAM binding to the aptamer domain, and these structural changes are responsible for inducing termination of mRNA transcription.

A variant class of riboswitches that responds to adenine is also disclosed. These riboswitches carry an aptamer domain that corresponds closely in sequence and secondary structure to the guanine aptamer. However, each representative of the adenine sub-class of riboswitches carries a C to U mutation in the conserved core of the aptamer, indicating that this residue is involved in metabolite recognition. The identity of this single nucleotide determines the binding specificity between guanine and adenine, which provides an example of how complex riboswitch structures can be mutated to recognize new metabolite targets.
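The single-nucleotide specificity rule described above can be summarized with a small sketch. It assumes, as the description suggests, that the ligand is recognized through a Watson/Crick pair with the specificity-determining residue, so that C selects guanine and U selects adenine. The function names are hypothetical.

```python
# Ligand predicted to pair with the specificity-determining core nucleotide,
# assuming Watson/Crick pairing (C with guanine, U with adenine).
LIGAND_FOR_CORE_NUCLEOTIDE = {"C": "guanine", "U": "adenine"}

def predicted_ligand(core_nucleotide: str) -> str:
    """Return the purine ligand expected for a given core nucleotide."""
    try:
        return LIGAND_FOR_CORE_NUCLEOTIDE[core_nucleotide.upper()]
    except KeyError:
        raise ValueError("no prediction for nucleotide %r" % core_nucleotide)

def swap_specificity(core_nucleotide: str) -> str:
    """Return the mutated nucleotide that would switch ligand preference."""
    return {"C": "U", "U": "C"}[core_nucleotide.upper()]

print(predicted_ligand("C"))                    # guanine
print(predicted_ligand(swap_specificity("C")))  # adenine
```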

Although the specific natural riboswitches disclosed herein are the first examples of mRNA elements that control genetic expression by metabolite binding, it is expected that this genetic control strategy is widespread in biology. It has been suggested (White III, Coenzymes as fossils of an earlier metabolic state. J. Mol. Evol. 7, 101-104 (1976); White III, In: The Pyridine Nucleotide Coenzymes. Acad. Press, NY pp. 1-17 (1982); Benner et al., Modern metabolism as a palimpsest of the RNA world. Proc. Natl. Acad. Sci. USA 86, 7054-7058 (1989)) that TPP, coenzyme B12 and FMN emerged as biological cofactors during the RNA world (Joyce, The antiquity of RNA-based evolution. Nature 418, 214-221 (2002)). If these metabolites were being biosynthesized and used before the advent of proteins, then certain riboswitches might be modern examples of the most ancient form of genetic control. A search of genomic sequence databases has revealed that sequences corresponding to the TPP aptamer exist in organisms from bacteria, archaea and eukarya—largely without major alteration. Although new metabolite-binding mRNAs are likely to emerge as evolution progresses, it is possible that the known riboswitches are molecular fossils from the RNA world.

Disclosed are mRNA elements that have been identified in fungi and in plants that match the consensus sequence and structure of thiamine pyrophosphate-binding domains of prokaryotes. In Arabidopsis, the consensus motif resides in the 3′-UTR of a thiamine biosynthetic gene, and the isolated RNA domain binds the corresponding coenzyme in vitro. These results indicate that metabolite-binding mRNAs are involved in eukaryotic gene regulation and that some riboswitches might be representatives of an ancient form of genetic control.

It is to be understood that the disclosed method and compositions are not limited to specific synthetic methods, specific analytical techniques, or to particular reagents unless otherwise specified, and, as such, can vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.

Materials

Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed that while specific reference to each of various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a riboswitch or aptamer domain is disclosed and discussed and a number of modifications that can be made to a number of molecules including the riboswitch or aptamer domain are discussed, each and every combination and permutation of riboswitch or aptamer domain and the modifications that are possible are specifically contemplated unless specifically indicated to the contrary. Thus, if a class of molecules A, B, and C are disclosed as well as a class of molecules D, E, and F and an example of a combination molecule, A-D is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. This concept applies to all aspects of this application including, but not limited to, steps in methods of making and using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods, and that each such combination is specifically contemplated and should be considered disclosed.
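The combinations contemplated in the A, B, C and D, E, F example can be enumerated mechanically. The following is a minimal illustrative sketch only; the variable and function names are hypothetical and are not part of the disclosure.

```python
from itertools import product

# The two example classes of molecules named in the text.
class_one = ["A", "B", "C"]
class_two = ["D", "E", "F"]

def all_combinations(first, second):
    """Return every pairwise combination 'X-Y' of one member from each class."""
    return [f"{x}-{y}" for x, y in product(first, second)]

combos = all_combinations(class_one, class_two)
print(combos)
```

The list contains the recited example A-D together with the eight further combinations (A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F) that the text states should be considered disclosed.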

A. Riboswitches

Riboswitches are expression control elements that are part of the RNA molecule to be expressed and that change state when bound by a trigger molecule. Riboswitches typically can be dissected into two separate domains: one that selectively binds the target (aptamer domain) and another that influences genetic control (expression platform domain). It is the dynamic interplay between these two domains that results in metabolite-dependent allosteric control of gene expression. Disclosed are isolated and recombinant riboswitches, recombinant constructs containing such riboswitches, heterologous sequences operably linked to such riboswitches, and cells and transgenic organisms harboring such riboswitches, riboswitch recombinant constructs, and riboswitches operably linked to heterologous sequences. The heterologous sequences can be, for example, sequences encoding proteins or peptides of interest, including reporter proteins or peptides. Preferred riboswitches are, or are derived from, naturally occurring riboswitches.

The disclosed riboswitches, including the derivatives and recombinant forms thereof, generally can be from any source, including naturally occurring riboswitches and riboswitches designed de novo. Any such riboswitches can be used in or with the disclosed methods. However, different types of riboswitches can be defined and some such sub-types can be useful in or with particular methods (generally as described elsewhere herein). Types of riboswitches include, for example, naturally occurring riboswitches, derivatives and modified forms of naturally occurring riboswitches, chimeric riboswitches, and recombinant riboswitches. A naturally occurring riboswitch is a riboswitch having the sequence of a riboswitch as found in nature. Such a naturally occurring riboswitch can be an isolated or recombinant form of the naturally occurring riboswitch as it occurs in nature. That is, the riboswitch has the same primary structure but has been isolated or engineered in a new genetic or nucleic acid context. Chimeric riboswitches can be made up of, for example, part of a riboswitch of any or of a particular class or type of riboswitch and part of a different riboswitch of the same or of any different class or type of riboswitch; or part of a riboswitch of any or of a particular class or type of riboswitch and any non-riboswitch sequence or component. Recombinant riboswitches are riboswitches that have been isolated or engineered in a new genetic or nucleic acid context.

Different classes of riboswitches refer to riboswitches that have the same or similar trigger molecules or riboswitches that have the same or similar overall structure (predicted, determined, or a combination). Riboswitches of the same class generally, but need not, have both the same or similar trigger molecules and the same or similar overall structure.

Also disclosed are chimeric riboswitches containing heterologous aptamer domains and expression platform domains. That is, chimeric riboswitches are made up of an aptamer domain from one source and an expression platform domain from another source. The heterologous sources can be from, for example, different specific riboswitches, different types of riboswitches, or different classes of riboswitches. The heterologous aptamers can also come from non-riboswitch aptamers. The heterologous expression platform domains can also come from non-riboswitch sources.

Riboswitches can be modified from other known, developed or naturally-occurring riboswitches. For example, switch domain portions can be modified by changing one or more nucleotides while preserving the known or predicted secondary, tertiary, or both secondary and tertiary structure of the riboswitch. For example, both nucleotides in a base pair can be changed to nucleotides that can also base pair. Changes that allow retention of base pairing are referred to herein as base pair conservative changes.
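A base pair conservative change can be illustrated with a short sketch. This is an illustration under stated assumptions, not a disclosed procedure: the hairpin sequence, the paired positions, and the function names are hypothetical, and the allowed pairs are taken to be the Watson-Crick pairs plus the G-U wobble pair.

```python
# Pairs treated as able to base pair in RNA (assumption: Watson-Crick plus G-U wobble).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def can_pair(x, y):
    """Return True if nucleotides x and y can base pair."""
    return (x, y) in PAIRS

def base_pair_conservative_change(seq, i, j, new_i, new_j):
    """Replace both nucleotides of the pair at positions i and j (0-based),
    accepting the change only if the new nucleotides can also base pair."""
    if not can_pair(seq[i], seq[j]):
        raise ValueError("positions i and j are not base paired in the input")
    if not can_pair(new_i, new_j):
        raise ValueError("replacement nucleotides cannot base pair")
    s = list(seq)
    s[i], s[j] = new_i, new_j
    return "".join(s)

# Hypothetical hairpin: stem GGCA/UGCC closed by a UUCG loop.
# Stem pairs (0-based): 0-11, 1-10, 2-9, 3-8.
hairpin = "GGCAUUCGUGCC"
variant = base_pair_conservative_change(hairpin, 3, 8, "C", "G")
print(variant)  # GGCCUUCGGGCC
```

Here the A-U pair closing the loop is replaced by a C-G pair, so the stem retains four base pairs even though two nucleotides have changed.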

Modified or derivative riboswitches can also be produced using in vitro selection and evolution techniques. In general, in vitro evolution techniques as applied to riboswitches involve producing a set of variant riboswitches where part(s) of the riboswitch sequence is varied while other parts of the riboswitch are held constant. Activation, deactivation or blocking (or other functional or structural criteria) of the set of variant riboswitches can then be assessed and those variant riboswitches meeting the criteria of interest are selected for use or further rounds of evolution. Useful base riboswitches for generation of variants are the specific and consensus riboswitches disclosed herein. Consensus riboswitches can be used to inform which part(s) of a riboswitch to vary for in vitro selection and evolution.

Also disclosed are modified riboswitches with altered regulation. The regulation of a riboswitch can be altered by operably linking an aptamer domain to the expression platform domain of the riboswitch (which is a chimeric riboswitch). The aptamer domain can then mediate regulation of the riboswitch through the action of, for example, a trigger molecule for the aptamer domain. Aptamer domains can be operably linked to expression platform domains of riboswitches in any suitable manner, including, for example, by replacing the normal or natural aptamer domain of the riboswitch with the new aptamer domain. Generally, any compound or condition that can activate, deactivate or block the riboswitch from which the aptamer domain is derived can be used to activate, deactivate or block the chimeric riboswitch.

Also disclosed are inactivated riboswitches. Riboswitches can be inactivated by covalently altering the riboswitch (by, for example, crosslinking parts of the riboswitch or coupling a compound to the riboswitch). Inactivation of a riboswitch in this manner can result from, for example, an alteration that prevents the trigger molecule for the riboswitch from binding, that prevents the change in state of the riboswitch upon binding of the trigger molecule, or that prevents the expression platform domain of the riboswitch from affecting expression upon binding of the trigger molecule.

Also disclosed are biosensor riboswitches. Biosensor riboswitches are engineered riboswitches that produce a detectable signal in the presence of their cognate trigger molecule. Useful biosensor riboswitches can be triggered at or above threshold levels of the trigger molecules. Biosensor riboswitches can be designed for use in vivo or in vitro. For example, biosensor riboswitches operably linked to a reporter RNA that encodes a protein that serves as or is involved in producing a signal can be used in vivo by engineering a cell or organism to harbor a nucleic acid construct encoding the riboswitch/reporter RNA. An example of a biosensor riboswitch for use in vitro is a riboswitch that includes a conformation dependent label, the signal from which changes depending on the activation state of the riboswitch. Such a biosensor riboswitch preferably uses an aptamer domain from or derived from a naturally occurring riboswitch. Biosensor riboswitches can be used in various situations and platforms. For example, biosensor riboswitches can be used with solid supports, such as plates, chips, strips and wells.

Also disclosed are modified or derivative riboswitches that recognize new trigger molecules. New riboswitches and/or new aptamers that recognize new trigger molecules can be selected for, designed or derived from known riboswitches. This can be accomplished by, for example, producing a set of aptamer variants in a riboswitch, assessing the activation of the variant riboswitches in the presence of a compound of interest, selecting variant riboswitches that were activated (or, for example, the riboswitches that were the most highly or the most selectively activated), and repeating these steps until a variant riboswitch of a desired activity, specificity, combination of activity and specificity, or other combination of properties results.
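The iterative cycle described above (produce variants, assess activation, select, repeat) can be sketched as a toy simulation. Everything specific here is an assumption made for illustration: the parent sequence, the variable positions, and in particular `toy_activity`, which is a hypothetical stand-in for an experimentally measured activation signal and has no biological meaning.

```python
import random

BASES = "ACGU"

def mutate(seq, variable_positions, rng):
    """Change one nucleotide at a randomly chosen variable position;
    all other positions are held constant."""
    pos = rng.choice(variable_positions)
    choices = [b for b in BASES if b != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]

def select_variants(parent, variable_positions, activity, rounds, pool_size, seed=0):
    """Repeat rounds of variation and selection, keeping the most active variant."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        pool = [mutate(best, variable_positions, rng) for _ in range(pool_size)]
        pool.append(best)  # retain the current best so activity never decreases
        best = max(pool, key=activity)
    return best

def toy_activity(seq, target="GACU"):
    """Hypothetical score: matches between positions 2-5 and an arbitrary target."""
    return sum(a == b for a, b in zip(seq[2:6], target))

parent = "GGAAAACC"  # hypothetical starting sequence
evolved = select_variants(parent, [2, 3, 4, 5], toy_activity, rounds=20, pool_size=10)
print(evolved, toy_activity(evolved))
```

In practice the scoring step would be an assay of riboswitch activation in the presence of the compound of interest, and the selection criterion could equally be specificity or a combination of properties.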

Particularly useful aptamer domains can form a stem structure referred to herein as the P1 stem structure (or simply P1). The P1 stems of a variety of riboswitches are shown in FIG. 11 (and in other figures). The hybridizing strands in the P1 stem structure are referred to as the aptamer strand (also referred to as the P1a strand) and the control strand (also referred to as the P1b strand). The control strand can form a stem structure with both the aptamer strand and a sequence in a linked expression platform that is referred to as the regulated strand (also referred to as the P1c strand). Thus, the control strand (P1b) can form alternative stem structures with the aptamer strand (P1a) and the regulated strand (P1c). Activation and deactivation of a riboswitch results in a shift from one of the stem structures to the other (from P1a/P1b to P1b/P1c or vice versa). The formation of the P1b/P1c stem structure affects expression of the RNA molecule containing the riboswitch. Riboswitches that operate via this control mechanism are referred to herein as alternative stem structure riboswitches (or as alternative stem riboswitches).

In general, any aptamer domain can be adapted for use with any expression platform domain by designing or adapting a regulated strand in the expression platform domain to be complementary to the control strand of the aptamer domain. Alternatively, the sequence of the aptamer and control strands of an aptamer domain can be adapted so that the control strand is complementary to a functionally significant sequence in an expression platform. For example, the control strand can be adapted to be complementary to the Shine-Dalgarno sequence of an RNA such that, upon formation of a stem structure between the control strand and the SD sequence, the SD sequence becomes inaccessible to ribosomes, thus reducing or preventing translation initiation. Note that the aptamer strand would have corresponding changes in sequence to allow formation of a P1 stem in the aptamer domain.
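The Shine-Dalgarno example can be sketched as a reverse-complement calculation. This is a minimal illustration assuming strict Watson-Crick pairing and using the consensus sequence AGGAGG as a stand-in SD sequence; real designs would also need to account for stem stability, wobble pairs, and binding requirements of the aptamer domain, none of which are modeled here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the strand that pairs antiparallel with the given RNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(rna))

# Assumption for illustration: the regulated sequence is the consensus SD sequence.
sd_sequence = "AGGAGG"

# Control strand adapted to be complementary to the SD sequence.
control_strand = reverse_complement(sd_sequence)

# Aptamer strand changed correspondingly so that it still forms the P1 stem
# with the control strand. Under strict Watson-Crick pairing it therefore
# has the same sequence as the SD sequence.
aptamer_strand = reverse_complement(control_strand)

print(control_strand)  # CCUCCU
print(aptamer_strand)  # AGGAGG
```

The control strand can then pair either with the aptamer strand (forming P1) or with the SD sequence (sequestering it from ribosomes), which is the alternative stem arrangement described above.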

As another example, a transcription terminator can be added to an RNA molecule (most conveniently in an untranslated region of the RNA) where part of the sequence of the transcription terminator is complementary to the control strand of an aptamer domain (the sequence will be the regulated strand). This will allow the control strand of the aptamer domain to form alternative stem structures with the aptamer strand and the regulated strand, thus either forming or disrupting a transcription terminator stem upon activation or deactivation of the riboswitch. Any other expression element can be brought under the control of a riboswitch by similar design of alternative stem structures.

For transcription terminators controlled by riboswitches, the speed of transcription and spacing of the riboswitch and expression platform elements can be important for proper control. Transcription speed can be adjusted by, for example, including polymerase pausing elements (e.g., a series of uridine residues) to pause transcription and allow the riboswitch to form and sense trigger molecules. For example, with the FMN riboswitch, if FMN is bound to its aptamer domain, then the antiterminator sequence is sequestered and is unavailable for formation of an antiterminator structure (FIG. 12). However, if FMN is absent, the antiterminator can form once its nucleotides emerge from the polymerase. RNAP then breaks free of the pause site only to reach another U-stretch and pause again. The transcriptional terminator then forms only if the terminator nucleotides are not tied up by the antiterminator.

Disclosed are regulatable gene expression constructs comprising a nucleic acid molecule encoding an RNA comprising a riboswitch operably linked to a coding region, wherein the riboswitch regulates expression of the RNA, wherein the riboswitch and coding region are heterologous. The riboswitch can comprise an aptamer domain and an expression platform domain, wherein the aptamer domain and the expression platform domain are heterologous. The riboswitch can comprise an aptamer domain and an expression platform domain, wherein the aptamer domain comprises a P1 stem, wherein the P1 stem comprises an aptamer strand and a control strand, wherein the expression platform domain comprises a regulated strand, wherein the regulated strand, the control strand, or both have been designed to form a stem structure.

Disclosed are riboswitches, wherein the riboswitch is a non-natural derivative of a naturally occurring riboswitch. The riboswitch can comprise an aptamer domain and an expression platform domain, wherein the aptamer domain and the expression platform domain are heterologous. The riboswitch can be derived from a naturally occurring guanine-responsive riboswitch, adenine-responsive riboswitch, lysine-responsive riboswitch, thiamine pyrophosphate-responsive riboswitch, adenosylcobalamin-responsive riboswitch, flavin mononucleotide-responsive riboswitch, or an S-adenosylmethionine-responsive riboswitch. The riboswitch can be activated by a trigger molecule, wherein the riboswitch produces a signal when activated by the trigger molecule.

Numerous riboswitches and riboswitch constructs are described and referred to herein. It is specifically contemplated that any specific riboswitch or riboswitch construct or group of riboswitches or riboswitch constructs can be excluded from some aspects of the invention disclosed herein. For example, fusion of the xpt-pbuX riboswitch with a reporter gene could be excluded from a set of riboswitches fused to reporter genes.

1. Aptamer Domains

Aptamers are nucleic acid segments and structures that can bind selectively to particular compounds and classes of compounds. Riboswitches have aptamer domains that, upon binding of a trigger molecule, result in a change in the state or structure of the riboswitch. In functional riboswitches, the state or structure of the expression platform domain linked to the aptamer domain changes when the trigger molecule binds to the aptamer domain. Aptamer domains of riboswitches can be derived from any source, including, for example, natural aptamer domains of riboswitches, artificial aptamers, engineered, selected, evolved or derived aptamers or aptamer domains. Aptamers in riboswitches generally have at least one portion that can interact, such as by forming a stem structure, with a portion of the linked expression platform domain. This stem structure will either form or be disrupted upon binding of the trigger molecule.

Consensus aptamer domains of a variety of natural riboswitches are shown in FIG. 11. These aptamer domains (including all of the direct variants embodied therein) can be used in riboswitches. The consensus sequences and structures indicate variations in sequence and structure. Aptamer domains that are within the indicated variations are referred to herein as direct variants. These aptamer domains can be modified to produce modified or variant aptamer domains. Conservative modifications include any change in base paired nucleotides such that the nucleotides in the pair remain complementary. Moderate modifications include changes in the length of stems or of loops (for which a length or length range is indicated) of less than or equal to 20% of the length range indicated. Loop and stem lengths are considered to be “indicated” where the consensus structure shows a stem or loop of a particular length or where a range of lengths is listed or depicted. Moderate modifications include changes in the length of stems or of loops (for which a length or length range is not indicated) of less than or equal to 40% of the length range indicated. Moderate modifications also include functional variants of unspecified portions of the aptamer domain. Unspecified portions of the aptamer domains are indicated by solid lines in FIG. 11.

The P1 stem and its constituent strands can be modified in adapting aptamer domains for use with expression platforms and RNA molecules. Such modifications, which can be extensive, are referred to herein as P1 modifications. P1 modifications include changes to the sequence and/or length of the P1 stem of an aptamer domain.

The aptamer domains shown in FIG. 11 (including any direct variants) are particularly useful as initial sequences for producing derived aptamer domains via in vitro selection or in vitro evolution techniques.

Aptamer domains of the disclosed riboswitches can also be used for any other purpose, and in any other context, as aptamers. For example, aptamers can be used to control ribozymes, other molecular switches, and any RNA molecule where a change in structure can affect function of the RNA.

2. Expression Platform Domains

Expression platform domains are a part of riboswitches that affect expression of the RNA molecule that contains the riboswitch. Expression platform domains generally have at least one portion that can interact, such as by forming a stem structure, with a portion of the linked aptamer domain. This stem structure will either form or be disrupted upon binding of the trigger molecule. The stem structure generally either is, or prevents formation of, an expression regulatory structure. An expression regulatory structure is a structure that allows, prevents, enhances or inhibits expression of an RNA molecule containing the structure. Examples include Shine-Dalgarno sequences, initiation codons, transcription terminators, and stability and processing signals.

B. Trigger Molecules

Trigger molecules are molecules and compounds that can activate a riboswitch. This includes the natural or normal trigger molecule for the riboswitch and other compounds that can activate the riboswitch. Natural or normal trigger molecules are the trigger molecule for a given riboswitch in nature or, in the case of some non-natural riboswitches, the trigger molecule for which the riboswitch was designed or with which the riboswitch was selected (as in, for example, in vitro selection or in vitro evolution techniques). Trigger molecules other than the natural or normal trigger molecule can be referred to as non-natural trigger molecules.

C. Compounds

Also disclosed are compounds, and compositions containing such compounds, that can activate, deactivate or block a riboswitch. Riboswitches function to control gene expression through the binding or removal of a trigger molecule. Compounds can be used to activate, deactivate or block a riboswitch. The trigger molecule for a riboswitch (as well as other activating compounds) can be used to activate a riboswitch. Compounds other than the trigger molecule generally can be used to deactivate or block a riboswitch. Riboswitches can also be deactivated by, for example, removing trigger molecules from the presence of the riboswitch. A riboswitch can be blocked by, for example, binding of an analog of the trigger molecule that does not activate the riboswitch.

Also disclosed are compounds for altering expression of an RNA molecule, or of a gene encoding an RNA molecule, where the RNA molecule includes a riboswitch. This can be accomplished by bringing a compound into contact with the RNA molecule. Riboswitches function to control gene expression through the binding or removal of a trigger molecule. Thus, subjecting an RNA molecule of interest that includes a riboswitch to conditions that activate, deactivate or block the riboswitch can be used to alter expression of the RNA. Expression can be altered as a result of, for example, termination of transcription or blocking of ribosome binding to the RNA. Binding of a trigger molecule can, depending on the nature of the riboswitch, reduce or prevent expression of the RNA molecule or promote or increase expression of the RNA molecule.

Also disclosed are compounds for regulating expression of an RNA molecule, or of a gene encoding an RNA molecule. Also disclosed are compounds for regulating expression of a naturally occurring gene or RNA that contains a riboswitch by activating, deactivating or blocking the riboswitch. If the gene is essential for survival of a cell or organism that harbors it, activating, deactivating or blocking the riboswitch can result in death, stasis or debilitation of the cell or organism.

Also disclosed are compounds for regulating expression of an isolated, engineered or recombinant gene or RNA that contains a riboswitch by activating, deactivating or blocking the riboswitch. If the gene encodes a desired expression product, activating or deactivating the riboswitch can be used to induce expression of the gene and thus result in production of the expression product. If the gene encodes an inducer or repressor of gene expression or of another cellular process, activation, deactivation or blocking of the riboswitch can result in induction, repression, or de-repression of other, regulated genes or cellular processes. Many such secondary regulatory effects are known and can be adapted for use with riboswitches. An advantage of riboswitches as the primary control for such regulation is that riboswitch trigger molecules can be small, non-antigenic molecules.

Also disclosed are methods of identifying compounds that activate, deactivate or block a riboswitch. For example, compounds that activate a riboswitch can be identified by bringing into contact a test compound and a riboswitch and assessing activation of the riboswitch. If the riboswitch is activated, the test compound is identified as a compound that activates the riboswitch. Activation of a riboswitch can be assessed in any suitable manner. For example, the riboswitch can be linked to a reporter RNA and expression, expression level, or change in expression level of the reporter RNA can be measured in the presence and absence of the test compound. As another example, the riboswitch can include a conformation dependent label, the signal from which changes depending on the activation state of the riboswitch. Such a riboswitch preferably uses an aptamer domain from or derived from a naturally occurring riboswitch. As can be seen, assessment of activation of a riboswitch can be performed with the use of a control assay or measurement or without the use of a control assay or measurement. Methods for identifying compounds that deactivate a riboswitch can be performed in analogous ways.

Identification of compounds that block a riboswitch can be accomplished in any suitable manner. For example, an assay can be performed for assessing activation or deactivation of a riboswitch in the presence of a compound known to activate or deactivate the riboswitch and in the presence of a test compound. If activation or deactivation is not observed as would be observed in the absence of the test compound, then the test compound is identified as a compound that blocks activation or deactivation of the riboswitch.

Also disclosed are compounds made by identifying a compound that activates, deactivates or blocks a riboswitch and manufacturing the identified compound. This can be accomplished by, for example, combining compound identification methods as disclosed elsewhere herein with methods for manufacturing the identified compounds. For example, compounds can be made by bringing into contact a test compound and a riboswitch, assessing activation of the riboswitch, and, if the riboswitch is activated by the test compound, manufacturing the test compound that activates the riboswitch as the compound.

Also disclosed are compounds made by checking activation, deactivation or blocking of a riboswitch by a compound and manufacturing the checked compound. This can be accomplished by, for example, combining compound activation, deactivation or blocking assessment methods as disclosed elsewhere herein with methods for manufacturing the checked compounds. For example, compounds can be made by bringing into contact a test compound and a riboswitch, assessing activation of the riboswitch, and, if the riboswitch is activated by the test compound, manufacturing the test compound that activates the riboswitch as the compound. Checking compounds for their ability to activate, deactivate or block a riboswitch refers to both identification of compounds previously unknown to activate, deactivate or block a riboswitch and to assessing the ability of a compound to activate, deactivate or block a riboswitch where the compound was already known to activate, deactivate or block the riboswitch.

Specific compounds that can be used to activate riboswitches are also disclosed. Compounds useful with guanine-responsive riboswitches (and riboswitches derived from guanine-responsive riboswitches) include compounds having the formula

each independently represent a single or double bond.

Every compound within the above definition is intended to be and should be considered to be specifically disclosed herein. Further, every subgroup that can be identified within the above definition is intended to be and should be considered to be specifically disclosed herein. As a result, it is specifically contemplated that any compound, or subgroup of compounds can be either specifically included for or excluded from use or included in or excluded from a list of compounds. For example, as one option, a group of compounds is contemplated where each compound is as defined above but is not guanine, hypoxanthine, xanthine, or N2-methylguanine. As another example, a group of compounds is contemplated where each compound is as defined above and is able to activate a guanine-responsive riboswitch.

Compounds useful with adenine-responsive riboswitches (and riboswitches derived from adenine-responsive riboswitches) include compounds having the formula

each independently represent a single or double bond.

Every compound within the above definition is intended to be and should be considered to be specifically disclosed herein. Further, every subgroup that can be identified within the above definition is intended to be and should be considered to be specifically disclosed herein. As a result, it is specifically contemplated that any compound, or subgroup of compounds can be either specifically included for or excluded from use or included in or excluded from a list of compounds. For example, as one option, a group of compounds is contemplated where each compound is as defined above but is not adenine, 2,6-diaminopurine, or 2-amino purine. As another example, a group of compounds is contemplated where each compound is as defined above and is able to activate an adenine-responsive riboswitch.

Compounds useful with lysine-responsive riboswitches (and riboswitches derived from lysine-responsive riboswitches) include compounds having the formula

each independently represent a single or double bond. Also contemplated are compounds as defined above where R2 and R3 are each NH3+ and where R1 is O−.

Every compound within the above definition is intended to be and should be considered to be specifically disclosed herein. Further, every subgroup that can be identified within the above definition is intended to be and should be considered to be specifically disclosed herein. As a result, it is specifically contemplated that any compound, or subgroup of compounds can be either specifically included for or excluded from use or included in or excluded from a list of compounds. For example, as one option, a group of compounds is contemplated where each compound is as defined above but is not lysine. As another example, a group of compounds is contemplated where each compound is as defined above and is able to activate a lysine-responsive riboswitch.

Compounds useful with TPP-responsive riboswitches (and riboswitches derived from TPP-responsive riboswitches) include compounds having the formula

each independently represent a single or double bond. Also contemplated are compounds as defined above where R1 is phosphate, diphosphate or triphosphate.

Every compound within the above definition is intended to be and should be considered to be specifically disclosed herein. Further, every subgroup that can be identified within the above definition is intended to be and should be considered to be specifically disclosed herein. As a result, it is specifically contemplated that any compound, or subgroup of compounds can be either specifically included for or excluded from use or included in or excluded from a list of compounds. For example, as one option, a group of compounds is contemplated where each compound is as defined above but is not TPP, TP or thiamine. As another example, a group of compounds is contemplated where each compound is as defined above and is able to activate a TPP-responsive riboswitch.

D. Constructs, Vectors and Expression Systems

The disclosed riboswitches can be used with any suitable expression system. Recombinant expression is usefully accomplished using a vector, such as a plasmid. The vector can include a promoter operably linked to a riboswitch-encoding sequence and the RNA to be expressed (e.g., RNA encoding a protein). The vector can also include other elements required for transcription and translation. As used herein, vector refers to any carrier containing exogenous DNA. Thus, vectors are agents that transport the exogenous nucleic acid into a cell without degradation and include a promoter yielding expression of the nucleic acid in the cells into which it is delivered. Vectors include but are not limited to plasmids, viral nucleic acids, viruses, phage nucleic acids, phages, cosmids, and artificial chromosomes. A variety of prokaryotic and eukaryotic expression vectors suitable for carrying riboswitch-regulated constructs can be produced. Such expression vectors include, for example, pET, pET3d, pCR2.1, pBAD, pUC, and yeast vectors. The vectors can be used, for example, in a variety of in vivo and in vitro situations.

Viral vectors include adenovirus, adeno-associated virus, herpes virus, vaccinia virus, polio virus, AIDS virus, neuronal trophic virus, Sindbis and other RNA viruses, including these viruses with the HIV backbone. Also useful are any viral families which share the properties of these viruses which make them suitable for use as vectors. Retroviral vectors, which are described in Verma (1985), include Moloney murine leukemia virus (MMLV) and retroviruses that express the desirable properties of MMLV as a vector. Typically, viral vectors contain nonstructural early genes, structural late genes, an RNA polymerase III transcript, inverted terminal repeats necessary for replication and encapsidation, and promoters to control the transcription and replication of the viral genome. When engineered as vectors, viruses typically have one or more of the early genes removed and a gene or gene/promoter cassette is inserted into the viral genome in place of the removed viral DNA.

A “promoter” is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A “promoter” contains core elements required for basic interaction of RNA polymerase and transcription factors and can contain upstream elements and response elements.

“Enhancer” generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be either 5′ (Laimins, 1981) or 3′ (Lusky et al., 1983) to the transcription unit. Furthermore, enhancers can be within an intron (Banerji et al., 1983) as well as within the coding sequence itself (Osborne et al., 1984). They are usually between 10 and 300 bp in length, and they function in cis. Enhancers function to increase transcription from nearby promoters. Enhancers, like promoters, also often contain response elements that mediate the regulation of transcription. Enhancers often determine the regulation of expression.

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human or nucleated cells) can also contain sequences necessary for the termination of transcription which can affect mRNA expression. These regions are transcribed as polyadenylated segments in the untranslated portion of the mRNA encoding tissue factor protein. The 3′ untranslated regions also include transcription termination sites. It is preferred that the transcription unit also contain a polyadenylation region. One benefit of this region is that it increases the likelihood that the transcribed unit will be processed and transported like mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs.

The vector can include nucleic acid sequence encoding a marker product. This marker product is used to determine if the gene has been delivered to the cell and once delivered is being expressed. Preferred marker genes are the E. coli lacZ gene, which encodes β-galactosidase, and the gene encoding green fluorescent protein.

In some embodiments the marker can be a selectable marker. When such selectable markers are successfully transferred into a host cell, the transformed host cell can survive if placed under selective pressure. There are two widely used distinct categories of selective regimes. The first category is based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media. The second category is dominant selection which refers to a selection scheme used in any cell type and does not require the use of a mutant cell line. These schemes typically use a drug to arrest growth of a host cell. Those cells which have a novel gene would express a protein conveying drug resistance and would survive the selection. Examples of such dominant selection use the drugs neomycin (Southern and Berg, 1982), mycophenolic acid (Mulligan and Berg, 1980), or hygromycin (Sugden et al., 1985).

Gene transfer can be obtained using direct transfer of genetic material in, but not limited to, plasmids, viral vectors, viral nucleic acids, phage nucleic acids, phages, cosmids, and artificial chromosomes, or via transfer of genetic material in cells or carriers such as cationic liposomes. Such methods are well known in the art and readily adaptable for use in the method described herein. Transfer vectors can be any nucleotide construction used to deliver genes into cells (e.g., a plasmid), or as part of a general strategy to deliver genes, e.g., as part of recombinant retrovirus or adenovirus (Ram et al. Cancer Res. 53:83-88, (1993)). Appropriate means for transfection, including viral vectors, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA, are described by, for example, Wolff, J. A., et al., Science, 247, 1465-1468, (1990); and Wolff, J. A. Nature, 352, 815-818, (1991).

1. Viral Vectors

Preferred viral vectors are Adenovirus, Adeno-associated virus, Herpes virus, Vaccinia virus, Polio virus, AIDS virus, neuronal trophic virus, Sindbis and other RNA viruses, including these viruses with the HIV backbone. Also preferred are any viral families which share the properties of these viruses which make them suitable for use as vectors. Preferred retroviruses include Moloney Murine Leukemia virus, MMLV, and retroviruses that express the desirable properties of MMLV as a vector. Retroviral vectors are able to carry a larger genetic payload, i.e., a transgene or marker gene, than other viral vectors, and for this reason are a commonly used vector. However, they are not useful in non-proliferating cells. Adenovirus vectors are relatively stable and easy to work with, have high titers, can be delivered in aerosol formulation, and can transfect non-dividing cells. Pox viral vectors are large and have several sites for inserting genes; they are thermostable and can be stored at room temperature. A preferred embodiment is a viral vector which has been engineered so as to suppress the immune response of the host organism, elicited by the viral antigens. Preferred vectors of this type will carry coding regions for Interleukin 8 or 10.

Viral vectors have higher transduction abilities (ability to introduce genes) than do most chemical or physical methods to introduce genes into cells. Typically, viral vectors contain nonstructural early genes, structural late genes, an RNA polymerase III transcript, inverted terminal repeats necessary for replication and encapsidation, and promoters to control the transcription and replication of the viral genome. When engineered as vectors, viruses typically have one or more of the early genes removed and a gene or gene/promoter cassette is inserted into the viral genome in place of the removed viral DNA. Constructs of this type can carry up to about 8 kb of foreign genetic material. The necessary functions of the removed early genes are typically supplied by cell lines which have been engineered to express the gene products of the early genes in trans.

i. Retroviral Vectors

A retrovirus is an animal virus belonging to the virus family of Retroviridae, including any types, subfamilies, genus, or tropisms. Retroviral vectors, in general, are described by Verma, I. M., Retroviral vectors for gene transfer. In Microbiology-1985, American Society for Microbiology, pp. 229-232, Washington, (1985), which is incorporated by reference herein. Examples of methods for using retroviral vectors for gene therapy are described in U.S. Pat. Nos. 4,868,116 and 4,980,286; PCT applications WO 90/02806 and WO 89/07136; and Mulligan, (Science 260:926-932 (1993)); the teachings of which are incorporated herein by reference.

A retrovirus is essentially a package which has packed into it nucleic acid cargo. The nucleic acid cargo carries with it a packaging signal, which ensures that the replicated daughter molecules will be efficiently packaged within the package coat. In addition to the package signal, there are a number of molecules which are needed in cis for the replication and packaging of the replicated virus. Typically a retroviral genome contains the gag, pol, and env genes which are involved in the making of the protein coat. It is the gag, pol, and env genes which are typically replaced by the foreign DNA that is to be transferred to the target cell. Retrovirus vectors typically contain a packaging signal for incorporation into the package coat, a sequence which signals the start of the gag transcription unit, elements necessary for reverse transcription, including a primer binding site to bind the tRNA primer of reverse transcription, terminal repeat sequences that guide the switch of RNA strands during DNA synthesis, a purine rich sequence 5′ to the 3′ LTR that serves as the priming site for the synthesis of the second strand of DNA, and specific sequences near the ends of the LTRs that enable the DNA state of the retrovirus to insert into the host genome. The removal of the gag, pol, and env genes allows for about 8 kb of foreign sequence to be inserted into the viral genome, become reverse transcribed, and upon replication be packaged into a new retroviral particle. This amount of nucleic acid is sufficient for the delivery of one to many genes depending on the size of each transcript. It is preferable to include either positive or negative selectable markers along with other genes in the insert.

Since the replication machinery and packaging proteins in most retroviral vectors have been removed (gag, pol, and env), the vectors are typically generated by placing them into a packaging cell line. A packaging cell line is a cell line which has been transfected or transformed with a retrovirus that contains the replication and packaging machinery, but lacks any packaging signal. When the vector carrying the DNA of choice is transfected into these cell lines, the vector containing the gene of interest is replicated and packaged into new retroviral particles, by the machinery provided in trans by the helper cell. The genomes for the machinery are not packaged because they lack the necessary signals.

ii. Adenoviral Vectors

The construction of replication-defective adenoviruses has been described (Berkner et al., J. Virology 61:1213-1220 (1987); Massie et al., Mol. Cell. Biol. 6:2872-2883 (1986); Haj-Ahmad et al., J. Virology 57:267-274 (1986); Davidson et al., J. Virology 61:1226-1239 (1987); Zhang “Generation and identification of recombinant adenovirus by liposome-mediated transfection and PCR analysis” BioTechniques 15:868-872 (1993)). The benefit of the use of these viruses as vectors is that they are limited in the extent to which they can spread to other cell types, since they can replicate within an initial infected cell, but are unable to form new infectious viral particles. Recombinant adenoviruses have been shown to achieve high efficiency gene transfer after direct, in vivo delivery to airway epithelium, hepatocytes, vascular endothelium, CNS parenchyma and a number of other tissue sites (Morsy, J. Clin. Invest. 92:1580-1586 (1993); Kirshenbaum, J. Clin. Invest. 92:381-387 (1993); Roessler, J. Clin. Invest. 92:1085-1092 (1993); Moullier, Nature Genetics 4:154-159 (1993); La Salle, Science 259:988-990 (1993); Gomez-Foix, J. Biol. Chem. 267:25129-25134 (1992); Rich, Human Gene Therapy 4:461-476 (1993); Zabner, Nature Genetics 6:75-83 (1994); Guzman, Circulation Research 73:1201-1207 (1993); Bout, Human Gene Therapy 5:3-10 (1994); Zabner, Cell 75:207-216 (1993); Caillaud, Eur. J. Neuroscience 5:1287-1291 (1993); and Ragot, J. Gen. Virology 74:501-507 (1993)). Recombinant adenoviruses achieve gene transduction by binding to specific cell surface receptors, after which the virus is internalized by receptor-mediated endocytosis, in the same manner as wild type or replication-defective adenovirus (Chardonnet and Dales, Virology 40:462-477 (1970); Brown and Burlingham, J. Virology 12:386-396 (1973); Svensson and Persson, J. Virology 55:442-449 (1985); Seth, et al., J. Virol. 51:650-655 (1984); Seth, et al., Mol. Cell. Biol. 4:1528-1533 (1984); Varga et al., J. Virology 65:6061-6070 (1991); Wickham et al., Cell 73:309-319 (1993)).

A preferred viral vector is one based on an adenovirus which has had the E1 gene removed; these virions are generated in a cell line such as the human 293 cell line. In another preferred embodiment both the E1 and E3 genes are removed from the adenovirus genome.

Another type of viral vector is based on an adeno-associated virus (AAV). This defective parvovirus is a preferred vector because it can infect many cell types and is nonpathogenic to humans. AAV type vectors can transport about 4 to 5 kb and wild type AAV is known to stably insert into chromosome 19. Vectors which contain this site specific integration property are preferred. An especially preferred embodiment of this type of vector is the P4.1 C vector produced by Avigen, San Francisco, Calif., which can contain the herpes simplex virus thymidine kinase gene, HSV-tk, and/or a marker gene, such as the gene encoding the green fluorescent protein, GFP.
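The payload limits quoted in this section (about 8 kb for retroviral constructs, about 4 to 5 kb for AAV) lend themselves to a simple planning check. The following is a minimal sketch only: the function name, the dictionary keys, and the element sizes in the example cassette are hypothetical and are not taken from the disclosure.

```python
# Approximate insert capacities quoted in the text, in kilobases (kb).
# AAV is described as carrying "about 4 to 5 kb"; the lower figure is
# used here as a conservative assumption.
CAPACITY_KB = {
    "retroviral": 8.0,
    "aav": 4.0,
}


def fits_in_vector(vector_type, element_lengths_bp):
    """Return True if the summed element lengths fit the quoted capacity."""
    total_kb = sum(element_lengths_bp.values()) / 1000.0
    return total_kb <= CAPACITY_KB[vector_type]


# Hypothetical cassette; every size below is illustrative.
cassette = {
    "promoter": 650,
    "riboswitch": 150,
    "reporter_orf": 1650,
    "polyadenylation_region": 400,
}

print(fits_in_vector("aav", cassette))                 # 2.85 kb total
print(fits_in_vector("aav", {"insert": 6000}))         # exceeds 4 kb
print(fits_in_vector("retroviral", {"insert": 6000}))  # within 8 kb
```

A check of this kind only compares lengths; it says nothing about whether a given construct will package or express efficiently.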

The inserted genes in viral and retroviral vectors usually contain promoters and/or enhancers to help control the expression of the desired gene product. A promoter is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and can contain upstream elements and response elements.

2. Viral Promoters and Enhancers

Preferred promoters controlling transcription from vectors in mammalian host cells can be obtained from various sources, for example, the genomes of viruses such as: polyoma, Simian Virus 40 (SV40), adenovirus, retroviruses, hepatitis-B virus and most preferably cytomegalovirus, or from heterologous mammalian promoters, e.g. beta actin promoter. The early and late promoters of the SV40 virus are conveniently obtained as an SV40 restriction fragment which also contains the SV40 viral origin of replication (Fiers et al., Nature, 273: 113 (1978)). The immediate early promoter of the human cytomegalovirus is conveniently obtained as a HindIII E restriction fragment (Greenway, P. J. et al., Gene 18: 355-360 (1982)). Of course, promoters from the host cell or related species also are useful herein.

Enhancer generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be either 5′ (Laimins, L. et al., Proc. Natl. Acad. Sci. 78: 993 (1981)) or 3′ (Lusky, M. L., et al., Mol. Cell Bio. 3: 1108 (1983)) to the transcription unit. Furthermore, enhancers can be within an intron (Banerji, J. L. et al., Cell 33: 729 (1983)) as well as within the coding sequence itself (Osborne, T. F., et al., Mol. Cell Bio. 4: 1293 (1984)). They are usually between 10 and 300 bp in length, and they function in cis. Enhancers function to increase transcription from nearby promoters. Enhancers also often contain response elements that mediate the regulation of transcription. Promoters can also contain response elements that mediate the regulation of transcription. Enhancers often determine the regulation of expression of a gene. While many enhancer sequences are now known from mammalian genes (globin, elastase, albumin, α-fetoprotein and insulin), typically one will use an enhancer from a eukaryotic cell virus. Preferred examples are the SV40 enhancer on the late side of the replication origin (bp 100-270), the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

The promoter and/or enhancer can be specifically activated either by light or specific chemical events which trigger their function. Systems can be regulated by reagents such as tetracycline and dexamethasone. There are also ways to enhance viral vector gene expression by exposure to irradiation, such as gamma irradiation, or alkylating chemotherapy drugs.

It is preferred that the promoter and/or enhancer region be active in all eukaryotic cell types. A preferred promoter of this type is the CMV promoter (650 bases). Other preferred promoters are SV40 promoters, cytomegalovirus (full length promoter), and retroviral vector LTR.

It has been shown that all specific regulatory elements can be cloned and used to construct expression vectors that are selectively expressed in specific cell types such as melanoma cells. The glial fibrillary acidic protein (GFAP) promoter has been used to selectively express genes in cells of glial origin.

Expression vectors used in eukaryotic host cells (yeast, fungi, insect, plant, animal, human or nucleated cells) can also contain sequences necessary for the termination of transcription which can affect mRNA expression. These regions are transcribed as polyadenylated segments in the untranslated portion of the mRNA encoding tissue factor protein. The 3′ untranslated regions also include transcription termination sites. It is preferred that the transcription unit also contain a polyadenylation region. One benefit of this region is that it increases the likelihood that the transcribed unit will be processed and transported like mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs. In a preferred embodiment of the transcription unit, the polyadenylation region is derived from the SV40 early polyadenylation signal and consists of about 400 bases. It is also preferred that the transcribed units contain other standard sequences that, alone or in combination with the above sequences, improve expression from, or stability of, the construct.

3. Markers

The vectors can include nucleic acid sequence encoding a marker product. This marker product is used to determine if the gene has been delivered to the cell and once delivered is being expressed. Preferred marker genes are the E. coli lacZ gene, which encodes β-galactosidase, and the gene encoding green fluorescent protein.

In some embodiments the marker can be a selectable marker. Examples of suitable selectable markers for mammalian cells are dihydrofolate reductase (DHFR), thymidine kinase, neomycin, neomycin analog G418, hygromycin, and puromycin. When such selectable markers are successfully transferred into a mammalian host cell, the transformed mammalian host cell can survive if placed under selective pressure. There are two widely used distinct categories of selective regimes. The first category is based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media. Two examples are: CHO DHFR− cells and mouse LTK− cells. These cells lack the ability to grow without the addition of such nutrients as thymidine or hypoxanthine. Because these cells lack certain genes necessary for a complete nucleotide synthesis pathway, they cannot survive unless the missing nucleotides are provided in a supplemented media. An alternative to supplementing the media is to introduce an intact DHFR or TK gene into cells lacking the respective genes, thus altering their growth requirements. Individual cells which were not transformed with the DHFR or TK gene will not be capable of survival in non-supplemented media.

The second category is dominant selection which refers to a selection scheme used in any cell type and does not require the use of a mutant cell line. These schemes typically use a drug to arrest growth of a host cell. Those cells which have a novel gene would express a protein conveying drug resistance and would survive the selection. Examples of such dominant selection use the drugs neomycin (Southern P. and Berg, P., J. Molec. Appl. Genet. 1: 327 (1982)), mycophenolic acid (Mulligan, R. C. and Berg, P. Science 209: 1422 (1980)) or hygromycin (Sugden, B. et al., Mol. Cell. Biol. 5: 410-413 (1985)). The three examples employ bacterial genes under eukaryotic control to convey resistance to the appropriate drug G418 or neomycin (geneticin), xgpt (mycophenolic acid) or hygromycin, respectively. Others include the neomycin analog G418 and puromycin.

E. Biosensor Riboswitches

Also disclosed are biosensor riboswitches. Biosensor riboswitches are engineered riboswitches that produce a detectable signal in the presence of their cognate trigger molecule. Useful biosensor riboswitches can be triggered at or above threshold levels of the trigger molecules. Biosensor riboswitches can be designed for use in vivo or in vitro. For example, biosensor riboswitches operably linked to a reporter RNA that encodes a protein that serves as or is involved in producing a signal can be used in vivo by engineering a cell or organism to harbor a nucleic acid construct encoding the riboswitch/reporter RNA. An example of a biosensor riboswitch for use in vitro is a riboswitch that includes a conformation dependent label, the signal from which changes depending on the activation state of the riboswitch. Such a biosensor riboswitch preferably uses an aptamer domain from or derived from a naturally occurring riboswitch.

F. Reporter Proteins and Peptides

For assessing activation of a riboswitch, or for biosensor riboswitches, a reporter protein or peptide can be used. The reporter protein or peptide can be encoded by the RNA the expression of which is regulated by the riboswitch. The examples describe the use of some specific reporter proteins. The use of reporter proteins and peptides is well known and can be adapted easily for use with riboswitches. The reporter proteins can be any protein or peptide that can be detected or that produces a detectable signal. Preferably, the presence of the protein or peptide can be detected using standard techniques (e.g., radioimmunoassay, radio-labeling, immunoassay, assay for enzymatic activity, absorbance, fluorescence, luminescence, and Western blot). More preferably, the level of the reporter protein is easily quantifiable using standard techniques even at low levels. Useful reporter proteins include luciferases, green fluorescent proteins and their derivatives, such as firefly luciferase (FL) from Photinus pyralis, and Renilla luciferase (RL) from Renilla reniformis.

G. Conformation Dependent Labels

Conformation dependent labels refer to all labels that produce a change in fluorescence intensity or wavelength based on a change in the form or conformation of the molecule or compound (such as a riboswitch) with which the label is associated. Examples of conformation dependent labels used in the context of probes and primers include molecular beacons, Amplifluors, FRET probes, cleavable FRET probes, TaqMan probes, scorpion primers, fluorescent triplex oligos including but not limited to triplex molecular beacons or triplex FRET probes, fluorescent water-soluble conjugated polymers, PNA probes and QPNA probes. Such labels, and, in particular, the principles of their function, can be adapted for use with riboswitches. Several types of conformation dependent labels are reviewed in Schweitzer and Kingsmore, Curr. Opin. Biotech. 12:21-27 (2001).

Stem quenched labels, a form of conformation dependent labels, are fluorescent labels positioned on a nucleic acid such that when a stem structure forms a quenching moiety is brought into proximity such that fluorescence from the label is quenched. When the stem is disrupted (such as when a riboswitch containing the label is activated), the quenching moiety is no longer in proximity to the fluorescent label and fluorescence increases. Examples of this effect can be found in molecular beacons, fluorescent triplex oligos, triplex molecular beacons, triplex FRET probes, and QPNA probes, the operational principles of which can be adapted for use with riboswitches.

Stem activated labels, a form of conformation dependent labels, are labels or pairs of labels where fluorescence is increased or altered by formation of a stem structure. Stem activated labels can include an acceptor fluorescent label and a donor moiety such that, when the acceptor and donor are in proximity (when the nucleic acid strands containing the labels form a stem structure), fluorescence resonance energy transfer from the donor to the acceptor causes the acceptor to fluoresce. Stem activated labels are typically pairs of labels positioned on nucleic acid molecules (such as riboswitches) such that the acceptor and donor are brought into proximity when a stem structure is formed in the nucleic acid molecule. If the donor moiety of a stem activated label is itself a fluorescent label, it can release energy as fluorescence (typically at a different wavelength than the fluorescence of the acceptor) when not in proximity to an acceptor (that is, when a stem structure is not formed). When the stem structure forms, the overall effect would then be a reduction of donor fluorescence and an increase in acceptor fluorescence. FRET probes are an example of the use of stem activated labels, the operational principles of which can be adapted for use with riboswitches.
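The donor and acceptor behavior described above can be illustrated numerically with the standard Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor distance. This is a sketch under stated assumptions: the Förster radius and the two distances are hypothetical values chosen for illustration, and the relative signals ignore quantum yields and instrument response.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Forster transfer efficiency for a donor/acceptor pair at a given separation."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


def relative_signals(distance_nm, forster_radius_nm):
    """Simplified relative donor and acceptor emission (assumption: signals
    scale directly with 1 - E and E, respectively)."""
    efficiency = fret_efficiency(distance_nm, forster_radius_nm)
    return {"donor": 1.0 - efficiency, "acceptor": efficiency}


R0_NM = 5.0  # hypothetical Forster radius for an unspecified dye pair

# Hypothetical separations: labels close together when the stem is formed,
# far apart when the stem is disrupted.
stem_formed = relative_signals(2.0, R0_NM)
stem_open = relative_signals(10.0, R0_NM)

print(stem_formed)  # acceptor emission dominates
print(stem_open)    # donor emission dominates
```

With these assumed numbers, stem formation shifts the signal from donor emission to acceptor emission, matching the qualitative description of stem activated labels.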

H. Detection Labels

To aid in detection and quantitation of riboswitch activation, deactivation or blocking, or expression of nucleic acids or protein produced upon activation, deactivation or blocking of riboswitches, detection labels can be incorporated into detection probes or detection molecules or directly incorporated into expressed nucleic acids or proteins. As used herein, a detection label is any molecule that can be associated with nucleic acid or protein, directly or indirectly, and which results in a measurable, detectable signal, either directly or indirectly. Many such labels are known to those of skill in the art. Examples of detection labels suitable for use in the disclosed method are radioactive isotopes, fluorescent molecules, phosphorescent molecules, enzymes, antibodies, and ligands.

Examples of suitable fluorescent labels include fluorescein isothiocyanate (FITC), 5,6-carboxymethyl fluorescein, Texas red, nitrobenz-2-oxa-1,3-diazol-4-yl (NBD), coumarin, dansyl chloride, rhodamine, amino-methyl coumarin (AMCA), Eosin, Erythrosin, BODIPY®, Cascade Blue®, Oregon Green®, pyrene, lissamine, xanthenes, acridines, oxazines, phycoerythrin, macrocyclic chelates of lanthanide ions such as quantum dye™, fluorescent energy transfer dyes, such as thiazole orange-ethidium heterodimer, and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7. Examples of other specific fluorescent labels include 3-Hydroxypyrene 5,8,10-Tri Sulfonic acid, 5-Hydroxy Tryptamine (5-HT), Acid Fuchsin, Alizarin Complexon, Alizarin Red, Allophycocyanin, Aminocoumarin, Anthroyl Stearate, Astrazon Brilliant Red 4G, Astrazon Orange R, Astrazon Red 6B, Astrazon Yellow 7 GLL, Atabrine, Auramine, Aurophosphine, Aurophosphine G, BAO 9 (Bisaminophenyloxadiazole), BCECF, Berberine Sulphate, Bisbenzamide, Blancophor FFG Solution, Blancophor SV, Bodipy FL, Brilliant Sulphoflavin FF, Calcein Blue, Calcium Green, Calcofluor RW Solution, Calcofluor White, Calcophor White ABT Solution, Calcophor White Standard Solution, Carbostyryl, Cascade Yellow, Catecholamine, Chinacrine, Coriphosphine O, Coumarin-Phalloidin, CY3.18, CY5.18, CY7, Dans (1-Dimethyl Amino Naphthalene 5 Sulphonic Acid), Dansa (Diamino Naphthyl Sulphonic Acid), Dansyl NH—CH3, Diamino Phenyl Oxydiazole (DAO), Dimethylamino-5-Sulphonic acid, Dipyrrometheneboron Difluoride, Diphenyl Brilliant Flavine 7GFF, Dopamine, Erythrosin ITC, Euchrysin, FIF (Formaldehyde Induced Fluorescence), Flazo Orange, Fluo 3, Fluorescamine, Fura-2, Genacryl Brilliant Red B, Genacryl Brilliant Yellow 10GF, Genacryl Pink 3G, Genacryl Yellow 5GF, Glyoxalic Acid, Granular Blue, Haematoporphyrin, Indo-1, Intrawhite Cf Liquid, Leucophor PAF, Leucophor SF, Leucophor WS, Lissamine Rhodamine B200 (RD200), Lucifer Yellow CH, Lucifer Yellow VS, Magdala Red, Marina Blue, Maxilon Brilliant Flavin 10 GFF, Maxilon Brilliant Flavin 8 GFF, MPS (Methyl Green Pyronine Stilbene), Mithramycin, NBD Amine, Nitrobenzoxadiazole, Noradrenaline, Nuclear Fast Red, Nuclear Yellow, Nylosan Brilliant Flavin E8G, Oxadiazole, Pacific Blue, Pararosaniline (Feulgen), Phorwite AR Solution, Phorwite BKL, Phorwite Rev, Phorwite RPA, Phosphine 3R, Phthalocyanine, Phycoerythrin R, Polyazaindacene, Pontochrome Blue Black, Porphyrin, Primuline, Procion Yellow, Pyronine, Pyronine B, Pyrozal Brilliant Flavin 7GF, Quinacrine Mustard, Rhodamine 123, Rhodamine 5 GLD, Rhodamine 6G, Rhodamine B, Rhodamine B 200, Rhodamine B Extra, Rhodamine BB, Rhodamine BG, Rhodamine WT, Serotonin, Sevron Brilliant Red 2B, Sevron Brilliant Red 4G, Sevron Brilliant Red B, Sevron Orange, Sevron Yellow L, SITS (Primuline), SITS (Stilbene Isothiosulphonic acid), Stilbene, Snarf 1, Sulpho Rhodamine B Can C, Sulpho Rhodamine G Extra, Tetracycline, Thiazine Red R, Thioflavin S, Thioflavin TCN, Thioflavin 5, Thiolyte, Thiazole Orange, Tinopol CBS, True Blue, Ultralite, Uranine B, Uvitex SFC, Xylenol Orange, and XRITC.

Useful fluorescent labels are fluorescein (5-carboxyfluorescein-N-hydroxysuccinimide ester), rhodamine (5,6-tetramethyl rhodamine), and the cyanine dyes Cy3, Cy3.5, Cy5, Cy5.5 and Cy7. The absorption and emission maxima, respectively, for these fluors are: FITC (490 nm; 520 nm), Cy3 (554 nm; 568 nm), Cy3.5 (581 nm; 588 nm), Cy5 (652 nm; 672 nm), Cy5.5 (682 nm; 703 nm) and Cy7 (755 nm; 778 nm), thus allowing their simultaneous detection. Other examples of fluorescein dyes include 6-carboxyfluorescein (6-FAM), 2′,4′,1,4,-tetrachlorofluorescein (TET), 2′,4′,5′,7′,1,4-hexachlorofluorescein (HEX), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyrhodamine (JOE), 2′-chloro-5′-fluoro-7′,8′-fused phenyl-1,4-dichloro-6-carboxyfluorescein (NED), and 2′-chloro-7′-phenyl-1,4-dichloro-6-carboxyfluorescein (VIC). Fluorescent labels can be obtained from a variety of commercial sources, including Amersham Pharmacia Biotech, Piscataway, N.J.; Molecular Probes, Eugene, Oreg.; and Research Organics, Cleveland, Ohio.
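The absorption and emission maxima listed above can be tabulated to see how far apart neighboring fluors sit. The sketch below uses only the maxima quoted in the preceding paragraph; the function names and the idea of a minimum-gap criterion are illustrative assumptions, since the text does not state what separation is required for simultaneous detection.

```python
# (absorption maximum, emission maximum) in nm, as listed in the text.
FLUOR_MAXIMA = {
    "FITC": (490, 520),
    "Cy3": (554, 568),
    "Cy3.5": (581, 588),
    "Cy5": (652, 672),
    "Cy5.5": (682, 703),
    "Cy7": (755, 778),
}


def emission_gaps(maxima):
    """Return (fluor_a, fluor_b, gap_nm) for neighbors sorted by emission maximum."""
    ordered = sorted(maxima.items(), key=lambda item: item[1][1])
    return [
        (name_a, name_b, em_b - em_a)
        for (name_a, (_, em_a)), (name_b, (_, em_b)) in zip(ordered, ordered[1:])
    ]


def separable(maxima, min_gap_nm):
    """True if every pair of neighboring emission maxima differs by at least
    min_gap_nm (a hypothetical criterion set by the user's filter set)."""
    return all(gap >= min_gap_nm for _, _, gap in emission_gaps(maxima))


for name_a, name_b, gap in emission_gaps(FLUOR_MAXIMA):
    print(f"{name_a} -> {name_b}: {gap} nm")
```

The smallest gap in this set is between Cy3 and Cy3.5, so that pair is the one most likely to constrain filter choice in practice.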

Additional labels of interest include those that provide for signal only when the probe with which they are associated is specifically bound to a target molecule, where such labels include: “molecular beacons” as described in Tyagi & Kramer, Nature Biotechnology (1996) 14:303 and EP 0 070 685 B1. Other labels of interest include those described in U.S. Pat. No. 5,563,037; WO 97/17471 and WO 97/17076.

Labeled nucleotides are a useful form of detection label for direct incorporation into expressed nucleic acids during synthesis. Examples of detection labels that can be incorporated into nucleic acids include nucleotide analogs such as BrdUrd (5-bromodeoxyuridine, Hoy and Schimke, Mutation Research 290:217-230 (1993)), aminoallyldeoxyuridine (Henegariu et al., Nature Biotechnology 18:345-348 (2000)), 5-methylcytosine (Sano et al., Biochim. Biophys. Acta 951:157-165 (1988)), bromouridine (Wansick et al., J. Cell Biology 122:283-293 (1993)) and nucleotides modified with biotin (Langer et al., Proc. Natl. Acad. Sci. USA 78:6633 (1981)) or with suitable haptens such as digoxygenin (Kerkhof, Anal. Biochem. 205:359-364 (1992)). Suitable fluorescence-labeled nucleotides are Fluorescein-isothiocyanate-dUTP, Cyanine-3-dUTP and Cyanine-5-dUTP (Yu et al., Nucleic Acids Res., 22:3226-3232 (1994)). A preferred nucleotide analog detection label for DNA is BrdUrd (bromodeoxyuridine, BrdUrd, BrdU, BUdR, Sigma-Aldrich Co). Other useful nucleotide analogs for incorporation of detection label into DNA are AA-dUTP (aminoallyl-deoxyuridine triphosphate, Sigma-Aldrich Co.), and 5-methyl-dCTP (Roche Molecular Biochemicals). A useful nucleotide analog for incorporation of detection label into RNA is biotin-16-UTP (biotin-16-uridine-5′-triphosphate, Roche Molecular Biochemicals). Fluorescein, Cy3, and Cy5 can be linked to dUTP for direct labelling. Cy3.5 and Cy7 are available as avidin or anti-digoxygenin conjugates for secondary detection of biotin- or digoxygenin-labelled probes.

Detection labels that are incorporated into nucleic acid, such as biotin, can be subsequently detected using sensitive methods well-known in the art. For example, biotin can be detected using streptavidin-alkaline phosphatase conjugate (Tropix, Inc.), which is bound to the biotin and subsequently detected by chemiluminescence of suitable substrates (for example, chemiluminescent substrate CSPD: disodium, 3-(4-methoxyspiro-[1,2,-dioxetane-3-2′-(5′-chloro)tricyclo[3.3.1.13,7]decane]-4-yl)phenyl phosphate; Tropix, Inc.). Labels can also be enzymes, such as alkaline phosphatase, soybean peroxidase, horseradish peroxidase and polymerases, that can be detected, for example, with chemical signal amplification or by using a substrate to the enzyme which produces light (for example, a chemiluminescent 1,2-dioxetane substrate) or fluorescent signal.

Molecules that combine two or more of these detection labels are also considered detection labels. Any of the known detection labels can be used with the disclosed probes, tags, molecules and methods to label and detect activated or deactivated riboswitches or nucleic acid or protein produced in the disclosed methods. Methods for detecting and measuring signals generated by detection labels are also known to those of skill in the art. For example, radioactive isotopes can be detected by scintillation counting or direct visualization; fluorescent molecules can be detected with fluorescent spectrophotometers; phosphorescent molecules can be detected with a spectrophotometer or directly visualized with a camera; enzymes can be detected by detection or visualization of the product of a reaction catalyzed by the enzyme; antibodies can be detected by detecting a secondary detection label coupled to the antibody. As used herein, detection molecules are molecules which interact with a compound or composition to be detected and to which one or more detection labels are coupled.

I. Sequence Similarities

It is understood that, as discussed herein, the terms homology and identity mean the same thing as similarity. Thus, for example, if the word homology is used between two sequences (non-natural sequences, for example), it is understood that this does not necessarily indicate an evolutionary relationship between these two sequences, but rather refers to the similarity or relatedness between their nucleic acid sequences. Many of the methods for determining homology between two evolutionarily related molecules are routinely applied to any two or more nucleic acids or proteins for the purpose of measuring sequence similarity, regardless of whether they are evolutionarily related.

In general, it is understood that one way to define any known variants and derivatives or those that might arise, of the disclosed riboswitches, aptamers, expression platforms, genes and proteins herein, is through defining the variants and derivatives in terms of homology to specific known sequences. This identity of particular sequences disclosed herein is also discussed elsewhere herein. In general, variants of riboswitches, aptamers, expression platforms, genes and proteins herein disclosed typically have at least, about 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent homology to a stated sequence or a native sequence. Those of skill in the art readily understand how to determine the homology of two proteins or nucleic acids, such as genes. For example, the homology can be calculated after aligning the two sequences so that the homology is at its highest level.

Homology can also be calculated using published algorithms. Optimal alignment of sequences for comparison can be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection.
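As an illustration of the alignment-then-count approach, the following is a minimal sketch of a Needleman and Wunsch style global alignment followed by a percent identity calculation. The scoring values (match +1, mismatch -1, gap -1), the function names, and the choice of alignment length as the denominator are assumptions made for this example only; they are not taken from the cited references or from the software packages named above, which may use different scoring and different denominators.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings.

    Scoring values are illustrative assumptions, not recited parameters.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns holding identical residues.

    Assumption: the denominator is the full alignment length, gaps included.
    """
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

Under these assumed scores, two 7-nucleotide sequences that differ at one position align without gaps and give 6 identical columns out of 7, or about 85.7 percent.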

The same types of homology can be obtained for nucleic acids by for example the algorithms disclosed in Zuker, M. Science 244:48-52, 1989, Jaeger et al. Proc. Natl. Acad. Sci. USA 86:7706-7710, 1989, Jaeger et al. Methods Enzymol. 183:281-306, 1989 which are herein incorporated by reference for at least material related to nucleic acid alignment. It is understood that any of the methods typically can be used and that in certain instances the results of these various methods can differ, but the skilled artisan understands if identity is found with at least one of these methods, the sequences would be said to have the stated identity.

For example, as used herein, a sequence recited as having a particular percent homology to another sequence refers to sequences that have the recited homology as calculated by any one or more of the calculation methods described above. For example, a first sequence has 80 percent homology, as defined herein, to a second sequence if the first sequence is calculated to have 80 percent homology to the second sequence using the Zuker calculation method even if the first sequence does not have 80 percent homology to the second sequence as calculated by any of the other calculation methods. As another example, a first sequence has 80 percent homology, as defined herein, to a second sequence if the first sequence is calculated to have 80 percent homology to the second sequence using both the Zuker calculation method and the Pearson and Lipman calculation method even if the first sequence does not have 80 percent homology to the second sequence as calculated by the Smith and Waterman calculation method, the Needleman and Wunsch calculation method, the Jaeger calculation methods, or any of the other calculation methods. As yet another example, a first sequence has 80 percent homology, as defined herein, to a second sequence if the first sequence is calculated to have 80 percent homology to the second sequence using each of the calculation methods (although, in practice, the different calculation methods will often result in different calculated homology percentages).
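The "any one or more of the calculation methods" rule can be sketched as a simple check over a set of per-method results. The function name, the dictionary layout, and the example percentages below are hypothetical. Treating the recited value as a minimum (greater than or equal to) is an assumption that follows the "at least" wording used for variants above.

```python
def has_recited_homology(calculated, required_percent):
    """Return True if at least one calculation method meets the recited value.

    calculated: dict mapping a method name to the percent homology it produced.
    required_percent: the recited percent homology, treated here as a minimum.
    """
    return any(p >= required_percent for p in calculated.values())


# Hypothetical results for one pair of sequences under three methods.
results = {"Zuker": 80.0, "Smith-Waterman": 74.5, "Needleman-Wunsch": 76.0}
meets_80 = has_recited_homology(results, 80)  # True: the Zuker result suffices
```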

J. Hybridization and Selective Hybridization

The term hybridization typically means a sequence driven interaction between at least two nucleic acid molecules, such as a primer or a probe and a riboswitch or a gene. Sequence driven interaction means an interaction that occurs between two nucleotides or nucleotide analogs or nucleotide derivatives in a nucleotide specific manner. For example, G interacting with C or A interacting with T are sequence driven interactions. Typically sequence driven interactions occur on the Watson-Crick face or Hoogsteen face of the nucleotide. The hybridization of two nucleic acids is affected by a number of conditions and parameters known to those of skill in the art. For example, the salt concentrations, pH, and temperature of the reaction all affect whether two nucleic acid molecules will hybridize.
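A sequence driven interaction of the Watson-Crick type can be sketched as a pairing table plus a check over two antiparallel strands. This is only an illustration: the names are hypothetical, both strands are assumed to be written 5′ to 3′ and to have equal length, and Hoogsteen and wobble pairings are deliberately left out.

```python
# Canonical Watson-Crick pairs for DNA (A-T) and RNA (A-U), plus G-C.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"),
                ("G", "C"), ("C", "G")}


def is_watson_crick_pair(x, y):
    """True if nucleotides x and y form a canonical Watson-Crick pair."""
    return (x.upper(), y.upper()) in WATSON_CRICK


def fraction_paired(probe, target):
    """Fraction of positions that pair when two equal-length strands,
    both written 5' to 3', are laid against each other antiparallel."""
    if len(probe) != len(target):
        raise ValueError("this sketch assumes strands of equal length")
    if not probe:
        return 0.0
    pairs = zip(probe, reversed(target))
    return sum(is_watson_crick_pair(x, y) for x, y in pairs) / len(probe)
```

For example, the probe AAAA pairs at every position with the target TTTT, but at only three of four positions with the target TTTA.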

Parameters for selective hybridization between two nucleic acid molecules are well known to those of skill in the art. In some embodiments, selective hybridization conditions can be defined as stringent hybridization conditions. Stringency of hybridization is controlled by both temperature and salt concentration of either or both of the hybridization and washing steps. For example, the conditions of hybridization to achieve selective hybridization can involve hybridization in high ionic strength solution (6×SSC or 6×SSPE) at a temperature that is about 12-25° C. below the Tm (the melting temperature at which half of the molecules dissociate from their hybridization partners) followed by washing at a combination of temperature and salt concentration chosen so that the washing temperature is about 5° C. to 20° C. below the Tm. The temperature and salt conditions are readily determined empirically in preliminary experiments in which samples of reference DNA immobilized on filters are hybridized to a labeled nucleic acid of interest and then washed under conditions of different stringencies. Hybridization temperatures are typically higher for DNA-RNA and RNA-RNA hybridizations. The conditions can be used as described above to achieve stringency, or as is known in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989; Kunkel et al., Methods Enzymol. 154:367, 1987, each of which is herein incorporated by reference for material at least related to hybridization of nucleic acids). A preferable stringent hybridization condition for a DNA:DNA hybridization can be at about 68° C. (in aqueous solution) in 6×SSC or 6×SSPE followed by washing at 68° C. 
Stringency of hybridization and washing, if desired, can be reduced accordingly as the degree of complementarity desired is decreased, and further, depending upon the G-C or A-T richness of any area wherein variability is searched for. Likewise, stringency of hybridization and washing, if desired, can be increased accordingly as homology desired is increased, and further, depending upon the G-C or A-T richness of any area wherein high homology is desired, all as known in the art.
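The temperature ranges recited above (hybridization at about 12-25° C. below the Tm, washing at about 5° C. to 20° C. below the Tm) can be turned into a small helper that reports both windows for a given Tm. The function name and return layout are hypothetical, and the Tm is taken as an input here rather than estimated, since it is normally determined empirically or from a formula chosen by the practitioner.

```python
def stringency_windows(tm_celsius):
    """Return (low, high) temperature windows in degrees C for a given Tm.

    Offsets follow the ranges recited in the text: hybridization at about
    12-25 C below Tm and washing at about 5-20 C below Tm.
    """
    hybridization = (tm_celsius - 25.0, tm_celsius - 12.0)
    wash = (tm_celsius - 20.0, tm_celsius - 5.0)
    return {"hybridization": hybridization, "wash": wash}


# For a duplex with an assumed Tm of 85 C:
windows = stringency_windows(85.0)
# hybridization window: 60.0 to 73.0 C; wash window: 65.0 to 80.0 C
```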

Another way to define selective hybridization is by looking at the amount (percentage) of one of the nucleic acids bound to the other nucleic acid. For example, in some embodiments selective hybridization conditions would be when at least about 60, 65, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 percent of the limiting nucleic acid is bound to the non-limiting nucleic acid. Typically, the non-limiting nucleic acid is in, for example, 10 or 100 or 1000 fold excess. This type of assay can be performed under conditions where both the limiting and non-limiting nucleic acids are, for example, 10 fold or 100 fold or 1000 fold below their Kd, or where only one of the nucleic acid molecules is 10 fold or 100 fold or 1000 fold below its Kd, or where one or both nucleic acid molecules are above their Kd.
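To see how the concentration of the non-limiting nucleic acid relative to the dissociation constant affects the percentage bound, the following sketch uses a simple 1:1 equilibrium binding model. The model is an assumption made for illustration and is not stated in the text: it treats the non-limiting strand as being in such large excess that its free concentration equals its total concentration, so the fraction of limiting strand bound is [N] / (Kd + [N]). The function name is hypothetical, and both arguments must be in the same concentration units.

```python
def fraction_limiting_bound(excess_conc, kd):
    """Fraction of the limiting strand bound at equilibrium.

    Assumes 1:1 binding with the non-limiting strand in large excess,
    so that fraction bound = [N] / (Kd + [N]).
    """
    if excess_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return excess_conc / (kd + excess_conc)


# Non-limiting strand at 10 times its Kd: 10/11 of the limiting strand bound.
above_kd = fraction_limiting_bound(10.0, 1.0)
# Non-limiting strand 10 fold below its Kd: 1/11 bound.
below_kd = fraction_limiting_bound(0.1, 1.0)
```

Under this model, a threshold such as 90 percent bound is reached only when the non-limiting strand is present at roughly 9 times the Kd or more.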

Another way to define selective hybridization is by looking at the percentage of nucleic acid that gets enzymatically manipulated under conditions where hybridization is required to promote the desired enzymatic manipulation. For example, in some embodiments selective hybridization conditions would be when at least about, 60, 65, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 percent of the nucleic acid is enzymatically manipulated under conditions which promote the enzymatic manipulation, for example if the enzymatic manipulation is DNA extension, then selective hybridization conditions would be when at least about 60, 65, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 percent of the nucleic acid molecules are extended. Preferred conditions also include those suggested by the manufacturer or indicated in the art as being appropriate for the enzyme performing the manipulation.

Just as with homology, it is understood that there are a variety of methods herein disclosed for determining the level of hybridization between two nucleic acid molecules. It is understood that these methods and conditions can provide different percentages of hybridization between two nucleic acid molecules, but unless otherwise indicated meeting the parameters of any of the methods would be sufficient. For example, if 80% hybridization is required, then as long as hybridization occurs within the required parameters in any one of these methods, it is considered disclosed herein.

Those of skill in the art understand that if a composition or method meets any one of these criteria for determining hybridization, either collectively or singly, it is a composition or method that is disclosed herein.

K. Nucleic Acids

There are a variety of molecules disclosed herein that are nucleic acid based, including, for example, riboswitches, aptamers, and nucleic acids that encode riboswitches and aptamers. The disclosed nucleic acids can be made up of for example, nucleotides, nucleotide analogs, or nucleotide substitutes. Non-limiting examples of these and other molecules are discussed herein. It is understood that for example, when a vector is expressed in a cell, that the expressed mRNA will typically be made up of A, C, G, and U. Likewise, it is understood that if a nucleic acid molecule is introduced into a cell or cell environment through for example exogenous delivery, it is advantageous that the nucleic acid molecule be made up of nucleotide analogs that reduce the degradation of the nucleic acid molecule in the cellular environment.

So long as their relevant function is maintained, riboswitches, aptamers, expression platforms and any other oligonucleotides and nucleic acids can be made up of or include modified nucleotides (nucleotide analogs). Many modified nucleotides are known and can be used in oligonucleotides and nucleic acids. A nucleotide analog is a nucleotide which contains some type of modification to either the base, sugar, or phosphate moieties. Modifications to the base moiety would include natural and synthetic modifications of A, C, G, and T/U as well as different purine or pyrimidine bases, such as uracil-5-yl, hypoxanthin-9-yl (I), and 2-aminoadenin-9-yl. A modified base includes but is not limited to 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Additional base modifications can be found for example in U.S. Pat. No. 3,687,808, Englisch et al., Angewandte Chemie, International Edition, 1991, 30, 613, and Sanghvi, Y. S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S. T. and Lebleu, B. ed., CRC Press, 1993. Certain nucleotide analogs, such as 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil, 5-propynylcytosine and 5-methylcytosine, can increase the stability of duplex formation. 
Other modified bases are those that function as universal bases. Universal bases include 3-nitropyrrole and 5-nitroindole. Universal bases substitute for the normal bases but have no bias in base pairing. That is, universal bases can base pair with any other base. Base modifications often can be combined with for example a sugar modification, such as 2′-O-methoxyethyl, to achieve unique properties such as increased duplex stability. There are numerous United States patents such as U.S. Pat. Nos. 4,845,205; 5,130,302; 5,134,066; 5,175,273; 5,367,066; 5,432,272; 5,457,187; 5,459,255; 5,484,908; 5,502,177; 5,525,711; 5,552,540; 5,587,469; 5,594,121, 5,596,091; 5,614,617; and 5,681,941, which detail and describe a range of base modifications. Each of these patents is herein incorporated by reference in its entirety, and specifically for their description of base modifications, their synthesis, their use, and their incorporation into oligonucleotides and nucleic acids.

Nucleotide analogs can also include modifications of the sugar moiety. Modifications to the sugar moiety would include natural modifications of the ribose and deoxyribose as well as synthetic modifications. Sugar modifications include but are not limited to the following modifications at the 2′ position: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl can be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. 2′ sugar modifications also include but are not limited to -O[(CH2)nO]mCH3, -O(CH2)nOCH3, -O(CH2)nNH2, -O(CH2)nCH3, -O(CH2)nONH2, and -O(CH2)nON[(CH2)nCH3]2, where n and m are from 1 to about 10.

Other modifications at the 2′ position include but are not limited to: C1 to C10 lower alkyl, substituted lower alkyl, alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3, ONO2, NO2, N3, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an intercalator, a group for improving the pharmacokinetic properties of an oligonucleotide, or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties. Similar modifications can also be made at other positions on the sugar, particularly the 3′ position of the sugar on the 3′ terminal nucleotide or in 2′-5′ linked oligonucleotides and the 5′ position of 5′ terminal nucleotide. Modified sugars would also include those that contain modifications at the bridging ring oxygen, such as CH2 and S. Nucleotide sugar analogs can also have sugar mimetics such as cyclobutyl moieties in place of the pentofuranosyl sugar. There are numerous United States patents that teach the preparation of such modified sugar structures such as U.S. Pat. Nos. 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878; 5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427; 5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265; 5,658,873; 5,670,633; and 5,700,920, each of which is herein incorporated by reference in its entirety, and specifically for their description of modified sugar structures, their synthesis, their use, and their incorporation into nucleotides, oligonucleotides and nucleic acids.

Nucleotide analogs can also be modified at the phosphate moiety. Modified phosphate moieties include but are not limited to those that can be modified so that the linkage between two nucleotides contains a phosphorothioate, chiral phosphorothioate, phosphorodithioate, phosphotriester, aminoalkylphosphotriester, methyl and other alkyl phosphonates including 3′-alkylene phosphonate and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates. It is understood that these phosphate or modified phosphate linkages between two nucleotides can be through a 3′-5′ linkage or a 2′-5′ linkage, and the linkage can contain inverted polarity such as 3′-5′ to 5′-3′ or 2′-5′ to 5′-2′. Various salts, mixed salts and free acid forms are also included. Numerous United States patents teach how to make and use nucleotides containing modified phosphates and include but are not limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; and 5,625,050, each of which is herein incorporated by reference in its entirety, and specifically for their description of modified phosphates, their synthesis, their use, and their incorporation into nucleotides, oligonucleotides and nucleic acids.

It is understood that nucleotide analogs need only contain a single modification, but can also contain multiple modifications within one of the moieties or between different moieties.

Nucleotide substitutes are molecules having similar functional properties to nucleotides, but which do not contain a phosphate moiety, such as peptide nucleic acid (PNA). Nucleotide substitutes are molecules that will recognize and hybridize to (base pair to) complementary nucleic acids in a Watson-Crick or Hoogsteen manner, but which are linked together through a moiety other than a phosphate moiety. Nucleotide substitutes are able to conform to a double helix type structure when interacting with the appropriate target nucleic acid.


