The present invention relates to a novel form of the core+1 protein of Hepatitis C virus (HCV), designated shorter form core+1 protein. The invention also provides methods for detecting infection by Hepatitis C virus in biological samples, methods of screening compounds which interact with viral propagation in HCV-infected cells and advantageously decrease, inhibit or prevent viral propagation, methods of screening compounds impacting the expression of shorter form core+1 protein, and uses of these compounds for the preparation of compositions useful for their antiviral activities. The invention also proposes to use the shorter form core+1 protein of the invention to derive immunogenic compositions for protection against HCV infection or against its consequences.
Hepatitis C is a viral infection of the liver which was referred to as “non A, non B hepatitis” (NANBH) until identification of the causative agent. Hepatitis C virus is one of the viruses (A, B, C, D and E) which together account for the majority of cases of viral hepatitis. Hepatitis C virus was first identified in 1989 (Choo et al. 1989) and defined as a common cause of liver disease, with an estimated 170 million infected people worldwide. Hepatitis C virus (HCV) infection affects the liver and causes hepatitis, i.e., an inflammation of the liver. 75 to 85% of persons infected with HCV progress to chronic infection, and approximately 20% of these cases develop complications of chronic hepatitis C, including cirrhosis of the liver or hepatocellular carcinoma, after 20 years of infection (Di Bisceglie 2000). The current recommended treatment for HCV infections is a combination of interferon and ribavirin; however, the treatment is not effective in all cases, and liver transplantation is indicated in hepatitis C-related end-stage liver disease. At present, there is no vaccine available to prevent HCV infection; therefore all precautions to avoid infection must be taken.
HCV is a (+) sense single-stranded enveloped RNA virus in the Hepacivirus genus within the Flaviviridae family. The viral genome is approximately 10 kb in length and encodes a 3011 amino acid polyprotein precursor. The HCV genome has a large single open reading frame (ORF) coding for a unique polyprotein, said polyprotein being co- and post-translationally processed by cellular and viral proteases into three structural proteins, i.e., core, E1 and E2, and at least six non-structural proteins, NS2, NS3, NS4A, NS4B, NS5A and NS5B (Houghton 1996 and Reed et al. 2000).
Initiation of translation of the HCV genome is controlled by an internal ribosome entry site (IRES) located mainly within the 5′ non-coding region of the viral RNA, between nucleotides 42 and 341 or 356, the 3′ limit being controversial. The core protein, which forms the viral nucleocapsid, is predicted to be 191 amino acids in length and to have a molecular mass of 23 kDa (p23). Further processing of p23 produces the mature core protein (p21), consisting of 173 to 182 amino acids. It has been previously reported that a protein having a molecular weight of about 17 kDa is also expressed from the core protein-coding sequence of some HCV isolates both in vitro and in vivo, e.g. in E. coli cells. This additional HCV polypeptide of 16/17 kDa (p16/p17), consisting of at most 160 amino acids, is encoded by the open reading frame that overlaps the core gene in the +1 frame (core+1 ORF) and is synthesized in vitro as a result of a +1 ribosomal frameshift during translation.
This 16/17 kDa polypeptide is named ARFP for Alternative Reading Frame Protein, F for Frameshift protein, or core+1 according to the location of this novel protein. The ARFP/F/core+1 protein is synthesized in vitro from the initiator codon of the polyprotein sequence, followed by a +1 ribosomal frameshift operating in the region of core codons 8-14 (Xu et al. 2001, Varaklioti et al. 2002).
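The +1 ribosomal frameshift described above can be illustrated with a minimal sketch. The function names and the toy sequence are hypothetical and chosen only for illustration; the toy sequence is not an HCV sequence, and the position of the shift is a free parameter rather than the actual frameshift site. The sketch translates a DNA-sense sequence in the initiator frame, then repeats the translation with one nucleotide skipped after a given number of codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, start=0):
    """Translate seq from 0-based offset `start` until a stop codon or the end."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_plus1_shift(seq, codons_before_shift):
    """Read `codons_before_shift` codons in frame 0, skip one nucleotide,
    then continue reading in the +1 frame."""
    head = translate(seq[:3 * codons_before_shift])
    if len(head) < codons_before_shift:
        # A stop codon was met before the shift site: no frameshift product.
        return head
    tail = translate(seq, 3 * codons_before_shift + 1)
    return head + tail


# Toy sequence (invented for illustration, NOT an HCV sequence).
toy = "ATGGCCAAATGACCTGGTAA"
print(translate(toy))                 # frame 0 product
print(translate_plus1_shift(toy, 2))  # product after a +1 shift past codon 2
```

In this toy case the frame 0 reading stops at an in-frame TGA, whereas the shifted reading continues past it, which is the general effect a +1 frameshift has on the translated product.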
More recently, the expression of the core+1 protein coding sequence has been assayed in mammalian cells, i.e. in vivo, in order to investigate the biological importance of the core+1 protein. It had previously been shown that, in rabbit reticulocyte lysates (in vitro), the core+1 protein is synthesized when the core+1 ORF from the HCV-1 isolate is expressed, whereas it is not detected when the core+1 ORF from the HCV-1a(H) isolate is expressed (Varaklioti et al. 2002). It is recalled that the HCV-1 and HCV-1a(H) isolates of HCV, although belonging to the same genotype, have different sequences at the frameshift site located in codons 8-14 of HCV-1. The difference consists in particular in the absence, in the HCV-1a(H) sequence, of the run of 10 A nucleotide residues at the putative frameshift site. In order to provide data on the expression mechanisms of the core+1 protein, the inventors have studied said expression in vivo.
The results disclosed in the present invention indicate that, unlike in the in vitro expression studies, both HCV-1 and HCV-1a(H) core coding sequences efficiently allow expression of the core+1 ORF in transfected mammalian cells. The transfection and expression experiments carried out in mammalian cells have also enabled the present inventors to establish that in vivo expression of the core+1 ORF is associated with synthesis of a new protein, whose expression follows an alternative translation initiation mechanism of the core+1 ORF that differs from the mechanism identified for the in vitro expression of core+1 protein. Said alternative mechanism directs the synthesis of a shorter form of core+1 protein in vivo.
Particular species of HCV-1 and HCV-1a(H) have been disclosed in GenBank under accession Nos. M62321 and M67463, respectively.
Viruses, which are subject to genome size constraints, have developed different strategies to expand their coding capacity, such as ribosomal frameshifting or internal translational initiation. Ribosomal frameshifting consists in avoiding a termination codon which would otherwise have been encountered by the ribosome, and instead creates a protein with extra amino acid sequences at its C-terminal end. Therefore, in ribosomal frameshifting, a directed change of translational reading frame allows the synthesis of a single protein from two or more overlapping genes. Internal translational initiation consists in bypassing an upstream initiator codon by one of several mechanisms, including leaky scanning, ribosome shunting and the use of an internal ribosome entry site. Such a mechanism is apparently used for in vivo expression of the shorter form core+1 protein.
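The notion of internal initiation in an overlapping frame can be sketched as a simple scan for ATG-to-stop stretches in each reading frame. The function name `find_orfs` and the toy sequence are hypothetical, and the sketch makes no claim about how ribosomes actually select an initiation codon; it only shows that a second frame can carry its own initiator codon and stop codon, independently of the upstream initiator.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame):
    """Return (start, end) pairs, 1-based and inclusive of the stop codon,
    for every ATG...stop stretch read in the given frame (0, 1 or 2)."""
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                          # first ATG opens an ORF
        elif start is not None and codon in STOP_CODONS:
            orfs.append((start + 1, i + 3))    # convert to 1-based coordinates
            start = None
    return orfs


# Toy sequence (invented for illustration, NOT an HCV sequence).
toy = "ATGGATGCCCTGAAATAG"
for frame in range(3):
    print(frame, find_orfs(toy, frame))
```

In the toy sequence the frame 0 ORF spans the whole sequence, while the +1 frame contains a shorter ORF that starts at its own internal ATG.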
The invention thus provides a new protein of HCV life cycle, which is designated shorter form core+1 protein and which can be obtained by in vivo expression of the core+1 coding sequence or ORF, especially in mammalian cells.
The invention also relates to nucleic acid sequences encoding said shorter form core+1 protein.
The invention also provides methods for detecting in a biological sample of an individual the presence or absence of the shorter form core+1 protein giving evidence of Hepatitis C virus infection.
The invention also provides use of the shorter form core+1 protein of the invention in an immunogenic composition. An immunogenic composition of the invention may advantageously be prepared in order to elicit a CTL response against HCV infection, in a patient.
The shorter form core+1 HCV protein may also be used in the preparation of a therapeutic composition aimed at counteracting the consequences of HCV infection, especially when persistent infection appears.
The invention also provides means for screening compounds, especially compounds having antiviral activity as a result of interaction with in vivo expression of the core+1 ORF directing translation of shorter form core+1 protein. Among the several advantages of the present methods, it should be noted that these screening methods are appropriate for routine high-throughput screening of compounds capable of interacting with viral propagation and control of the life cycle of the virus, especially compounds capable of inhibiting or preventing viral propagation.
Moreover, the invention also provides for the use of the compounds capable of interacting with viral propagation and control of life cycle of the virus, especially compounds capable of inhibiting or preventing viral propagation, advantageously as a result of their capacity to interact with expression of shorter form core+1 protein in HCV infected cells, which compounds would be useful for the preparation of a drug for the treatment of disorders induced by or associated with infection of Hepatitis C virus.
A first object of the invention is thus a shorter form core+1 protein of HCV which is the product of translation of a coding sequence consisting of all or part of the nucleotide sequence extending from nucleotide 598 to nucleotide 920 within the core+1 ORF of HCV represented on FIG. 3B.
In a particular embodiment, the shorter form core+1 protein is encoded by a nucleotide sequence having a translation initiation codon (ATG) at position 598 or by a nucleotide sequence having an ATG at position 606 of the HCV core+1 coding sequence.
In a particular embodiment, the shorter form core+1 protein is encoded by:
(i) a nucleotide sequence extending from nucleotide 598 to nucleotide 826 of the sequence represented on FIG. 3B; or
(ii) a nucleotide sequence extending from nucleotide 598 to nucleotide 897 of the sequence represented on FIG. 3B; or
(iii) a nucleotide sequence extending from nucleotide 606 to nucleotide 826 of the sequence represented on FIG. 3B; or
(iv) a nucleotide sequence extending from nucleotide 606 to nucleotide 897 of the sequence represented on FIG. 3B; or
(v) a nucleotide sequence extending from nucleotide 606 to nucleotide 920 of the sequence represented on FIG. 3B.
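As a reading aid, the span of each nucleotide range listed above can be tabulated with a short sketch. The coordinates are copied from the list; treating them as 1-based and inclusive is an assumption, and the text does not state whether an end coordinate includes a stop codon or falls on a codon boundary. The codon count is therefore only the number of complete codons that fit in the span, not a stated protein length. The function names are hypothetical.

```python
# Nucleotide ranges copied from embodiments (i) to (v) and from the first
# object of the invention; assumed here to be 1-based and inclusive.
RANGES = {
    "first object": (598, 920),
    "(i)": (598, 826),
    "(ii)": (598, 897),
    "(iii)": (606, 826),
    "(iv)": (606, 897),
    "(v)": (606, 920),
}


def span_length(start, end):
    """Number of nucleotides in an inclusive range."""
    return end - start + 1


def complete_codons(start, end):
    """Number of whole codons that fit in the range (an upper bound only)."""
    return span_length(start, end) // 3


for label, (start, end) in RANGES.items():
    print(f"{label}: {start}-{end}, {span_length(start, end)} nt, "
          f"{complete_codons(start, end)} complete codons")
```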
As used herein, the expressions “shorter form core+1 protein” and “in vivo core+1 protein” refer to the Hepatitis C virus proteins obtainable in vivo, in cells infected with HCV or in cells transfected with a DNA construct comprising the core coding sequence or the core+1 ORF. A predominant shorter form of core+1 is produced in vivo which is smaller than the 16/17 kDa core+1 product synthesized in vitro, as it has a calculated molecular weight of less than 10 kDa. Furthermore, the coding sequence of the shorter form core+1 protein does not contain the first 10 consecutive A residues of the core coding sequence. These A residues are located at codons 8-11 (nucleotides 364-373) of the HCV-1 genome and are of great importance for the expression of the core+1 ORF. This difference in molecular weight explains the term “shorter form core+1 protein”.
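Checking a sequence for a run of consecutive A residues, and reporting it in genome coordinates, can be sketched as follows. The function name `find_a_runs`, its parameters and the toy sequence are hypothetical; the toy sequence is not an HCV sequence and was simply built so that, when its first nucleotide is numbered 342, the run falls at the coordinates quoted above.

```python
import re


def find_a_runs(seq, min_len=10, first_coord=1):
    """Return (start, end) coordinates, inclusive, of every run of at least
    `min_len` consecutive A residues; `first_coord` is the coordinate
    assigned to the first nucleotide of `seq`."""
    pattern = re.compile("A{%d,}" % min_len)
    return [(first_coord + m.start(), first_coord + m.end() - 1)
            for m in pattern.finditer(seq)]


# Toy sequences (invented for illustration, NOT HCV sequences).
with_run = "ATG" + "GCT" * 6 + "C" + "A" * 10 + "CGT"
without_run = "ATG" + "GCT" * 6 + "C" + "AAAGAAAGAA" + "CGT"

print(find_a_runs(with_run, first_coord=342))
print(find_a_runs(without_run, first_coord=342))
```

The same check returns an empty list for a sequence in which the run is interrupted, which is how the presence or absence of such a motif in two sequences could be compared.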
As used herein, the expression “core +1 ORF” refers to the nucleotide sequence as represented in FIG. 3B of the present application, which is comprised within the “core coding sequence” of HCV. Said core +1 ORF begins at nucleotide 342 with the translation initiation codon and extends up to the nucleotide at position 920 (U.S. Ser. No. 09/644,987) in the sequence illustrated on FIG. 3B.
It is pointed out that shorter form core+1 protein is encoded by core+1 ORF or by core coding sequence, when said nucleotide sequences are expressed in vivo.
The invention further relates to a shorter form core+1 protein of HCV which is obtainable in vivo by expression of the core+1 open reading frame (ORF) contained in the nucleotide sequence extending from the nucleotide at position 342 to the nucleotide at position 920, preferably to the nucleotide at position 826, of the nucleotide sequence represented on FIG. 3B, and whose calculated molecular weight is less than 10 kDa.
It is emphasized that the shorter form core+1 protein is obtainable in vivo independently of the expression of the HCV polyprotein and also independently of the expression of core+1 protein. Said expression in vivo uses the same frame as the one used for core+1 expression in the core+1 ORF but does not involve the ribosomal frameshift mechanism required for core+1 in vitro expression.
In another embodiment, the shorter form core+1 protein is the expression product of the core+1 ORF in mammalian cells.
In a preferred embodiment, the shorter form core+1 protein is recognized by serum from patients infected with HCV. Likewise, circulating anti-core+1 antibodies have been detected in HCV-infected individuals, suggesting that this protein is produced during natural HCV infection.