The implementation of systematic testing for hepatitis B virus (HBV) has been instrumental in eliminating this virus from the blood supply. Nevertheless, a significant number of post-transfusion hepatitis (PTH) cases still occur. These cases are generally attributable to non-A, non-B hepatitis (NANBH) virus(es), the diagnosis of which is usually made by exclusion of other viral markers.
The etiological agent responsible for a large proportion of these cases has recently been cloned (Choo, Q-L et al. Science (1989) 244:359-362) and a first-generation antibody test developed (Kuo, G. et al. Science (1989) 244:362-364). The agent has been identified as a positive-stranded RNA virus, and the sequence of its genome has been partially determined. Studies suggest that this virus, referred to subsequently as hepatitis C virus (HCV), may be related to flaviviruses and pestiviruses. A portion of the genome of an HCV isolated from a chimpanzee (HCVCDC/CHI) is disclosed in EPO 88310922.5. The coding sequences disclosed in this document do not include sequences originating from the 5′-end of the viral genome, which code for putative structural proteins. Recently, however, sequences derived from this region of the HCV genome have been published (Okamoto, H. et al., Japan J. Exp. Med. 60:167-177, 1990). The amino acid sequences encoded by the Japanese clone HC-J1 were combined with the HCVCDC/CHI sequences in a region where the two sequences overlap to generate the composite sequence depicted in FIG. 1. Specifically, the two sequences were joined at glycine 451. It should be emphasized that the numbering system used for the HCV amino acid sequence is not intended to be absolute, since the existence of variant HCV strains harboring deletions or insertions is highly probable. Sequences corresponding to the 5′ end of the HCV genome have also recently been disclosed in EPO 90302866.0.
In order to detect potential carriers of HCV, it is necessary to have access to large amounts of viral proteins. In the case of HCV, there is currently no known method for culturing the virus, which precludes the use of virus-infected cultures as a source of viral antigens. The current first-generation antibody test makes use of a fusion protein containing a sequence of 363 amino acids encoded by the HCV genome. It was found that antibodies to this protein could be detected in 75 to 85% of chronic NANBH patients. In contrast, only approximately 15% of those patients who were in the acute phase of the disease had antibodies which recognized this fusion protein (Kuo, G. et al. Science (1989) 244:362-364). The absence of suitable confirmatory tests, however, makes it difficult to verify these statistics. The seeming similarity between the HCV genome and that of flaviviruses makes it possible to predict the location of epitopes which are likely to be of diagnostic value. An analysis of the HCV genome reveals the presence of a continuous long open reading frame. Viral RNA is presumably translated into a long polyprotein which is subsequently cleaved by cellular and/or viral proteases. By analogy with, for example, Dengue virus, the viral structural proteins are presumed to be derived from the amino-terminal third of the viral polyprotein. At the present time, the precise sites at which the polyprotein is cleaved can only be surmised. Nevertheless, the structural proteins are likely to contain epitopes which would be useful for diagnostic purposes, both for the detection of antibodies and for raising antibodies which could subsequently be used for the detection of viral antigens. Similarly, domains of nonstructural proteins are also expected to contain epitopes of diagnostic value, even though these proteins are not found as structural components of virus particles.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows the amino acid sequence of the composite HCVHC-J1/CDC/CHI.
FIG. 2 shows the antibody binding to individual peptides and various mixtures in an ELISA.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
It is known that RNA viruses frequently exhibit a high rate of spontaneous mutation and, as such, it is to be expected that no two HCV isolates will be completely identical, even when derived from the same individual. For the purpose of this disclosure, a virus is considered to be the same or equivalent to HCV if it exhibits a global homology of 60 percent or more with the HCVHC-J1/CDC/CHI composite sequence at the nucleic acid level and 70 percent at the amino acid level.
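The equivalence criterion above can be illustrated with a short sketch. It assumes, purely for illustration, that "global homology" is measured as percent identity over a pair of sequences that have already been globally aligned, and that both thresholds must be met. The function names, the gap character, and the toy sequences are hypothetical and are not taken from this disclosure.

```python
# Minimal sketch, assuming "global homology" means percent identity over
# pre-aligned sequences of equal length (gaps written as "-").
# All names and example sequences here are hypothetical.

NUCLEIC_ACID_THRESHOLD = 60.0  # percent, from the criterion stated in the text
AMINO_ACID_THRESHOLD = 70.0    # percent, from the criterion stated in the text


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return the percentage of aligned columns holding identical residues.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_equivalence_criterion(nucleic_identity: float,
                                amino_identity: float) -> bool:
    """Apply both thresholds (assumed here to be jointly required)."""
    return (nucleic_identity >= NUCLEIC_ACID_THRESHOLD
            and amino_identity >= AMINO_ACID_THRESHOLD)


# Toy aligned fragments, not real HCV sequences.
nt_identity = percent_identity("ACGTACGT", "ACGAAC-T")
aa_identity = percent_identity("MSTNPKPQ", "MSTNPKAQ")
print(nt_identity, aa_identity,
      meets_equivalence_criterion(nt_identity, aa_identity))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the final comparison against the two thresholds.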
Peptides are described which immunologically mimic proteins encoded by HCV. In order to accommodate strain-to-strain variations in sequence, conservative as well as non-conservative amino acid substitutions may be made. These will generally account for less than 35 percent of a specific sequence. It may be desirable, in cases where a peptide corresponds to a region in the HCV polypeptide which is highly polymorphic, to vary one or more of the amino acids so as to better mimic the different epitopes of different viral strains.
The peptides of interest will include at least five, sometimes six, sometimes eight, sometimes twelve, usually fewer than about fifty, more usually fewer than about thirty-five, and preferably fewer than about twenty-five amino acids included within the sequence encoded by the HCV genome. In each instance, the peptide will preferably be as small as possible while still maintaining substantially all of the sensitivity of the larger peptide. It may also be desirable in certain instances to join two or more peptides together in one peptide structure.
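The length ranges above can be summarized in a small helper. This is only a sketch: the text qualifies its upper bounds with "about", so the hard cut-offs used here are an illustrative simplification, and the function name and tier labels are hypothetical.

```python
# Minimal sketch of the length ranges described in the text. The upper
# bounds are stated there as approximate ("about"), so the strict
# comparisons below are a simplifying assumption. Names are hypothetical.

def length_tier(peptide_length: int) -> str:
    """Classify a peptide length against the ranges given in the text."""
    if peptide_length < 5:
        return "below stated minimum"      # fewer than five amino acids
    if peptide_length < 25:
        return "preferred"                 # fewer than about twenty-five
    if peptide_length < 35:
        return "more usual"                # fewer than about thirty-five
    if peptide_length < 50:
        return "usual"                     # fewer than about fifty
    return "longer than usual"


for n in (4, 20, 30, 40, 60):
    print(n, length_tier(n))
```

A 20-residue peptide, for example, falls in the preferred range under these assumed cut-offs.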
It should be understood that the peptides described need not be identical to any particular HCV sequence, so long as the subject compounds are capable of providing for immunological competition with at least one strain of HCV. The peptides may therefore be subject to insertions, deletions, and conservative or non-conservative amino acid substitutions where such changes might provide for certain advantages in their use.
Substitutions which are considered conservative are those in which the chemical nature of the substitute is similar to that of the original amino acid. Combinations of amino acids which could be considered conservative are Gly, Ala; Asp, Glu; Asn, Gln; Val, Ile, Leu; Ser, Thr; Lys, Arg; and Phe, Tyr.
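The conservative groupings can be expressed as a lookup, which also makes it easy to check what fraction of a peptide has been substituted against the 35 percent guideline mentioned earlier. The sketch below treats serine/threonine and lysine/arginine as two separate groups, on the assumption that each group is meant to share one chemical character (hydroxyl side chains and basic side chains, respectively). The function names and the four-residue example are hypothetical and do not represent any HCV sequence.

```python
# Minimal sketch: classify substitutions using the conservative groups
# listed in the text. Ser/Thr and Lys/Arg are treated as separate groups
# (an assumption based on their differing chemistry). Residues use the
# three-letter code. All names and example data are hypothetical.

CONSERVATIVE_GROUPS = [
    {"Gly", "Ala"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Val", "Ile", "Leu"},
    {"Ser", "Thr"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original: str, substitute: str) -> bool:
    """True if the two different residues belong to the same group."""
    if original == substitute:
        return False  # identical residues are not a substitution
    return any(original in group and substitute in group
               for group in CONSERVATIVE_GROUPS)


def substituted_fraction(reference: list[str], variant: list[str]) -> float:
    """Fraction of positions at which the variant differs from the reference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    changed = sum(1 for r, v in zip(reference, variant) if r != v)
    return changed / len(reference)


# Toy example, not an HCV sequence.
reference = ["Gly", "Val", "Lys", "Phe"]
variant = ["Ala", "Val", "Arg", "Phe"]
print(is_conservative("Val", "Leu"), is_conservative("Lys", "Asp"))
print(substituted_fraction(reference, variant))
```

Insertions and deletions, which the text also permits, would require an alignment step first and are outside the scope of this sketch.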
Furthermore, additional amino acids or chemical groups may be added to the amino or carboxyl terminus for the purpose of creating a “linker arm” by which the peptide can conveniently be attached to a carrier. The linker arm will be at least one amino acid and may be as many as 60 amino acids but will most frequently be 1 to 10 amino acids. The nature of the attachment to a solid phase or carrier need not be covalent.
Natural amino acids such as cysteine, lysine, tyrosine, glutamic acid, or aspartic acid may be added to either the amino or carboxyl terminus to provide functional groups for coupling to a solid phase or a carrier. However, other chemical groups such as, for example, biotin and thioglycolic acid, may be added to the termini which will endow the peptides with desired chemical or physical properties. The termini of the peptides may also be modified, for example, by N-terminal acetylation or terminal carboxy-amidation. The peptides of interest are described in relation to the composite amino acid sequence shown in FIG. 1. The amino acid sequences are given in the conventional and universally accepted three-letter code. In addition to the amino acids shown, other groups are defined as follows: Y is, for example, NH2, one or more N-terminal amino acids, or other moieties added to facilitate coupling. Y may itself be modified by, for example, acetylation. Z is a bond, (an) amino acid(s), or (a) chemical group(s) which may be used for linking. X is intended to represent OH, NH2, or a linkage involving either of these two groups.
Peptide I corresponds to amino acids 1 to 20 and has the following amino acid sequence: