Sub-genomic analysis of Chikungunya virus E2 mutations in Pakistani isolates potentially modulating B-cell & T-Cell immune response

Background & Objectives: Chikungunya virus (CHIKV) is transmitted to humans by Aedes mosquitoes. In December 2016, a severe outbreak was reported from Pakistan. Because no vaccine or antiviral treatment is currently available, the host immune response against CHIKV has gained significant interest. This study was therefore conducted to identify mutations in the CHIKV E2 region of currently circulating Pakistani strains & to determine their potential immunogenicity in the Pakistani population. Methods: This was a cross-sectional study in which a total of 60 CHIKV PCR-positive samples were collected from the Molecular Department of Pathology, Dow University of Health Sciences (DUHS), Karachi, from November 2017 to February 2018. The CHIKV E2 gene was amplified by PCR & sequenced. Sequences were analyzed with bioinformatic tools, followed by epitope prediction in the E2 sequences using an in silico immunoinformatic approach. Results: Several single nucleotide variations (SNVs) were identified in Pakistani isolates, including six novel mutations in the E2 sequences. Immunoinformatic analyses showed more proteasomal sites, CTL & B-cell epitopes in Pakistani strains than in the S27 prototype, with 69.4% population coverage for these epitopes in Pakistan. The study also identified key mutations responsible for the generation of unique epitopes and for HLA restriction in Pakistani isolates. The strain-specific mutations revealed that the current outbreak was caused by the ECSA.IOL lineage of CHIKV. Conclusion: The evolution of the E2 protein in Pakistani strains has increased its immunogenicity in comparison with the ancestral S27 strain. The most immunogenic and conserved epitopes with high population coverage have high potential for use in vaccine development against these local strains.


INTRODUCTION
CHIKV is a single-stranded, positive-sense RNA arbovirus that is transmitted by Aedes spp. mosquitoes. It was first isolated in Tanzania in 1953. 1 Approximately 50%-97% of patients develop symptoms, including acute fever and, in the chronic phase, persistent arthralgia. The ~11.8 kb CHIKV genome encodes three structural proteins (E1-E3), a core protein and four non-structural proteins (NSP1-NSP4). Infection is usually self-limiting, but in some patients severe joint pain can persist for several years. 2 Some studies have also suspected its involvement in the development of neonatal encephalopathies, fulminant hepatitis & neurological disorders. 3 Before 2000, sporadic outbreaks of naturally acquired infection were reported from many African countries. 4 Since 2000, the virus has re-emerged & caused severe epidemics of unequalled magnitude, with a more severe form of disease & further spread to other American regions. 5,6 In December 2016, a severe outbreak was officially reported to the WHO for the first time from the Malir district of Karachi, Pakistan. More than 30,000 people were infected based on clinical investigations, of which 4000 cases were confirmed by qualitative RT-PCR. 7 Currently, no approved treatment or vaccine is available against CHIKV infection. Viral clearance depends on the immunological response, which recognizes the epitopes of viral antigens. 8 Previous studies have shown that E2 is the most immunogenic part of the CHIKV proteome. It is also important for inducing the early immune response in the acute phase & for producing long-lasting post-infection immunity. 9 Different studies have identified several epitopes spanning the entire E2 protein, but these epitopes evolve as viral strains acquire different mutations in each population during an outbreak, and the immune response also varies with the human leukocyte antigen (HLA) types present in each population. 10 Therefore, this study was conducted to identify mutations in the CHIKV E2 region of currently circulating Pakistani isolates & to determine their potential immunogenicity in the Pakistani population.

METHODS
[...] using DreamTaq Green PCR Master Mix (Thermo Scientific, U.S.). The amplified products were run on a 1% agarose gel and, after purification, sent to Macrogen, Korea, for commercial DNA sequencing.

This was a cross-sectional study approved by [...]
The generated sequences were aligned against the CHIKV S27 strain (AF369024.2) and the first published CHIKV Pakistani isolate (MF740874) using MEGA7. These sequences were translated in silico into amino-acid sequences using ExPASy. To predict the immunogenicity of the CHIKV E2 gene of Pakistani isolates and compare it with that of the ancestral S27 strain, different immunoinformatic tools were used. Briefly, NetChop 3.1 was used to predict proteasomal degradation sites at a threshold of 0.5. 11 CD8+ T cell (CTL) epitopes and their restricting HLAs were predicted with CTLPred & nHLAPred, respectively. 12 The population coverage of MHC-I binders for these epitopes was predicted with the IEDB population coverage tool. 13 Linear B-cell epitopes were predicted with the IEDB BepiPred prediction tool. 14
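
The translation and comparison steps described above can be illustrated with a minimal Python sketch. It is not the workflow used in the study (which relied on MEGA7 and ExPASy); it assumes in-frame, gap-free nucleotide sequences of equal length and the standard genetic code. The function names `translate` and `call_substitutions` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    sequence = nucleotides.upper().replace("U", "T")
    usable_length = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(sequence[start:start + 3], "X")
        for start in range(0, usable_length, 3)
    )


def call_substitutions(reference: str, query: str, offset: int = 1) -> list[str]:
    """List substitutions (e.g. 'A4S') between two aligned protein sequences.

    Assumes an equal-length, gap-free alignment; `offset` is the residue
    number assigned to the first aligned position.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_aa}{index + offset}{query_aa}"
        for index, (ref_aa, query_aa) in enumerate(zip(reference, query))
        if ref_aa != query_aa
    ]


# Toy sequences, invented for illustration (not real E2 data).
reference_protein = translate("ATGAAAACCGCT")
isolate_protein = translate("ATGAAAACCTCT")
print(call_substitutions(reference_protein, isolate_protein))
```

Numbering substitutions relative to the reference keeps the notation comparable across isolates, which is what allows a mutation seen in one sample to be matched against earlier reports.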

RESULTS
Of the 60 positive patients, 34 (56.7%) were male & 26 (43.3%) were female, and the mean age was 33.5 years. The E2 gene was successfully amplified & sequenced from only 30 patients; the other samples did not amplify, possibly because of low viral load. Mutation analysis revealed 21 different mutations in the CHIKV E2 amino-acid sequences, of which 15 had already been observed in the CHIKV Pakistani isolate sequenced previously from the early outbreak. Six novel mutations were found in this study only: C91S in two samples and D77H, E166Q, M366I, T367H and V370I in one sample each (Fig.1). [...] obliterated but were present in the S27 prototype strain (Fig.2). The gain or loss of sites was due to the acquisition of certain mutations in the E2 region. Briefly, the mutations C91S, N160T & E166Q created new sites at positions 83, 161 & 160, respectively, in four samples of this study. In contrast, the V370I mutation obliterated three sites, at positions 370, 376 & 377, in one sample.
To further elucidate the immunogenic properties of CHIKV E2, CTL epitopes and their restricting class-I HLAs were predicted with the CTLPred and nHLAPred tools, respectively. Both groups had 15 epitopes; however, the samples from this study had five unique epitopes, comprising three newly evolved epitopes & two epitopes modified by point mutations. The S27 E2 sequence, in turn, had one entirely unique epitope that has been obliterated in Pakistani strains. The mutations that caused intra-epitope variability also affected HLA class-I restriction (Table-I). Briefly, the SNV S375T in E2 sequences from Pakistani isolates allowed the epitope to be restricted by HLA-B*5102 and HLA-B*5103 in addition to HLA-Cw*0401. Similarly, the SNVs V264A and M267R created a new HLA-A24 restriction site in addition to the HLA-A*0301 & HLA-Cw*0401 restriction seen in the S27 prototype strain. Analysis of MHC-I binders for these epitopes showed 69.4% coverage of the Pakistani population (Fig.3).
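
How a population coverage figure of this kind can arise from HLA allele frequencies is sketched below. This is a simplified approximation, not the IEDB tool's implementation: it assumes Hardy-Weinberg proportions within each HLA locus and independence between loci. The function names and the allele frequencies are hypothetical and are not the values behind the 69.4% reported here.

```python
def locus_coverage(allele_frequencies: list[float]) -> float:
    """Fraction of individuals carrying at least one epitope-restricting allele.

    Under Hardy-Weinberg proportions, an individual lacks every covered
    allele at a locus with probability (1 - sum of covered frequencies)^2.
    """
    covered = sum(allele_frequencies)
    if covered < 0.0 or covered > 1.0 + 1e-9:
        raise ValueError("allele frequencies at one locus must sum to between 0 and 1")
    covered = min(covered, 1.0)  # guard against floating-point overshoot
    return 1.0 - (1.0 - covered) ** 2


def population_coverage(frequencies_by_locus: dict[str, list[float]]) -> float:
    """Combine per-locus coverage, assuming loci are independent."""
    uncovered = 1.0
    for allele_frequencies in frequencies_by_locus.values():
        uncovered *= 1.0 - locus_coverage(allele_frequencies)
    return 1.0 - uncovered


# Hypothetical frequencies of epitope-restricting alleles in a population.
example_frequencies = {
    "HLA-A": [0.10, 0.05],
    "HLA-B": [0.08],
    "HLA-C": [0.12],
}
coverage = population_coverage(example_frequencies)
print(f"Estimated coverage: {coverage:.1%}")
```

Because coverage depends on local allele frequencies, the same epitope set can cover very different fractions of different populations, which is why the estimate is made for the Pakistani population specifically.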
To predict the potential antibody response, linear B-cell epitopes were predicted; 14 potential epitopes were found in Pakistani sequences compared with 13 in the S27 strain. The generation of one new epitope and the intra-epitope variability in Pakistani sequences resulted from the evolutionary acquisition of mutations in these strains over the years (Table-II).
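
The comparison of predicted epitopes between the prototype and the isolates (conserved, modified, newly generated or obliterated) can be expressed as a small sketch. It assumes that epitopes are matched by their start position in the E2 sequence, which is an assumption made for illustration and not a rule stated in the study; the function name `classify_epitopes` and the peptides are hypothetical.

```python
def classify_epitopes(reference: dict[int, str], isolate: dict[int, str]) -> dict[str, list[int]]:
    """Classify predicted epitopes by start position.

    `reference` and `isolate` map an epitope start position to its peptide.
    An epitope at a shared position is 'conserved' if the peptides are
    identical and 'modified' if they differ (intra-epitope variability).
    """
    conserved, modified = [], []
    for position in sorted(reference.keys() & isolate.keys()):
        if reference[position] == isolate[position]:
            conserved.append(position)
        else:
            modified.append(position)
    return {
        "conserved": conserved,
        "modified": modified,
        "new": sorted(isolate.keys() - reference.keys()),
        "obliterated": sorted(reference.keys() - isolate.keys()),
    }


# Toy predictions, invented for illustration (not the epitopes in Table-I or Table-II).
prototype_epitopes = {5: "AKLMNPQRS", 40: "GHIKLMNPQ", 90: "TTVWYACDE"}
isolate_epitopes = {5: "AKLMNPQRS", 40: "GHIKLTNPQ", 120: "CDEFGHIKL"}
print(classify_epitopes(prototype_epitopes, isolate_epitopes))
```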

DISCUSSION
CHIKV epidemics accompanied by the evolution of novel mutations are a major global health concern. Mutations in the E1 protein have been shown to be the main reason for the epidemic potential of CHIKV re-emergence, owing to altered vector specificity and infectivity. 15 Several second-step mutations in E2 that alter the fitness of CHIKV in the mosquito vector have been documented. 16,17 In this study, the Ae. albopictus adaptive mutation K252Q was not observed, although it was reported from 8 CHIKV isolates sequenced from the early outbreak in Pakistan. 18 Instead, V264A, which enhances the fitness of CHIKV in Ae. aegypti, was consistently present in all samples. 19 This suggests some involvement of Ae. albopictus in the early outbreak and a major contribution of Ae. aegypti as the main vector for CHIKV transmission in Pakistan. Most of the mutations observed in E2 were previously reported in ECSA strains around the world, including eight Pakistani sequences from the early outbreak. Specifically, V386A & I211T, which are most frequently present in the ECSA.IOL lineage, were also observed in this study, indicating that the current CHIKV outbreak in Pakistan was caused by the CHIKV ECSA.IOL lineage. We also observed six novel mutations in E2, which is known to be the most variable protein of CHIKV, indicating further viral evolution in the community due to selective pressure. 20

This study also predicted CHIKV E2 antigen processing and presentation by an in silico approach. The cellular proteasome cleaves antigens at preferred sites; the resulting peptides bind class-I HLA molecules and are presented on the surface of antigen-presenting cells (APCs), where CTLs recognize them and elicit an immune response. Mutations upstream or downstream of epitopes can alter the CTL response, which may impair or, conversely, improve protection against CHIKV infection. 21,22 The evolutionary acquisition of mutations in CHIKV E2 generated unique proteasomal cleavage sites in the current CHIKV Pakistani strains (n=142 sites) compared with the ancestral S27 strain (n=135 sites). The novel mutations created one new site and obliterated three sites, changes that were not reported in the early outbreak from Pakistan. Nevertheless, the increase in the number of proteasomal sites in current strains indicates more effective antigen processing for presentation than in ancestral strains. 23 These mutations also influenced the generation of CTL epitopes, with three new epitopes & the obliteration of one epitope in Pakistani strains. Intra-epitope variation also affected the HLA restriction of these epitopes. The SNV S375T in the S27 epitope VSVASFILL allowed it to be restricted by two additional HLAs in current Pakistani strains. Similarly, the SNVs V264A and M267R generated an additional MHC-I binding site (Table-I). The epitopes STKDNFNVY & SRPQHGKEL were found to be the most immunogenic in both strains, indicating the highly conserved nature of these sites throughout viral evolution.

The currently circulating Pakistani strains also have one additional B-cell epitope and intra-epitope variability in others. The generation of the epitope and the variability of the B-cell epitope at position 207 are most probably due to the mutation I211T, while the epitope variability at position 131 is presumably due to N160T and A164T, as mutations in epitope regions are known to alter the immunogenicity of viruses. 24,25 Moreover, population coverage analysis showed that these epitopes have 69.4% coverage in the Pakistani population, which has never been evaluated before for CHIKV. This might be one of the reasons that the CHIKV outbreak was not sustained for a longer period in Pakistan.
Moreover, sequencing of only E2-gene

Table notes: Epitopes specific for the S27 strain are marked red and those specific for Pakistani isolates are marked blue. Epitope variations are marked green.

Bilal Ahmed Khan et al.