Pan-Hantavirus CD4+ Epitope Identification for Broad-Spectrum Vaccine Design

Using immunoinformatics approaches to identify conserved CD4+ epitopes for potential pan-hantavirus vaccine constructs.

Independent Research | Vaccine Design | Immunoinformatics | 2021–2023

Project Motivation

Hantaviruses are an emerging public health threat: infections can carry high mortality, and the viruses combine a high intrinsic mutation rate with the ability to recombine. This motivates the development of vaccines that protect broadly across hantavirus strains rather than against a single strain.

This project explored whether conserved CD4+ epitopes could be identified across known hantavirus genomes with the long-term goal of informing pan-hantavirus vaccine design.

My Role

This was an independent research project conducted under the mentorship of Dr. Melissa Willis at the Academies of Loudoun.

My work focused on computational epitope discovery, sequence analysis, immunoinformatics pipeline development, and biological interpretation of candidate vaccine targets.

Approach

An immunoinformatics workflow was developed to identify conserved candidate CD4+ epitopes across indexed hantavirus genomes.

1. Genome Acquisition
Collection of indexed hantavirus genomes from NIH GenBank.

2. Sequence Alignment
Multiple sequence alignment to identify conserved regions across species using NCBI COBALT.

3. Epitope Isolation
Isolation of candidate 11+ mer conserved peptide sequences using EMBOSS cons.

4. MHC-II Binding Prediction
Computational prediction of CD4+ MHC-II binding potential using the Immune Epitope Database Tools.

5. Epitope Filtering
Screening for antigenicity, allergenicity, and toxicity using VaxiJen V.2.0, AllerTop V.2.0, and ToxinPred.

6. Candidate Selection
Determination of conserved candidate epitopes for future analysis.
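The epitope isolation step can be illustrated with a minimal Python sketch. It is not the original pipeline: it assumes that `x` in the consensus marks a position with no consensus residue, that lowercase residues still count as part of a conserved run, and that candidate peptides are 15 residues long (the length of the peptides in the candidate table). The function names `conserved_runs` and `windows` are hypothetical, and the input is a short fragment of the conserved sequence shown under Results.

```python
import re


def conserved_runs(consensus: str, min_len: int = 11) -> list[str]:
    """Return maximal stretches without 'x' that are at least min_len long.

    Assumption: 'x' marks a position with no consensus residue.
    """
    return [run for run in re.split(r"x+", consensus) if len(run) >= min_len]


def windows(run: str, size: int = 15) -> list[str]:
    """Return every overlapping peptide of length `size` within one run.

    Runs shorter than `size` yield an empty list.
    """
    return [run[i:i + size] for i in range(len(run) - size + 1)]


# Short fragment copied from the conserved sequence in this write-up.
fragment = "PGxHGWATIALLITFCFGWLLIPxIT"

runs = conserved_runs(fragment)
# Uppercase so weakly conserved (lowercase) residues are treated the same.
peptides = [p for run in runs for p in windows(run.upper())]

for peptide in peptides:
    print(peptide)
```

Each peptide produced this way would then be submitted to the binding and filtering tools in steps 4 and 5. The 11-residue minimum and the 15-residue window are parameters, so other peptide lengths can be tried without changing the logic.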

Results

Computational analysis identified nine candidate conserved CD4+ epitopes that satisfied filtering criteria for predicted MHC-II binding, antigenicity, allergenicity, and toxicity.


Figure 1. Multiple sequence alignment of indexed hantavirus genomes used to identify conserved regions across hantavirus species.

Conserved Sequence
xxxxxxxxxxxlxLLxVLxxvxxxxxxxxRNVYELKLECPHTVxxxxGExxxxxxxxxVxGSVELPxIxLxEVxxxxxxSLKxIESSCNFDIHxSxxxxQxFTQVtWxKKAdxxxTxNASSTTFExxSsEVNLKGxxxxxxxLxxxxxCVIxxxIIExxxKxxxxRKTVICYDLSCNQTxxCKPTLHLIAPIxxxxxxxxxCxxMKSCLIxLGxxxxxxxxxxxxxxRIQVVYEKTYCVGMLxVEGKxCFxPxxTLxxxxxxxxxxxxSxxxxxxxxxxxxxxxxxxYDvxxxxxxxxxxxxTLPVxCFLxxxIaKKxxxxxxxxxxxxxxxxxxxxxxKIxExlEKIxxxxxkxxCTxxxxxENxxQGYYVCxIGxNSExIxVPSxDDxRSxExxIxxxxLSrMxxSPHGEDHxxxxxxDxxxxxxxSLRIAGxxxxIExxxxxKVxPxTESSDxLxxxQGIAFSGxPMYSSLxxSVLxKxDPxxKYVFSPGIIxxxxxPxxNxSxxxxxxxxxCDKKxLPLTWTGYxxIxIPGxxEKIxxxxxxxxxxxCTVFCTLSGPGASCEAYSExxxxxxxGIFNISSPTCLVNKxxRFRbSEQQIxFVCQRxxxxxxxxVDxDIVVYCxxNGQKKVILTKTLVIGQCIYTxTSLFSLLPxVAHSLAVELCVPGxHGWATIALLITFCFGWLLIPxITxIILKIxxxLKxIxxIxxxxxYNxESKFKxILEKIKEEYQKTMGSMVCDVCxxxxxxxxxxxKHECETxKELKAHKKSCPQGQCPYCMxxxExTESALQAHYKxVCKLTxxRFQEDLKKSIxxxxxxQxxGCxYRTLNIFRYKSRCYIxxVWIILLxIEsIIWAASAExxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxLEPxWNDxAxxxxxxxxxxxxxxxxxxxxxxxxHGVGxIPMxTDLELDFSLxxxxxxxxxxxxSSSxYTYRRKLxNPxNEEExIpFxHIQIEKQVIxAEVQxLGHWMDAxxxxNIKTAFHCYGACxxKYSYPWQTAxxxxxxxxCxFEKDYQYETxWGCNPxDxCPGVxxxxxxGTGCTACxGIYLDKLKxxxxxSVGxAYKIISLKYTRKVCVQLGxExxCKxIDSNDxxxxxxxCLVTxNVKVCMIGTVSKFxxQxGDTLLFLGPLExGGLIxKQWCTTTCQFGDPGDIMSxxNxGxxCPEYxGSFRKKCxFATTPVCEYxGNxVSGYKRMMATKDSFQSFNVTDxHxTxNKxxxxxxxxxxxxxLEWxDxxxxxxxxPDGxLRDHINIVVNxxxxxRDIxFxxxxxxxxxxxEDLSENPCKVxLqTxSIEGAWGSGVGFTLtCxVSLTECSxxTFLTSIKACDxxxAMCYGATSVTLVRGQNxTVxVxGKGGHSGSxxxxxkFKCCHDxDCSxxGLLASAPxxxxxxHLDRVTGxNQIDNDKVYDDGAPeCGIxCWFxKSGEWLmGILxNGNWMVVVVLIVILIISIILxSfLCPVRKxKKxxxxxxxxxxxxxxxxxxxx

Figure 2. Conserved protein sequence identified across indexed hantavirus genomes. 11+ mers highlighted in orange. Origin of relevant epitopes highlighted in blue.

ID   Sequence         MHC-II Binding  IFN-γ Binding  Antigenicity
R23  WMVVVVLVVILIISI  15.36           0.5503569      0.7584
R24  MVVVVLVVILIISII  14.44           0.70655736     0.5976
R25  VVVVLVVILIISIIL  8.714           0.83196286     0.5386
W35  HGWATIALLITFCFG  32.8            0.48341312     0.5922
W36  GWATIALLITFCFGW  36.2            0.86239409     1.0371
W37  WATIALLITFCFGWL  33.18           0.55717861     0.9842
W38  ATIALLITFCFGWLL  33.18           0.55775663     0.7756
W39  TIALLITFCFGWLLI  37.78           0.74836939     0.8847
W40  IALLITFCFGWLLIP  39.78           0.66385574     0.8248

Figure 3. Candidate conserved CD4+ epitopes following computational filtering.
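The filtering step can be sketched in Python using the rows of the candidate table. This is an illustration only: the cut-offs used in the project are not stated here, so `ANTIGENICITY_THRESHOLD = 0.4` is an assumed value, and the `Candidate` class and `parse_rows` helper are hypothetical names. The MHC-II and IFN-γ columns are carried along but not thresholded, because their units and cut-offs are not given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    epitope_id: str
    sequence: str
    mhc_ii: float
    ifn_gamma: float
    antigenicity: float


# Rows copied from the candidate table in this write-up.
ROWS = """\
R23 WMVVVVLVVILIISI 15.36 0.5503569 0.7584
R24 MVVVVLVVILIISII 14.44 0.70655736 0.5976
R25 VVVVLVVILIISIIL 8.714 0.83196286 0.5386
W35 HGWATIALLITFCFG 32.8 0.48341312 0.5922
W36 GWATIALLITFCFGW 36.2 0.86239409 1.0371
W37 WATIALLITFCFGWL 33.18 0.55717861 0.9842
W38 ATIALLITFCFGWLL 33.18 0.55775663 0.7756
W39 TIALLITFCFGWLLI 37.78 0.74836939 0.8847
W40 IALLITFCFGWLLIP 39.78 0.66385574 0.8248
"""

# Assumed cut-off for illustration; not taken from the project itself.
ANTIGENICITY_THRESHOLD = 0.4


def parse_rows(text: str) -> list[Candidate]:
    """Parse whitespace-separated rows into Candidate records."""
    candidates = []
    for line in text.strip().splitlines():
        epitope_id, sequence, mhc_ii, ifn_gamma, antigenicity = line.split()
        candidates.append(
            Candidate(epitope_id, sequence, float(mhc_ii),
                      float(ifn_gamma), float(antigenicity))
        )
    return candidates


candidates = parse_rows(ROWS)
passing = [c for c in candidates if c.antigenicity >= ANTIGENICITY_THRESHOLD]
# Rank the surviving candidates from most to least antigenic.
ranked = sorted(passing, key=lambda c: c.antigenicity, reverse=True)

for c in ranked:
    print(f"{c.epitope_id}\t{c.sequence}\t{c.antigenicity:.4f}")
```

Keeping each score as a named field makes it straightforward to add further filters, such as allergenicity or toxicity flags, once their outputs are recorded alongside each peptide.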

Discussion & Limitations

The computational workflow identified nine candidate conserved CD4+ epitopes that satisfied the filtering criteria for predicted MHC-II binding, antigenicity, allergenicity, and toxicity. These findings initially suggested that conserved regions capable of supporting broad-spectrum hantavirus vaccine design may exist across multiple hantavirus species.

However, subsequent analysis revealed that the majority of the conserved candidate sequences were localized within highly hydrophobic transmembrane regions associated with the viral glycoprotein and membrane interface. While these regions demonstrated strong sequence conservation, their hydrophobicity and membrane localization likely limit both synthesis feasibility and practical immunogenicity in vivo.
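One way to put a number on this hydrophobicity is the grand average of hydropathy (GRAVY), the mean Kyte-Doolittle hydropathy value over a peptide's residues. The Python sketch below applies it to the nine candidate sequences from the table. It is an illustration rather than the analysis performed in the project, and the `gravy` function name is hypothetical.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Candidate epitopes copied from the table in this write-up.
CANDIDATES = {
    "R23": "WMVVVVLVVILIISI",
    "R24": "MVVVVLVVILIISII",
    "R25": "VVVVLVVILIISIIL",
    "W35": "HGWATIALLITFCFG",
    "W36": "GWATIALLITFCFGW",
    "W37": "WATIALLITFCFGWL",
    "W38": "ATIALLITFCFGWLL",
    "W39": "TIALLITFCFGWLLI",
    "W40": "IALLITFCFGWLLIP",
}


def gravy(peptide: str) -> float:
    """Mean Kyte-Doolittle hydropathy; positive values are net hydrophobic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


scores = {epitope_id: gravy(seq) for epitope_id, seq in CANDIDATES.items()}

for epitope_id, score in scores.items():
    print(f"{epitope_id}\t{score:+.3f}")
```

All nine candidates score above zero on this scale, which is consistent with their location in hydrophobic membrane-spanning regions. GRAVY is only a coarse summary, so a dedicated transmembrane predictor would be needed to confirm membrane localization.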

The location of the conserved sequences indicates that there is significant heterogeneity in glycoprotein structure, and that broad-spectrum vaccination is likely only possible for a smaller subset of hantaviruses. Vaccine efficacy is highly dependent on MHC-II and IFN-γ binding strength, and the inability to find epitopes closer to the binding region indicates that a multi-epitope vaccination strategy may not be the most effective approach.

Future Directions

Future work could focus on analyzing a more targeted set of clinically relevant hantavirus strains and evaluating whether functionally important viral motifs associated with infection could serve as more effective vaccine targets.

Additional investigation into the evolutionary dynamics and potential recombination patterns among hantavirus species may also help clarify which specific public health threats are likely to arise. This work could in turn help define what makes a hantavirus strain clinically relevant.

Reflection

This project was my first major introduction to scientific research and computational biology. Through this work, I learned not only how computational methods can support therapeutic discovery, but also how scientific progress often comes from understanding why an approach does not work as expected.

More broadly, the project introduced me to scientific thinking, biological complexity, and the interdisciplinary nature of modern quantitative biology. It played a major role in shaping my interest in computational biology and systems-level approaches to research.
