Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): Codon Usage and Replicative Fitness

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) codon usage, as shown by the polyprotein coding sequence, shows better translation potential in the human host when compared with human coronavirus OC43 (HCoV-OC43) codon usage. Such translational advantage might facilitate SARS-CoV-2 replication, immunogenicity, and pathogenicity, thus also accounting for the less harmful character of HCoV-OC43 infection.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection causes a respiratory syndrome with altered pulmonary and alveolar function that can evolve into acute respiratory insufficiency and death. 1 Progressive immune-associated injury is a hallmark of SARS, 2 and the alteration of lung function is possibly due to specific autoimmune cross-reactions 3,4 against alveolar surfactant-related proteins, 5 with a higher antibody titer independently associated with a worse clinical classification. 6 In contrast, human coronavirus OC43 (HCoV-OC43) is generally associated with milder conditions such as the common cold. 7 Currently, the molecular determinants and mechanisms underlying this difference in pathogenic load are unknown.
Based on previous reports 8,9 suggesting that rare host codons can inhibit viral protein expression and favor viral latency, this study investigated codon usage in HCoV-OC43 and SARS-CoV-2. Specifically, usage of the 61 amino acid (aa)-specifying codons was analyzed in the open reading frame (ORF) of the HCoV-OC43 and SARS-CoV-2 polyprotein (also known as orf1ab) and was then compared with the codon usage of the human ORFeome. 10 The main results are reported in ►Table 1, which shows the different usage of a set of eight codons, whereas full data for the 61 codons in Homo sapiens, HCoV-OC43, SARS-CoV-2, SARS-CoV, and Middle East respiratory syndrome coronavirus (MERS-CoV) are detailed in ►Supplementary Table S1 (online only).
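The codon tally described above can be sketched as follows. This is a minimal illustration only, not the procedure actually used in the study: the function name `codon_usage`, the choice of relative frequency (fraction of all sense codons) as the usage measure, and the toy sequence are assumptions introduced here.

```python
from collections import Counter

# The three stop codons are excluded, leaving the 61 aa-specifying codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_usage(orf):
    """Return the relative frequency of each sense codon in an in-frame ORF.

    Assumption: usage is expressed as a fraction of all sense codons;
    the source does not state which unit was used.
    """
    seq = orf.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Keep only unambiguous sense codons.
    sense = [c for c in codons if c not in STOP_CODONS and set(c) <= set("ACGT")]
    counts = Counter(sense)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy sequence (hypothetical, not viral data): ATG GAT GAT TTT GAT TAA
toy_usage = codon_usage("ATGGATGATTTTGATTAA")
print(toy_usage)
```

Applied to a full polyprotein ORF and to a set of human ORFs, the same function would give the two usage profiles that are then compared codon by codon.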
►Table 1 shows the following:
• Eight codons are used often in the HCoV-OC43 polyprotein ORF but occur to a lesser extent in the H. sapiens ORFeome, giving a human-to-viral usage ratio smaller than 1. From the translational point of view, this ratio is unfavorable to HCoV-OC43, since the optimal value for HCoV-OC43 polyprotein synthesis in the human host is approximately 1. 8,9,11,12
• The human-to-viral usage ratio remains suboptimal for translational expression in the three HCoV-OC43 isolates collected in 1987, 1990, and 2011.
• Usage of the eight codons is lower in the SARS-CoV-2 polyprotein ORF, so that the human-to-viral usage ratio moves closer to 1 and is more suitable for translation of the viral polyprotein in the human host.
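The human-to-viral usage ratio referred to in these points can be illustrated with a short sketch. The function names, the threshold argument, and the labels `codon_A` and `codon_B` with their numbers are hypothetical placeholders chosen for illustration; they are not values from ►Table 1.

```python
def human_to_viral_ratio(human_usage, viral_usage):
    """Divide human usage by viral usage for every codon the virus uses."""
    ratios = {}
    for codon, viral in viral_usage.items():
        if viral > 0:  # avoid division by zero for codons absent from the virus
            ratios[codon] = human_usage.get(codon, 0.0) / viral
    return ratios


def overused_by_virus(ratios, threshold=1.0):
    """Codons with a ratio below the threshold are used more by the virus
    than by the host, i.e., they are relatively rare in the host."""
    return sorted(codon for codon, r in ratios.items() if r < threshold)


# Made-up placeholder frequencies, NOT data from the study.
human = {"codon_A": 0.02, "codon_B": 0.04}
viral = {"codon_A": 0.05, "codon_B": 0.01}

ratios = human_to_viral_ratio(human, viral)
print(ratios)
print(overused_by_virus(ratios))
```

Under this reading, a ratio near 1 means that the virus and the host use the codon at similar rates, whereas a ratio well below 1 marks a codon that the virus uses far more than the host does.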
In summary, in the context of CoV polyprotein expression, ►Table 1 documents that eight codons are more often used in HCoV-OC43 polyprotein ORF than in the human ORFeome and might represent a translational constraint for HCoV-OC43 polyprotein expression, thus limiting HCoV-OC43 replication, diffusion, and pathogenicity, given the essential role of coronavirus polyprotein for generation of viral progeny. 13,14
The possibility of translational block or delay emerges when considering the recurrence of the eight suboptimal codons along the HCoV-OC43 polyprotein ORF. Among many, two examples from the HCoV-OC43 polyprotein (isolate 2011) are the following aa sequences: (1) YDDVNASLFVDYSNL, which is coded by TAT-GAT-GAT-gtt-AAT-gct-AGT-TTG-TTT-gtg-GAT-TAT-AGT-AAT-TTG, where codons shown in capitals are abundant in the viral polyprotein ORF but not in the human ORFeome and 12 codons out of 15 are suboptimal for translation in the human host, and (2) IDDHRITSITSDKFDFII, which is coded by the viral nucleotide sequence ATT-GAT-GAT-cat-CGT-atc-act-AGT-ATT-act-AGT-GAT-aag-TTT-GAT-TTT-ATT-ATT, where 13 codons out of 18 are suboptimal. Clearly, the translation potential of such HCoV-OC43 coding sequences diminishes progressively and severely along the succession of suboptimal codons.
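The two fragments quoted above can be checked with a small sketch that translates each codon string with the standard genetic code, counts the capitalized (suboptimal) codons, and measures the longest uninterrupted run of them. The function names are introduced here for illustration, and the run length is an assumed, simple proxy for clustering rather than a measure used in the source.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codon strings as quoted in the text; capitals mark the suboptimal codons.
# The first peptide is written without the hyphen, treated as a line-break artifact.
FRAGMENTS = {
    "YDDVNASLFVDYSNL":
        "TAT-GAT-GAT-gtt-AAT-gct-AGT-TTG-TTT-gtg-GAT-TAT-AGT-AAT-TTG",
    "IDDHRITSITSDKFDFII":
        "ATT-GAT-GAT-cat-CGT-atc-act-AGT-ATT-act-AGT-GAT-aag-TTT-GAT-TTT-ATT-ATT",
}


def summarize(codon_string):
    """Return (peptide, number of capitalized codons, total codons)."""
    codons = codon_string.split("-")
    peptide = "".join(CODON_TABLE[c.upper()] for c in codons)
    flagged = sum(1 for c in codons if c.isupper())
    return peptide, flagged, len(codons)


def longest_flagged_run(codon_string):
    """Length of the longest stretch of consecutive capitalized codons."""
    best = current = 0
    for codon in codon_string.split("-"):
        current = current + 1 if codon.isupper() else 0
        best = max(best, current)
    return best


for expected_peptide, codon_string in FRAGMENTS.items():
    peptide, flagged, total = summarize(codon_string)
    assert peptide == expected_peptide  # codons encode the quoted aa sequence
    print(peptide, f"{flagged}/{total}", longest_flagged_run(codon_string))
# prints: YDDVNASLFVDYSNL 12/15 5
# prints: IDDHRITSITSDKFDFII 13/18 5
```

Across both fragments, the capitalized codons comprise eight distinct triplets (TAT, GAT, AAT, AGT, TTG, TTT, ATT, and CGT).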
To conclude, the data suggest a link between CoV codon usage and CoV pathogenicity in humans. Usage of relatively rare human codons and their clustering along viral sequences can represent a major translational block underlying the low expression, low immunogenicity, and low pathogenicity of HCoV-OC43. Conversely, translation potential is rescued in the SARS-CoV-2 polyprotein ORF by the lower usage of rare human codons (►Table 1). When compared with HCoV-OC43, the higher translatability of SARS-CoV-2 correlates with higher viral protein expression and a greater capability of evoking immune responses. Consequently, anamnestic immune responses of increased avidity and affinity can also arise and lead to autoimmune pathologies in case of repeated exposures to SARS viruses, 15–18 according to the recently clarified phenomenon of immunologic memory imprinting, also known as Original Antigenic Sin. 19 More generally, the data further support the translational control of viral protein expression as a mechanism by which the human host can silence and tolerize viral invasion. 8,9 Hence, this study warns that microbiology methodologies such as codon optimization, insertion or modification of translational enhancers, and addition of viral vectors, among others, can increase the expression, replicative fitness, diffusion, and pathogenicity of the infectious agents under study, thus altering the finely tuned equilibrium between immunogenicity and immunotolerance. 20,21
Funding
None.