Basic Genetics of Genomes

The basic genetic structure of all animal retroviruses is the same. They all contain RNA sequences that code for the same three genes, abbreviated GAG, POL, and ENV. Flanking each end of the retroviral genome is a sequence of similar nucleotides called a long terminal repeat, or redundancy (LTR).

What sets the HIV genome apart from all other known retroviruses is the number of genes in HIV and the apparent complexity of their interactions in regulating the expression of the GAG, POL and ENV genes.

The HIV genome contains at least nine recognizable genes: GAG, POL, ENV, VIF, TAT, REV, NEF, VPR, and VPU. Five of the nine genes are involved in regulating the expression of the GAG, POL and ENV genes.


GAG Gene

GAG stands for group-specific antigens, the proteins that make up the viral nucleocapsid. The GAG gene codes for the production of the dense cylindrical core proteins. These include p24, a nucleoid shell protein, and several internal proteins, including p7, p15, p17, and p55. The GAG protein can direct the formation of virus-like particles even when the other major genes (POL and ENV) are absent, and for this reason it has been called the virus particle-making machine.


POL Gene

The POL gene codes for three enzymes: protease (p10); the virus-associated polymerase, which is active in two forms, p51 and p66; and endonuclease (integrase, p31). The integrase enzyme cuts the cell’s DNA and inserts the HIV DNA. Loss of the LTRs and the 3′ side of the POL gene stops viral DNA integration into the host genome. Nonintegrated DNA, without its LTRs and integrase enzyme, can still produce new viruses. Viral DNA integration is therefore not essential for viral multiplication, even though integration is part of the normal course of events.

The regulation of HIV transcription appears to be intimately related to the onset of HIV disease and AIDS. Interruption or inactivation of the POL gene appears to have therapeutic effects. Protease inhibitors, which block the protease encoded by POL, are now a common class of drugs on the market.


ENV Gene

The ENV gene codes for two major envelope glycoproteins, gp120 and gp41, that become embedded throughout the host cell membrane. gp41 is the transmembrane glycoprotein that anchors gp120 to the surface of HIV. Envelope glycoproteins enhance T helper cell death by causing the formation of syncytia. In a syncytium, healthy T cells fuse to each other, forming a group around a single HIV-infected T4 cell. Individual T cells within these syncytia lose their immune function. Starting with a single HIV-infected T4 helper cell, as many as 500 uninfected T4 helper cells can fuse into a single syncytium.


The Six Other Genes

The six additional genes, TAT, REV, and NEF (regulatory genes) and VIF, VPU, and VPR (auxiliary genes), working together with the host cell’s machinery, actually control the retroviral reproductive cycle by:

  • Adhering to the cell
  • Penetrating the cell
  • Uncoating the HIV genome
  • Carrying out reverse transcription of the RNA genome, producing proviral DNA, and then either immediately producing viral RNA or carrying out the integration of the provirus and later viral multiplication

The HIV proviral genome has been well characterized with regard to gene location and sequence, but the function of each gene is not completely understood. The genes for producing regulatory proteins can be grouped into two classes:

  • Genes that produce proteins essential for HIV replication: TAT and REV.
  • Genes that produce proteins that perform accessory functions that enhance replication and/or infectivity: VIF, NEF, VPR, and VPU.
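
The grouping above can be written down as a small lookup table. The following is a minimal Python sketch, not part of the original text: the names `HIV_GENES` and `genes_in_class` are hypothetical, the label "major" for GAG, POL, and ENV is an assumption borrowed from the wording used earlier, and the other two labels simply restate the two classes listed above.

```python
# Illustrative lookup table. The names are hypothetical; the classes follow
# the two-way grouping in the text, plus an assumed "major" label.
HIV_GENES = {
    "GAG": "major",
    "POL": "major",
    "ENV": "major",
    "TAT": "essential regulatory",
    "REV": "essential regulatory",
    "VIF": "accessory",
    "NEF": "accessory",
    "VPR": "accessory",
    "VPU": "accessory",
}


def genes_in_class(gene_class):
    """Return the gene names assigned to gene_class, in table order."""
    return [name for name, cls in HIV_GENES.items() if cls == gene_class]


if __name__ == "__main__":
    for cls in ("major", "essential regulatory", "accessory"):
        print(cls, "->", ", ".join(genes_in_class(cls)))
```

A flat dictionary is enough here because each gene is placed in exactly one class; a richer structure would be needed to record gene products or overlapping roles.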

VIF Gene

The VIF gene is associated with the infectious activity of the virus. VIF may also be involved in viral replication, but it doesn’t appear to influence the production of GAG-POL-ENV proteins.

TAT Gene

The TAT gene is one of the first viral genes to be transcribed. It produces a transactivator protein, that is, a protein that exerts its effect on viral replication from a distance rather than by interacting with the genes adjacent to TAT or their gene products. TAT contains two coding regions, or exons, which, with the help of the LTR sequences, increase the expression of HIV genes, thereby increasing the production of new virus particles.

The TAT protein:

  • consists of 86 amino acids and binds to cadmium or zinc.
  • interacts with a short nucleotide sequence called TAR, located within the 5′ LTR region of HIV messenger RNA transcripts. Once the TAT protein binds to the TAR sequence, transcription of the provirus by cellular RNA polymerase II accelerates at least 1,000-fold.
  • can increase the number of HIV particles produced, but may not be essential for viral replication.

REV Gene

The REV gene is the regulator of expression of viral proteins. It contains two exons that together code for a 116-amino-acid protein. The REV gene selectively increases the synthesis of HIV structural proteins in the late stages of AIDS, thereby maximizing the production of new viruses. It regulates splicing of the HIV RNA transcript and transports spliced and unspliced RNAs from the nucleus to the cytoplasm.
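
Because each amino acid is specified by one three-nucleotide codon, the protein lengths quoted in this text imply a minimum coding length for each gene. The sketch below works this out in Python. The function name `min_coding_nucleotides` is hypothetical, the lengths (86 amino acids for TAT, 116 for REV) are taken from the text as given, and the count includes one stop codon while ignoring introns and untranslated regions.

```python
CODON_LENGTH = 3  # nucleotides per codon


def min_coding_nucleotides(amino_acids, include_stop=True):
    """Minimum number of coding nucleotides for a protein of the given length."""
    if amino_acids < 0:
        raise ValueError("amino_acids must be non-negative")
    # One codon per amino acid, plus an optional stop codon.
    codons = amino_acids + (1 if include_stop else 0)
    return codons * CODON_LENGTH


# Protein lengths as quoted in the text (assumed, not independently verified).
PROTEIN_LENGTHS = {"TAT": 86, "REV": 116}

if __name__ == "__main__":
    for gene, length in PROTEIN_LENGTHS.items():
        print(gene, min_coding_nucleotides(length))
```

Under these assumptions the two exons of TAT together need at least 261 coding nucleotides and those of REV at least 351.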

NEF Gene

The NEF gene, the negative regulator gene, produces a protein that is maintained in the cell cytoplasm next to the nuclear membrane. NEF is thought to make the cell more capable of producing HIV.

VPR and VPU Genes

The functions of the VPR gene, which codes for viral protein R, and the VPU gene, which codes for viral protein U, are not completely understood. The VPU protein, however, is required for the efficient assembly and release of new HIV particles.