During most eukaryotic translation, the small (40S) ribosomal subunit scans an mRNA from its 5' end until it encounters the first AUG start codon. The large (60S) ribosomal subunit then joins it to begin protein synthesis. Because an mRNA may contain several potential initiation sites, where translation actually begins is largely determined by the nucleotides flanking the start codon. Marilyn Kozak discovered that the sequence RCCAUGG (where R stands for a purine, either adenine or guanine) is an optimal recognition sequence for translation initiation. Positions are numbered relative to the A of the AUG, which is +1: the purine at position -3 and the guanine at position +4 are highly conserved throughout animal and plant species and regulate the initiation of protein synthesis. If the first start codon lacks a purine at -3 and a guanine at +4, it is said to be in a weak context. For example, the peanut clump virus contains an RNA that encodes two proteins, p23 and p39. The first start codon, which initiates p23 synthesis, lies in a weak recognition sequence, CUUAUGU. Around 30% of ribosomes skip this first start codon and instead initiate translation at a downstream start codon, producing the second protein, p39. This initiation of translation at an alternative downstream site is known as leaky scanning and has been observed in mRNAs of mammals, plants, and viruses.
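The -3/+4 rule described above can be sketched in a few lines of Python. This is a minimal illustration, not an established tool: the function name `kozak_context` is hypothetical, and the "intermediate" label for sequences that satisfy only one of the two positions is an assumption added here, since the text defines only the optimal and the weak case.

```python
PURINES = {"A", "G"}

def kozak_context(mrna: str, aug_index: int) -> str:
    """Classify the context of the AUG that starts at aug_index (0-based).

    Hypothetical helper: "strong" means purine at -3 and G at +4,
    "weak" means neither; "intermediate" (only one of the two) is an
    assumption of this sketch, not a category taken from the text.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")

    # The A of AUG is +1, so -3 is three nucleotides upstream of it
    # and +4 is the nucleotide immediately after the G of AUG.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else None
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else None

    has_purine = minus3 in PURINES
    has_g = plus4 == "G"
    if has_purine and has_g:
        return "strong"
    if has_purine or has_g:
        return "intermediate"
    return "weak"
```

Applied to the peanut clump virus sequence quoted above, `kozak_context("CUUAUGU", 3)` reports a weak context (C at -3, U at +4), whereas either form of the optimal RCCAUGG sequence, such as `ACCAUGG`, is reported as strong.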
The position of the start codon relative to other elements in the transcript can also cause leaky scanning. If the first start codon lies fewer than 12 nucleotides from the 5' end of the transcript, it may be skipped. Skipping can also occur when two AUG start codons are closely spaced, as in segment 6 of the influenza B virus, where the two start codons are separated by only 4 nucleotides.
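The two positional features mentioned above, the length of the leader before the first AUG and the spacing between consecutive AUGs, can be measured with a short sketch like the following. The function names are hypothetical, the example sequence is invented for illustration and is not a real viral sequence, and the sketch assumes that "separated by N nucleotides" counts the nucleotides lying between the end of one AUG and the start of the next.

```python
def find_augs(mrna: str) -> list[int]:
    """Return the 0-based start index of every AUG in the sequence."""
    seq = mrna.upper().replace("T", "U")
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]

def describe_positions(mrna: str, min_leader: int = 12):
    """Summarise positional features that may favour leaky scanning.

    min_leader=12 follows the threshold quoted in the text; the gap
    convention (nucleotides strictly between two AUGs) is an assumption.
    """
    augs = find_augs(mrna)
    if not augs:
        return None
    first = augs[0]  # equals the number of nucleotides 5' of the first AUG
    gaps = [b - (a + 3) for a, b in zip(augs, augs[1:])]
    return {
        "leader_length": first,
        "short_leader": first < min_leader,
        "gaps": gaps,
    }

# Invented example: a 2-nt leader and two AUGs separated by 4 nucleotides.
summary = describe_positions("GGAUGCCCCAUGCC")
```

Here `summary["short_leader"]` is `True` and `summary["gaps"]` is `[4]`. The sketch only reports the measurements; it does not predict how often a ribosome would actually skip the first AUG.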
When the two start codons are in the same reading frame, leaky scanning enables organisms to produce different isoforms of a protein. The mammalian glucocorticoid receptor gene is a good example: two isoforms of the protein are produced, the larger 94 kDa GR1 and the smaller 91 kDa GR2. Despite its smaller size, GR2 is twice as efficient as GR1 in gene transactivation. If, on the other hand, the first and downstream start codons are in different reading frames, completely different proteins can be produced. For example, the segment 2 mRNA of the influenza A virus can encode two different proteins. The first is a core component of the viral polymerase and is necessary for virus replication; the second promotes apoptosis and is not essential for virus replication.
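Whether two start codons share a reading frame depends only on whether the distance between them is a multiple of three. A minimal sketch of that check follows; the function name `frame_relationship` and the example indices are hypothetical and do not correspond to real positions in the glucocorticoid receptor or influenza transcripts.

```python
def frame_relationship(first_aug: int, downstream_aug: int) -> str:
    """Compare the reading frames of two start codons.

    Both arguments are 0-based indices of the A in each AUG.
    """
    if downstream_aug <= first_aug:
        raise ValueError("downstream_aug must lie 3' of first_aug")
    # A distance divisible by three keeps both codons in the same frame.
    offset = (downstream_aug - first_aug) % 3
    if offset == 0:
        return "in-frame"
    return f"out-of-frame (+{offset})"

# Hypothetical indices, chosen only to show both outcomes.
isoform_case = frame_relationship(0, 81)
distinct_case = frame_relationship(0, 94)
```

With these inputs `isoform_case` is `"in-frame"`, the situation in which the downstream start yields an N-terminally shortened isoform, and `distinct_case` is `"out-of-frame (+1)"`, the situation in which the downstream start yields an unrelated protein. A fuller model would also confirm that no in-frame stop codon lies between the two AUGs.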