Signal sequences are short amino acid sequences that guide newly synthesized proteins to their proper location within the cell. Classical signal sequences are fifteen to sixty amino acids long and are present at the N-terminus of a polypeptide chain. Each signal sequence has a conserved segment of basic residues towards its N-terminus, a hydrophobic core, and a C-terminus rich in polar residues. The C-terminus also contains the signal cleavage site and features a -3, -1 sequence motif: relative to the signal cleavage site (considered position 0), position -1 carries an amino acid with a short side chain, such as alanine, and position -3 carries an uncharged residue.
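The -3, -1 motif described above can be illustrated with a short Python sketch. It is only a toy check, not a signal peptide predictor: the function name, the set of "short side chain" residues (A, G, S), the set of charged residues (D, E, K, R), and the example sequence are all assumptions made for illustration and do not come from the text.

```python
# Toy check of the -3, -1 motif at a proposed signal cleavage site.
# Assumed for illustration: which residues count as "short side chain"
# and which count as "charged". One-letter amino acid codes are used.
SMALL_RESIDUES = set("AGS")      # assumed short-side-chain residues for -1
CHARGED_RESIDUES = set("DEKR")   # assumed charged residues, excluded at -3


def satisfies_minus3_minus1(sequence: str, signal_length: int) -> bool:
    """Return True if the residues at -1 and -3 fit the motif.

    signal_length is the number of residues in the proposed signal
    sequence, so cleavage would occur right after that many residues.
    """
    if signal_length < 3 or signal_length > len(sequence):
        raise ValueError("signal_length must be between 3 and len(sequence)")
    minus1 = sequence[signal_length - 1]  # last residue before the cut
    minus3 = sequence[signal_length - 3]  # two residues further upstream
    return minus1 in SMALL_RESIDUES and minus3 not in CHARGED_RESIDUES


# Hypothetical 19-residue signal sequence followed by 4 mature residues:
# basic residues near the N-terminus, a hydrophobic core, a polar C-terminus.
toy_protein = "MKRLLLLLAVLLALSSAQA" + "DTPE"

print(satisfies_minus3_minus1(toy_protein, 19))  # alanine at -1 and at -3
print(satisfies_minus3_minus1(toy_protein, 3))   # arginine at -1
```

On its own, a check like this accepts many positions that are not real cleavage sites, so it should be read as a restatement of the motif rather than a way to locate cleavage sites.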
Cellular organelles contain sorting receptors that recognize these sorting signals and guide the cargo into the correct compartment. Sorting receptors can be soluble, such as the nuclear import receptors, or membrane-bound, as observed in mitochondria, chloroplasts, the ER, and peroxisomes. After the proteins are delivered to their proper location, the sorting receptors are recycled for further rounds of protein sorting.
Inside the organelle, signal peptidases cleave the signal sequence of a newly delivered protein at its signal cleavage site. Some signal sequences are located internally within the polypeptide and remain part of the protein without being cleaved off, as found in many nuclear proteins. Furthermore, some signal sequences are rich in hydrophobic amino acid residues that help to anchor transmembrane proteins; these are called signal-anchor sequences. Mutation or removal of a signal sequence leads to defective routing of the protein, and such defects are associated with conditions such as inherited kidney diseases, autoimmune diseases, cardiovascular diseases, and several metabolic disorders.