Humans have classified and organized biological organisms for thousands of years, originally ordering mainly the objects necessary for survival. As human history progressed, so did the skill and detail in these classifications. In the fourth century B.C., Aristotle pioneered formal classification, delineating plants and animals into different groups and then dividing these further based on their physical characteristics and on traits like the habitats they occupy. Later, in the mid-1700s, Linnaeus built on Aristotle's system. He called his highest level of grouping the kingdoms and, from there, divided the groups using synapomorphies, defining physical features that split a branch. For example, if an animal possesses a backbone or a similar structure, it is placed in the phylum Chordata. If it doesn't, there are many other phyla into which animals without backbones can be placed, including the Arthropoda, a large group that includes insects. Linnaeus continued splitting groups of organisms based on their synapomorphies at subsequent levels through the class, order, family, and genus, until reaching the final designation, generally the species. We refer to Linnaeus' type of classification as cladistics, the classification of organisms based on differences in physical characteristics.

Today, scientists commonly construct trees called dendrograms to give visual representations of these splits and groups. This particular form of dendrogram, the cladogram, visualizes the cladistic relationships between species: the tips of the tree represent the species, and the branches show how they are related to one another. For example, here the chimpanzee and bear are more closely related to one another, and share more characteristics, than either of them does with the sunfish. The places where the branches meet are referred to as nodes and denote common ancestors of the species that follow. A second major dendrogram type is the phylogram. Phylograms differ from cladograms in that the length of the branches between species varies, representing the degree of change between them. The longer the branch, the more time has passed since the species diverged from their last common ancestor.

Historically, dendrograms were constructed simply by analyzing the morphology of organisms. With the advent of modern technology, comparing DNA has also become a common way to build trees. DNA is made up of nucleotides, each associated with one of four different bases: adenine, guanine, cytosine, or thymine. The order of these bases is the DNA code, and this code is passed from parent to offspring. Consequently, if you look across a single species like humans, there is a very high degree of similarity in the genetic code, around 99.9%. We also share some of our DNA code with other species, like chimpanzees and mice, but the degree of overall similarity between our DNA and theirs is vastly different. This means that we can create trees that group species based on the similarities or differences between their genetic codes. This field of analysis, combining statistics, mathematical modeling, and computer science, is referred to as bioinformatics. To compare DNA sequences, researchers often use a bioinformatics tool called the Basic Local Alignment Search Tool, or BLAST, which was created and is maintained by the National Center for Biotechnology Information.

In this laboratory, you will first create a cladogram of animals using morphological information, and then place a fossil species onto this cladogram based on its morphology. You will then use DNA sequences from several different modern-day relatives of the fossil and the BLAST database to verify your positioning of the fossil onto the tree.

Humans have been attempting to properly classify living things since Aristotle made the first attempt during the 4th century BC. Aristotle’s system was improved upon during the Renaissance and then, subsequently, by Carolus Linnaeus in the mid-1700s. These more formal classification and organization systems grouped species by their physical similarity to one another. For example, all vertebrates have a backbone, but invertebrates do not. Traits like the backbone are called synapomorphies: traits shared by a group of organisms, presumably because they were derived from a common ancestor. As we will explore, this method has been shown to have limitations and has more recently been amended to include genetic analysis. Still, scientists construct trees called dendrograms to create a visual representation of how species are related to one another and share common ancestors. These dendrograms can aid in our understanding of the evolutionary processes that drive these relationships. Genetic comparisons have added an important tool for guiding the analysis of evolutionary relationships.

Types of Dendrogram

One type of dendrogram, called a cladogram, depicts the hypothetical genealogical relationships between species, with the tips (or leaves) of the tree representing species and the branches showing how species are related to each other. A slightly more complicated type of tree, called a phylogram, differs from a cladogram in that the branches leading to the species are of different lengths. The length of a branch in this type of tree represents the degree of change between species: the longer the branch, the more time has passed since the species diverged from a common ancestor. In both types of tree, the common ancestor of a group of species is indicated by a node, which is the point where a series of branches meet. Species that are more closely related to each other (that most recently shared a common ancestor) will be located closest to the node. The two species that share a node are called a sister group¹.
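The terms above (tips, nodes, sister groups) can be made concrete with a small sketch. It assumes a cladogram stored as nested Python tuples, where each tuple is a node and each string is a tip; the tree uses the chimpanzee, bear, and sunfish example, and the function names are hypothetical, chosen only for illustration.

```python
# Each tuple is a node (a common ancestor); each string is a tip (a species).
# Assumed topology: chimpanzee and bear share a node; sunfish branches off earlier.
cladogram = ("sunfish", ("chimpanzee", "bear"))


def tips(node):
    """Return the set of species found at or below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def sister_pairs(node):
    """Return every pair of tips that directly share a node."""
    if isinstance(node, str):
        return []
    pairs = []
    if len(node) == 2 and all(isinstance(child, str) for child in node):
        pairs.append(tuple(sorted(node)))
    for child in node:
        pairs.extend(sister_pairs(child))
    return pairs


def smallest_clade(node, a, b):
    """Return the tips under the most recent common ancestor of a and b."""
    if isinstance(node, str):
        return None
    # Prefer a deeper node if one of the children already contains both tips.
    for child in node:
        found = smallest_clade(child, a, b)
        if found is not None:
            return found
    here = tips(node)
    return here if {a, b} <= here else None


print(sister_pairs(cladogram))
print(smallest_clade(cladogram, "bear", "sunfish"))
```

The smaller the clade returned by `smallest_clade`, the more recently the two species shared a common ancestor in this representation.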

Understanding Evolutionary Relationships using Genetic Data

Historically, cladograms were constructed by comparing the morphology (physical structure) of organisms. This method is still practiced, but the techniques have been modernized to include comparison of DNA (deoxyribonucleic acid) sequences between species. Using DNA to build trees has several advantages over relying solely on morphology, including the ability to estimate how long ago different species shared a common ancestor¹. However, using DNA is not always feasible, especially when trees include extinct organisms. DNA is best found in soft tissues, which are not preserved during the fossilization process, so a DNA sample from an extinct species is rarely available.
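One common way to turn DNA differences into a time estimate is the molecular clock idea. The sketch below assumes a strict clock, meaning substitutions accumulate at a constant rate along every lineage; the text does not say which method is used, and the function name and example numbers are hypothetical.

```python
def divergence_time(distance, rate):
    """Estimate years since two lineages split, assuming a strict molecular clock.

    distance: substitutions per site between the two sequences
    rate: substitutions per site per year along ONE lineage (assumed constant)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate changes independently after the split,
    # so the observed distance reflects twice the elapsed time.
    return distance / (2 * rate)


# Hypothetical values for illustration only: 2% sequence divergence and
# one substitution per site per billion years gives roughly 10 million years.
print(divergence_time(0.02, 1e-9))
```

Real estimates depend heavily on the chosen rate, which is usually calibrated against fossils of known age.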

DNA is passed on from parents to their offspring in hereditary units called genes. The nucleotide (A, G, C, and T) sequences of genes found in different species are frequently quite similar, likely because they came from a common ancestor. This fact allows researchers to align sequences from different species with one another to build the trees described above. Species with more similarity between their nucleotide sequences are placed next to each other in a tree, and species with less sequence similarity are placed further apart.
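The idea of grouping species by sequence similarity can be sketched in a few lines. The example assumes the sequences are already aligned to the same length; the species names and sequences are invented for illustration, and `percent_identity` and `most_similar_pair` are hypothetical helper names.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / len(seq_a)


def most_similar_pair(sequences):
    """Return the two species whose sequences share the highest identity."""
    return max(
        combinations(sorted(sequences), 2),
        key=lambda pair: percent_identity(sequences[pair[0]], sequences[pair[1]]),
    )


# Invented toy sequences, not real genes.
sequences = {
    "species_A": "ATGCCGTAAC",
    "species_B": "ATGCCGTTAC",
    "species_C": "ATGACGTTGC",
    "species_D": "TTGACCTTGA",
}

for a, b in combinations(sorted(sequences), 2):
    print(a, b, percent_identity(sequences[a], sequences[b]))
print(most_similar_pair(sequences))
```

In a tree built from these values, the most similar pair would be placed next to each other, and the least similar sequences would sit furthest apart.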

Bioinformatics is the set of tools biologists use to analyze large datasets through a combination of computer science, mathematical modeling, and statistics. One such tool is BLAST (Basic Local Alignment Search Tool), which can be used to quickly search the entire genome of any species available in the NCBI (National Center for Biotechnology Information) database². The NCBI database combines several different databases that hold different types of DNA sequence information. A BLAST search involves complex computer algorithms, but in essence BLAST aligns the nucleotide bases of a submitted DNA sequence (known as the query sequence) with the sequences in the database that most closely match it. The sequences that are found are listed in order of similarity to the query sequence, so the top results tend to come from species closely related to the species containing the query gene. This comparison may or may not depict the actual evolutionary relationship between species, because genes evolve at different rates. Additionally, genomes sometimes contain more than one instance of a similar sequence.
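To illustrate what "local alignment" means, here is a minimal sketch that scores a query against each database sequence with the classic Smith-Waterman dynamic program and ranks the hits. This is not how BLAST itself works internally (BLAST uses faster heuristics), and the scoring values, function names, and toy database are assumptions made for the example.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences (Smith-Waterman)."""
    prev = [0] * (len(subject) + 1)
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(subject) + 1)
        for j in range(1, len(subject) + 1):
            step = match if query[i - 1] == subject[j - 1] else mismatch
            # A score never drops below zero, so an alignment can restart anywhere.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_hits(query, database):
    """List database entry names from best to worst local alignment score."""
    scored = [(local_alignment_score(query, seq), name)
              for name, seq in database.items()]
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Invented toy database, not real NCBI records.
database = {"x": "GGACGTGG", "y": "GGACTTGG", "z": "CCCCCCCC"}
print(rank_hits("ACGT", database))
```

As with a real search, the ranking reflects similarity to the query only; it does not by itself establish how the source species are related.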

Comparing the DNA sequences of genes is valuable beyond the study of evolutionary relationships. Frequently, genes are first identified in model organisms, such as the fruit fly, Drosophila melanogaster, or the mouse³. An integral part of studying a gene is identifying and analyzing the function of its product. If a researcher is interested in studying that function in a different organism (humans, for example), BLAST or other bioinformatic tools can be used to find candidate genes based on their similarity to genes of known function from model organisms.

Human genes can also be used as the starting point to find homologs in model organisms. In fact, human disease research depends heavily on this. Once a human gene of interest is identified, mice can be genetically manipulated to have the homologous gene disrupted, or “knocked out,” creating a model of the human disease that can be studied in order to understand and treat it. Many of these mouse strains are currently available. For example, there is a mouse model for human cystic fibrosis (CF) called the Cftr knockout mouse, and another that models atherosclerosis, called the Apoe knockout³.


