Problem 1 (30pts)
Derive weights for the sequences
ACTA = seq z
ACTT = seq y
CGTT = seq x
AGAT = seq w
using the Thompson, Higgins, and Gibson method.
Use the outline below (a-d) to solve this problem:
a) compute pairwise distances between sequences
b) apply the UPGMA method to join sequences and, subsequently, the clusters
c) build phylogenetic tree
d) derive sequence weights
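The outline (a-d) can be sketched in base R as follows. This is only an illustration under stated assumptions, not the expected solution: the pairwise distance is taken to be the raw number of mismatches (Hamming distance), UPGMA is run with `hclust(method = "average")`, and the weights are left unnormalized. The names `seqs`, `d`, `hc`, `node_h`, `members`, and `weights` are introduced here for illustration.

```r
# Sequences from the problem statement.
seqs <- c(z = "ACTA", y = "ACTT", x = "CGTT", w = "AGAT")
chars <- strsplit(seqs, "")
n <- length(seqs)

# a) Pairwise distances: number of mismatching positions (assumed measure).
d <- matrix(0, n, n, dimnames = list(names(seqs), names(seqs)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(chars[[i]] != chars[[j]])
  }
}

# b), c) UPGMA is average-linkage clustering. hclust reports the distance
# at which two clusters merge; the node height in the tree is half of that.
hc <- hclust(as.dist(d), method = "average")
node_h <- hc$height / 2

# d) Weights: walk the merges from the leaves up. Each branch length is
# shared equally among the leaves below it and added to their weights.
members <- vector("list", n - 1)
weights <- setNames(numeric(n), hc$labels)
for (k in seq_len(n - 1)) {
  for (child in hc$merge[k, ]) {
    if (child < 0) {            # negative entries are single sequences
      leaves <- -child
      child_h <- 0
    } else {                    # positive entries are earlier merges
      leaves <- members[[child]]
      child_h <- node_h[child]
    }
    weights[leaves] <- weights[leaves] + (node_h[k] - child_h) / length(leaves)
    members[[k]] <- c(members[[k]], leaves)
  }
}

print(d)
print(weights)
```

With mismatch counts, z and y join first (distance 1), then x and w (distance 2), and the two clusters join last at an average distance of 2.5. Dividing the counts by the sequence length rescales every branch and weight by the same factor, so the relative weights do not change.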
Problem 2 (10pts)
We assumed the additive property when we constructed the UPGMA tree in Problem 1.
What is the limitation of this assumption (if any)?
The additive property limits the input: the distances must be metric, and for UPGMA they must also be ultrametric. It further assumes that the distance between each pair of taxa equals the sum of the branch lengths on the path connecting them, so that the entries of the distance matrix match the branch lengths of the tree. Distances observed between real sequences often violate these conditions, in which case the tree can only approximate the data.
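Whether a distance matrix is ultrametric can be checked with the three-point condition: in every triple of taxa, the two largest of the three pairwise distances must be equal. The sketch below is a minimal base-R check, assuming the mismatch-count distances for the Problem 1 sequences; the function `is_ultrametric` and the tolerance `tol` are hypothetical names introduced here.

```r
# Mismatch counts for z, y, x, w from Problem 1 (assumed distance measure).
labs <- c("z", "y", "x", "w")
d <- matrix(c(0, 1, 3, 3,
              1, 0, 2, 2,
              3, 2, 0, 2,
              3, 2, 2, 0),
            nrow = 4, byrow = TRUE, dimnames = list(labs, labs))

# Three-point condition: for every triple, the two largest distances agree.
is_ultrametric <- function(m, tol = 1e-9) {
  for (idx in combn(nrow(m), 3, simplify = FALSE)) {
    s <- sort(c(m[idx[1], idx[2]], m[idx[1], idx[3]], m[idx[2], idx[3]]))
    if (abs(s[3] - s[2]) > tol) return(FALSE)
  }
  TRUE
}

# Distances implied by the UPGMA tree (cophenetic distances).
coph <- as.matrix(cophenetic(hclust(as.dist(d), method = "average")))

print(is_ultrametric(d))     # observed distances
print(is_ultrametric(coph))  # distances read back from the tree
```

The observed matrix fails the check (the triple z, y, x has distances 1, 3, 2), while the distances read back from the UPGMA tree pass it. The gap between the two matrices is the part of the data that the tree does not reproduce.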
Problem 3 (20pts)
The protein sequence of bacterial species “B3” was used in a BLAST search against the Swiss-Prot protein database. The query returned significant hits to four other bacterial proteins (B1, B2, B4, B5) and one protein in the human genome (H). No other mammalian species showed a protein similar to B3. Phylogenetic tree construction by several methods resulted in the tree shown below. Explain the presence of this gene in humans.
Problem 4 (10pts)
Describe the technical and theoretical challenges associated with building phylogenetic trees.
Problem 5 (10pts)
Compare and contrast the parsimony, maximum likelihood, UPGMA, and neighbor-joining methods.
Problem 6 (20pts)
Create a multiple sequence alignment and a phylogenetic tree in R using ape and clustalw by following the steps below:
1. Install clustalw (depending on your OS) on your computer using the http://www.clustal.org/clustal2/ link
2. Open R. (all of the following steps will be implemented in R)
3. Set a working directory
4. Install the package “ape” from your R session by typing:
install.packages("ape")
5. Load the “ape” package by typing:
library("ape")
6. Read the sequences from GenBank using the accession numbers of the sequences you downloaded for Homework 2; this step is mainly for practice, since you have already downloaded these sequences.
7. Save the result from step 6 as a <new.fas> file
8. Run clustalw by typing:
system(paste('"path_to_YOUR_clustalw/clustalw2.exe" new.fas'))
9. Read the alignment file (*.aln); it should be in your working directory
10. Create a phylogenetic tree using the neighbor-joining method
11. Plot the tree
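Steps 6-11 can be sketched as a single R function. This is a hedged outline rather than a reference solution: it assumes the Homework 2 sequences are nucleotide sequences, that clustalw writes its alignment next to the input with an `.aln` extension, and that the default `dist.dna` model is acceptable. The function name `build_nj_tree` and its arguments are hypothetical; the accession numbers and the clustalw path must be supplied by you.

```r
library(ape)

# accessions:    character vector of GenBank accession numbers (yours)
# clustalw_path: full path to your clustalw2 executable
build_nj_tree <- function(accessions, clustalw_path, fasta_file = "new.fas") {
  # Step 6: download the sequences from GenBank (needs a network connection).
  seqs <- read.GenBank(accessions)

  # Step 7: save them in FASTA format in the working directory.
  write.dna(seqs, file = fasta_file, format = "fasta")

  # Step 8: run clustalw on the FASTA file.
  system(paste(shQuote(clustalw_path), shQuote(fasta_file)))

  # Step 9: read the alignment (assumed to be written as <name>.aln).
  aln_file <- sub("\\.[^.]*$", ".aln", fasta_file)
  aln <- read.dna(aln_file, format = "clustal")

  # Step 10: neighbor-joining tree from pairwise distances.
  tree <- nj(dist.dna(aln))

  # Step 11: plot the tree and return it invisibly.
  plot(tree)
  invisible(tree)
}

# Usage (my_accessions is a placeholder for your own vector):
# tree <- build_nj_tree(my_accessions, "path_to_YOUR_clustalw/clustalw2.exe")
```

Wrapping the steps in a function keeps the file names in one place, so the same call can be repeated with a different set of accession numbers.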