COMS 4701
FINAL PRACTICE EXAM
SPRING 2024
1. Consider a second-order hidden Markov model, in which Xt generally depends on both Xt-1 and Xt-2. The initial distribution is Pr(X₀, X₁), transition probabilities are Pr(Xt | Xt-1, Xt-2) for t ≥ 2, and observation probabilities are Pr(Et | Xt) for t ≥ 1.
(a) Circle either true or false for each of the conditional independence statements below, indicating whether it is guaranteed to hold in the second-order HMM.
(b) Give a minimal expression for Pr(X₁, …, X₅, e₁, …, e₅) using the HMM parameters. (Multiplication of CPTs will be interpreted as multiplication of factors.)
(c) Suppose we have αt = Pr(Xt-1, Xt | e1:t) and we want to compute αt+1 = Pr(Xt, Xt+1 | e1:t+1). Give a minimal expression for αt+1 using αt and the HMM parameters, normalizing if necessary.
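Note: as a point of reference (not itself part of the question), recall that in a first-order HMM the forward update takes the form αt+1(Xt+1) ∝ Pr(et+1 | Xt+1) Σxt Pr(Xt+1 | xt) αt(xt); the second-order recursion asked for in part (c) has an analogous structure over pairs of states.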
2. Flying during the holidays can be stressful, since so many things can go wrong. Bad weather (W) or mechanical airplane problems (M) can delay your flight (D); mechanical problems can also affect the chances of your baggage (B) being lost. Suppose you have a probabilistic model of the relationships between these Boolean events as follows:
(a) Draw a representative Bayesian network of this model. Be sure to label your nodes and indicate directionality on the edges.
(b) Are weather (W) and whether your baggage (B) makes it back safely with you independent of each other?
(c) Suppose you are sitting at the airport and you tell your family that your flight was indeed delayed. Given this information, are weather and baggage arriving safely conditionally independent of each other?
(d) Write an analytical expression for Pr(W, B | D = +d), the joint distribution of weather and baggage given that your flight is delayed. Your expression should only include sums, products, and/or quotients of terms from the model described above.
(e) Numerically compute Pr(+w, +b, +d), the joint probability that bad weather occurred, your baggage got lost, and your flight was delayed.
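Note: assuming the network structure suggested by the description (W → D, M → D, M → B), such joint probabilities factor by summing out the unobserved variable, e.g. Pr(+w, +b, +d) = Σm Pr(+w) Pr(m) Pr(+d | +w, m) Pr(+b | m). This particular structure is the one implied by part (a), not given explicitly here.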
3. A recycling robot is trying to classify the objects that it sees as bottles (B = +b) or not bottles (B = -b). The robot considers three binary features: whether the object is rounded (R = +r) or not (R = -r), whether it is made of glass (G = +g) or plastic (G = -g), and whether it is small (S = +s) or large (S = -s). The robot is given a labeled data set as follows:
(a) Suppose we learn a naive Bayes classifier from this data. Find the numerical parameters that would be learned using α = 1 smoothing. Please write your answers as reduced fractions. (A reminder of the smoothing formula appears in the note after part (g).)
(b) Using the learned model, how does the robot classify the feature set (-r, -g, -s)?
(c) Suppose our data set did not include the class labels. If we were to learn a naive Bayes model using expectation-maximization, are we guaranteed to recover the maximum-likelihood parameters learned from the labeled data set? Why or why not?
(d) Convert the features to numerical values by treating + as +1 and - as -1. Consider a linear classifier that predicts B = -b if fw(x) ≤ 0 and B = +b otherwise. What is the classification accuracy on the data set given a model with weight vector w = (1, 1, 0, 1)?
(e) Again starting from w, compute the update made to w using the perceptron learning rule after the first mistake made on the data set. (A sketch of this procedure appears after part (g).)
(f) A sigmoid activation function would still yield the same predictions and same classification accuracy as the hard threshold function described above. Give two different advantages that a sigmoid function has over the hard threshold.
(g) Suppose we pass our data set through the neural network below, where x is R, y is G, and z is S. Find the individual outputs of each forward pass.
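Note on part (a): with α = 1 smoothing and binary features, each conditional is estimated as Pr(F = f | B = b) = (count(F = f, B = b) + 1) / (count(B = b) + 2), since each feature takes two values; the class prior is typically smoothed analogously as Pr(B = b) = (count(B = b) + 1) / (N + 2) over N examples, though whether the prior is smoothed follows the course's convention.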
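Note on parts (d) and (e): the following is a minimal sketch of the hard-threshold prediction and perceptron update, assuming the four-component w includes a leading bias weight applied to a constant feature 1. This augmentation, and the example data below, are illustrative assumptions; the exam's actual data table is not reproduced here.

    import numpy as np

    w = np.array([1.0, 1.0, 0.0, 1.0])  # assumed layout: (bias, w_R, w_G, w_S)

    # Hypothetical examples as ((R, G, S), label), with +/- encoded as +1/-1.
    data = [((+1, +1, +1), +1), ((-1, +1, -1), +1)]

    for (r, g, s), y in data:
        x = np.array([1.0, r, g, s])       # augmented feature vector
        pred = +1 if w @ x > 0 else -1     # hard threshold: -b when fw(x) <= 0
        if pred != y:                      # perceptron rule: update only on a mistake
            w = w + y * x
            print("updated w:", w)
            break                          # stop after the first mistake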