1. Maximum Likelihood
Assume X is a discrete random variable that takes values in {0, 1}. Assume we know
a priori that

    P(X = k) = α exp(−kβ),  for all k ∈ {0, 1},

where α and β are two parameters to be determined.
(a) What constraints should we put on α and β to ensure that we have a valid
distribution?
(b) Assume we observe a sequence D = {x1, x2, . . . , xn} that is drawn independently
from the distribution of X, and that the observed numbers of 0s and 1s in D are n0
and n1, respectively. Please write down the log-likelihood as a function of β and
propose a method to estimate β (it is enough to frame an optimization problem
without solving it numerically).
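As a numerical sanity check (not required by the problem), the following sketch evaluates the log-likelihood on hypothetical counts n0 and n1, using the normalization constraint from part (a) to eliminate α; the grid search is just a stand-in for whichever optimizer you prefer.

```python
import numpy as np

# Hypothetical counts; the problem leaves n0 and n1 symbolic.
n0, n1 = 70, 30
n = n0 + n1

def log_likelihood(beta):
    # Part (a) forces alpha * (1 + exp(-beta)) = 1, so alpha is determined by beta.
    alpha = 1.0 / (1.0 + np.exp(-beta))
    # l(beta) = n0 * log P(X=0) + n1 * log P(X=1) = n * log(alpha) - n1 * beta
    return n * np.log(alpha) - n1 * beta

# Crude grid search over beta as a stand-in for a proper optimizer.
grid = np.linspace(-5.0, 5.0, 100001)
beta_hat = grid[np.argmax(log_likelihood(grid))]
print(beta_hat, np.log(n0 / n1))  # the maximizer should sit near log(n0 / n1)
```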
2. Bayesian Inference
Assume we use a medical device to detect a type of rare cancer. Denote by X ∈ {0, 1}
whether the cancer actually exists in a patient and by Y ∈ {0, 1} the output of the
medical device. Denote the false negative and false positive rates of the device by α
and β, respectively, that is,

    α = P(Y = 0 | X = 1),
    β = P(Y = 1 | X = 0),

where P denotes probability. In addition, denote by π the probability that this cancer
occurs in the population; this defines a prior distribution on X, that is, π = P(X = 1).
(a) If the device claims cancer for a patient (that is, Y = 1), the posterior probability
that she actually has cancer is P(X = 1 | Y = 1). Please calculate P(X = 1 | Y = 1)
in terms of α, β, and π.
(b) Assume we have α = π. What requirement on β is needed in order to achieve
P(X = 1 | Y = 1) ≥ 90%?
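For intuition about part (a), here is a small sketch that plugs hypothetical values of α, β, and π into Bayes' rule; the specific numbers are illustrative and not taken from the problem.

```python
# Hypothetical rates; the problem keeps alpha, beta, and pi symbolic.
alpha = 0.01   # false negative rate, P(Y = 0 | X = 1)
beta = 0.05    # false positive rate, P(Y = 1 | X = 0)
pi = 0.001     # prevalence, P(X = 1)

# Bayes' rule: P(X=1 | Y=1) = P(Y=1 | X=1) P(X=1) / P(Y=1).
numerator = (1 - alpha) * pi
denominator = (1 - alpha) * pi + beta * (1 - pi)
posterior = numerator / denominator
print(posterior)  # ~0.019: for a rare cancer, even a 5% false positive rate dominates
```

The small posterior for these illustrative numbers is the usual base-rate effect, which is exactly what part (b) asks you to control by constraining β.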
3. Multivariate Normal Distribution
Assume X = [X1, X2]^T is a two-dimensional standard normal random vector,

    X = [X1, X2]^T ~ N( [0, 0]^T, [[1, 0], [0, 1]] ),

that is, X has mean zero and the 2×2 identity matrix as its covariance.
Let Y = [Y1, Y2]^T be obtained by a linear transformation of X:

    Y1 = 2X1 + ρX2,
    Y2 = X1 + ρX2,

where ρ is a real constant.
(a) What is the distribution of Y? Determine its mean and covariance matrix.
(b) Does there exist a value of ρ such that Y1 and Y2 are independent of each other?
Please explain your reasoning.
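As a quick numerical companion to both parts, the sketch below picks a hypothetical value of ρ and uses the fact that, for X ~ N(0, I) and Y = AX, the covariance of Y is A A^T.

```python
import numpy as np

rho = 1.5  # hypothetical value; the problem keeps rho symbolic
A = np.array([[2.0, rho],
              [1.0, rho]])

# For X ~ N(0, I), Y = A X is Gaussian with mean 0 and covariance A A^T.
cov_Y = A @ A.T
print(cov_Y)

# The off-diagonal entry equals 2 + rho**2, which is strictly positive for every
# real rho, so this check never yields uncorrelated (hence never independent)
# components, consistent with what part (b) asks you to argue.
print(cov_Y[0, 1], 2 + rho**2)
```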
4. Which of the following statements are true? Please explain your answer for each one.
(a) When training a neural network with 100 neurons using gradient descent or
stochastic gradient descent, if we initialize the weights of all the neurons to the
same value, they will stay the same across the iterations (so effectively, we train
a neural network with just a single neuron). (A numerical sketch of this setting
follows the list.)
(b) Learning neural networks is a non-convex optimization problem, and gradient
descent algorithms are not guaranteed to find the global optimum.
(c) Kernel regression is guaranteed to outperform linear regression in practice because
it allows us to fit more flexible nonlinear curves.
(d) Estimating the coefficients of kernel regression yields a non-convex optimization
problem, because it fits data with a non-linear curve.
(e) Expectation maximization (EM) is guaranteed to find the global optimum of the
log-likelihood of Gaussian mixture models, but k-means can only find local optima.
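The following minimal numpy sketch illustrates the setting of statement (a): a toy one-hidden-layer network (hypothetical sizes and data) whose hidden units all start from identical weights, trained by full-batch gradient descent. It only demonstrates the symmetry argument, not any of the other statements.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # toy inputs (hypothetical data)
y = rng.normal(size=20)        # toy regression targets

H = 4                          # number of hidden units
W = np.full((H, 3), 0.1)       # every hidden unit starts with identical weights
b = np.zeros(H)
v = np.full(H, 0.1)            # identical output weights as well

lr = 0.01
for _ in range(200):           # plain full-batch gradient descent on squared error
    h = np.tanh(X @ W.T + b)                # hidden activations, shape (20, H)
    err = h @ v - y                         # prediction error, shape (20,)
    grad_v = h.T @ err / len(y)
    back = (err[:, None] * v) * (1 - h**2)  # gradient at the pre-activations
    grad_b = back.mean(axis=0)
    grad_W = back.T @ X / len(y)
    v -= lr * grad_v
    b -= lr * grad_b
    W -= lr * grad_W

# Every hidden unit received the same gradient at every step, so the units
# remain identical after training (the symmetry is never broken).
print(np.allclose(W, W[0]), np.allclose(b, b[0]), np.allclose(v, v[0]))
```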
5. K-Means
Let us practice k-means in this problem. Consider Figure (a) below where we have
six data points (blue circles), and we have chosen two initial centroid locations (red
squares). Please run k-means on this data set and plot the locations of the centroids
at each iteration in Figures (b)-(d) (if the algorithm converges within the first or
second iteration, there is no need to fill in the remaining figures).
[Figure: four panels on a 0-6 by 0-4 grid showing the six data points and the centroids:
(a) Initialization, (b) Iteration 1, (c) Iteration 2, (d) Iteration 3.]
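If you want to check your hand simulation, here is a minimal sketch of the k-means (Lloyd's) iteration; the six points and two initial centroids below are hypothetical stand-ins, since the exact coordinates live in the figure.

```python
import numpy as np

# Hypothetical coordinates standing in for the six blue circles and two red squares.
points = np.array([[1, 1], [1, 2], [2, 1], [4, 3], [5, 3], [5, 4]], dtype=float)
centroids = np.array([[1, 2], [5, 3]], dtype=float)

for iteration in range(10):
    # Assignment step: each point joins the cluster of its nearest centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each centroid moves to the mean of its assigned points.
    new_centroids = np.array([points[labels == k].mean(axis=0) for k in range(2)])
    if np.allclose(new_centroids, centroids):
        break  # converged: the centroids (and hence the assignments) stop moving
    centroids = new_centroids
    print(f"iteration {iteration + 1}: centroids =\n{centroids}")

print("final labels:", labels)
```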
The result of k-means is not unique: different initializations may yield different final
results. For example, Figure (a) below shows another possible clustering of the same
dataset. In Figure (b), we have initialized one of the centroids. Please initialize the
other centroid properly, so that k-means converges to the clustering result in Figure
(a). Show your initialization and the locations of the centroids at each iteration of
k-means in Figures (b)-(f). Again, if your algorithm converges in fewer than four
iterations, you do not need to fill in the remaining figures.
[Figure: six panels on a 0-6 by 0-4 grid showing the six data points and the centroids:
(a) Desirable Result, (b) Initialization, (c) Iteration 1, (d) Iteration 2,
(e) Iteration 3, (f) Iteration 4.]