Final Exam Solutions

Q1- Consider a simple neural network with k hidden layers. Further assume that the input layer and all hidden layers contain n neurons each, and that the output layer has a single neuron. Prove that if none of the neurons contains a nonlinear activation function (e.g., sigmoid), then this neural network can actually be represented without any hidden layers at all. You may assume that n = k = 3 for your proof.

Ans- The output of the neural network is given by


y = W_4 W_3 W_2 W_1 X

where X is the input vector and W_i is the weight matrix of edges from layer i to layer i + 1. Note that W_4 is just a row vector, since the output layer has a single neuron. Let W be the matrix product W_4 W_3 W_2 W_1. Clearly, a neural network without any hidden layers, whose weight vector equals W, is equivalent to our network.
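This collapse is easy to check numerically. Below is a minimal NumPy sketch (not part of the original solution) using the assumed dimensions n = k = 3; the random weights and the seed are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3  # neurons per input/hidden layer (n = k = 3 as in the proof)

# Random weight matrices: W1, W2, W3 are n x n, W4 is a 1 x n row vector.
W1, W2, W3 = (rng.standard_normal((n, n)) for _ in range(3))
W4 = rng.standard_normal((1, n))
X = rng.standard_normal((n, 1))

# Layer-by-layer forward pass with no nonlinearity.
y_deep = W4 @ (W3 @ (W2 @ (W1 @ X)))

# Single collapsed weight vector W = W4 W3 W2 W1.
W = W4 @ W3 @ W2 @ W1
y_shallow = W @ X

print(np.allclose(y_deep, y_shallow))  # True
```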

Q2- Prove Jensen’s inequality, i.e.,

E[ζ(X)] ≥ ζ(E[X])

for a convex function ζ and a random variable X.

Ans-

Consider the line that is tangent to the graph of ζ at the point E[X], and let l(X) be the equation of this line. Since ζ is convex, every point on the graph of ζ lies on or above l, i.e. ζ(x) ≥ l(x) for all x. Then,

E[ζ(X)] ≥ E[l(X)] = l(E[X]) = ζ(E[X]),

where the inequality (1) holds because ζ lies on or above l, the first equality (2) holds because of linearity of expectation (l is an affine function), and the last equality holds because l is tangent to ζ at E[X], so l(E[X]) = ζ(E[X]).
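As a quick numerical illustration (a Monte Carlo sketch, not part of the proof), one can take an arbitrary random variable and an arbitrary convex function, say ζ(x) = x², and check the inequality; the distribution, seed, and sample size below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.exponential(scale=2.0, size=1_000_000)  # any random variable will do

def zeta(x):
    return x ** 2  # a convex function

lhs = zeta(X).mean()   # approximates E[zeta(X)]
rhs = zeta(X.mean())   # approximates zeta(E[X])
print(lhs >= rhs)      # True: E[zeta(X)] >= zeta(E[X])
```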

Q3- Prove that for an invertible symmetric matrix A with real eigenvalues \lambda_1, \lambda_2, . . . , \lambda_n, the eigenvalues of A^{-1} are given by the set:

{1/\lambda_1, 1/\lambda_2, . . . , 1/\lambda_n}

Assume that these eigenvalues correspond to unit-norm eigenvectors.

Ans- We know that for any eigenvector x_i of A,

Ax_i = \lambda_i x_i

Multiplying by the inverse, whose existence is guaranteed,

A^{-1} Ax_i = \lambda_i A^{-1} x_i

Since A^{-1} A x_i = x_i, this gives x_i = \lambda_i A^{-1} x_i. Because A is invertible, \lambda_i \neq 0, so

A^{-1} x_i = \frac{1}{\lambda_i} x_i

Therefore, the reciprocals of all eigenvalues of A are eigenvalues of A^{-1} and, by the same argument applied to A^{-1}, vice versa. It is easy to see that there cannot be any more eigenvalues.
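The claim can be sanity-checked with NumPy (illustrative only, not part of the solution): build a symmetric invertible matrix with a known nonzero spectrum and compare the eigenvalues of its inverse with the reciprocals. The chosen spectrum and dimension are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Build a symmetric invertible A with a known spectrum: A = Q diag(lams) Q^T.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
lams = np.array([2.0, -3.0, 0.5, 7.0])            # nonzero eigenvalues
A = Q @ np.diag(lams) @ Q.T

eig_A_inv = np.linalg.eigvalsh(np.linalg.inv(A))

# Eigenvalues of A^{-1} match the reciprocals of A's eigenvalues (up to ordering).
print(np.allclose(np.sort(eig_A_inv), np.sort(1.0 / lams)))  # True
```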

Q4- Derive the moment generating function of a Poisson random variable.

Ans- Let X be Poisson with parameter \lambda, then

MGF_X(t) = E(e^{tX}) = \sum_{x=0}^{\infty} e^{tx} e^{-\lambda} \frac{\lambda^{x}}{x!}

which is

E(e^{tX}) = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(e^{t} \lambda)^{x}}{x!}

Recognizing the sum as the Taylor series of e^{e^{t} \lambda}, we have

MGF_X(t) = E(e^{tX}) = e^{-\lambda} e^{e^{t} \lambda} = e^{\lambda (e^{t} - 1)}
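The closed form can be checked against an empirical estimate of E(e^{tX}); the sketch below (with arbitrary values of \lambda and t) is illustrative only, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, t = 2.5, 0.7  # arbitrary parameter values

# Empirical estimate of E[e^{tX}] for X ~ Poisson(lam).
X = rng.poisson(lam, size=1_000_000)
empirical = np.exp(t * X).mean()

# Closed form derived above: exp(lam * (e^t - 1)).
closed_form = np.exp(lam * (np.exp(t) - 1))

print(empirical, closed_form)  # the two agree to a few decimal places
```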

Q5-

Let A, B, and C be three cabinets containing two coins each: cabinet A contains 2 gold coins, B contains 2 silver coins, and C contains a silver coin and a gold coin. You open one of the cabinets uniformly at random and pick one of its coins, also uniformly at random. If you picked a gold coin, what is the probability that the other coin in that cabinet is also gold?

Ans- Let G_i be the event that the i^{th} coin taken from the chosen cabinet is gold. We need to find P(G_2 | G_1), and by the definition of conditional probability we have

P(G_2 | G_1) = \frac{P(G_1 \cap G_2)}{P(G_1)}

We know that (G_1 \cap G_2) is the event that we pick cabinet A, so P(G_1 \cap G_2) = 1/3, and P(G_1) is 1/2, since three of the six equally likely coins are gold. Therefore,

P(G_2 | G_1) = \frac{1/3}{1/2} = \frac{2}{3}
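A short Monte Carlo simulation (illustrative, not part of the solution) gives the same answer: conditioned on drawing a gold coin, the other coin in the same cabinet is gold about two thirds of the time.

```python
import random

random.seed(4)
cabinets = [("gold", "gold"), ("silver", "silver"), ("silver", "gold")]  # A, B, C

picked_gold = other_also_gold = 0
for _ in range(1_000_000):
    cabinet = random.choice(cabinets)         # open a cabinet uniformly at random
    first, other = random.sample(cabinet, 2)  # pick one coin; the other stays behind
    if first == "gold":
        picked_gold += 1
        other_also_gold += (other == "gold")

print(other_also_gold / picked_gold)  # approximately 2/3, matching P(G_2 | G_1)
```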
