ECE/CS 559 Fall 2022 Final
Full Name: ID Number:
Write your name and ID number on everything you return!
Q0 (5 pts): Email confirming instructor evaluations.
Q1 (15 pts): Given vectors u, v_+, v_−, scalar τ > 0, and some monotonically non-decreasing nonlinearity σ, consider the loss function:
L = − log σ(u^T v_+ / τ) − log σ(−u^T v_− / τ).  (1)
(a) (9 pts): Find the gradient of L with respect to u.
(b) (6 pts): Consider the same loss function with the understanding that v_+ and u are positive samples, while v_− is a negative sample to both v_+ and u. Would L be a good choice for a contrastive loss function? Elaborate.
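As a concrete numerical check of the loss in (1), the sketch below takes σ to be the logistic sigmoid; the vectors and τ are illustrative values, not part of the problem statement.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contrastive_loss(u, v_pos, v_neg, tau):
    """Eq. (1): -log sigma(u^T v_+ / tau) - log sigma(-u^T v_- / tau)."""
    return (-np.log(sigmoid(u @ v_pos / tau))
            - np.log(sigmoid(-(u @ v_neg) / tau)))

u = np.array([1.0, 0.0])
v_pos = np.array([1.0, 0.0])   # aligned with u: first term is small
v_neg = np.array([-1.0, 0.0])  # anti-aligned with u: second term is small
print(contrastive_loss(u, v_pos, v_neg, tau=1.0))  # = 2*log(1 + e^{-1}) ~ 0.627
```

The gradient asked for in part (a) can be sanity-checked against this function by finite differences.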
Q2 (18 pts): Consider an input distribution X such that P(X = 0) = ε and P(X = 1) = 1 − ε.
Choose some noise distribution Z on your own, and consider a GAN designed optimally according to the
usual loss
L = min_G max_D E_{X,Z}[ log D(X) + log(1 − D(G(Z))) ],  (2)
where D is the discriminator, and G is the generator of the GAN.
(a) (6 pts): Find the discriminator of the optimal GAN.
(b) (6 pts): Find the random variable G(Z) (i.e. the output of the generator) for the optimal GAN.
(c) (6 pts): Find the loss of the optimal GAN.
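For intuition, the expectation inside (2) has a simple closed form when both the data X and the generator output G(Z) are Bernoulli. The sketch below evaluates it for a candidate discriminator; the values eps, q, d0, d1 are hypothetical and chosen for illustration, not taken from the problem.

```python
import numpy as np

def gan_value(eps, q, d0, d1):
    """E[log D(X)] + E[log(1 - D(G(Z)))] when P(X=0) = eps,
    P(G(Z)=0) = q, and the discriminator outputs D(0) = d0, D(1) = d1."""
    return (eps * np.log(d0) + (1.0 - eps) * np.log(d1)
            + q * np.log(1.0 - d0) + (1.0 - q) * np.log(1.0 - d1))

# The constant discriminator D = 1/2 gives 2*log(1/2), regardless of eps and q:
print(gan_value(0.3, 0.3, 0.5, 0.5))  # ~ -1.386
```

Evaluating this function over d0, d1 for fixed eps and q is one way to sanity-check an answer to part (a).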
Q3 (12 pts): Given a set of points x_1, . . . , x_n ∈ R^d, we recall the k-means algorithm: First, we initialize a set of cluster centers µ_1, . . . , µ_K. Then, we alternate between the following two steps: The clusters are determined as C_k ← {x_i : ‖x_i − µ_k‖ ≤ ‖x_i − µ_j‖ for all j ≠ k}, k = 1, . . . , K, where ties are broken arbitrarily. The cluster centers are then updated as µ_k ← (1/|C_k|) Σ_{x ∈ C_k} x.
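The two alternating steps above can be sketched in code for the one-dimensional case (d = 1). This is only an illustration of the algorithm as stated; ties are broken toward the lower cluster index, and an empty cluster's center is left unchanged (an assumption the statement does not fix). The data at the bottom is illustrative, not the values from part (a).

```python
import numpy as np

def kmeans_1d(xs, mus, n_iter=100):
    """k-means in 1-D: assign each point to its nearest center,
    then recenter each cluster at the mean of its points."""
    xs = np.asarray(xs, dtype=float)
    mus = np.asarray(mus, dtype=float)
    assign = np.zeros(len(xs), dtype=int)
    for _ in range(n_iter):
        # Assignment step: C_k = {x_i nearest to mu_k}, ties -> lower k.
        assign = np.argmin(np.abs(xs[:, None] - mus[None, :]), axis=1)
        new_mus = mus.copy()
        for k in range(len(mus)):
            members = xs[assign == k]
            if members.size > 0:          # leave an empty cluster's center as-is
                new_mus[k] = members.mean()
        if np.allclose(new_mus, mus):     # centers stopped moving: converged
            break
        mus = new_mus
    return mus, assign

print(kmeans_1d([0.0, 1.0, 10.0, 11.0], [0.0, 10.0]))
```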
(a) (6 pts): Write down all steps of the algorithm for x_1 = 0, x_2 = 2, x_3 = 5, x_4 = 7 and the initializations µ_1 = 5.5, µ_2 = 8 until convergence, including the clusters and the cluster centers you obtain.
(b) (6 pts): Describe the general problem that k-means faces for the initialization µ_1 = 8, µ_2 = 10 instead (with the same x_i's), and propose one modification to k-means that resolves this problem.
Q4 (15 pts): Consider the RNN s_t = φ(u^T x_t + v^T s_{t−1}). Here u, v are weight vectors, x_t are inputs, t is the time variable, and φ is some non-linearity. Consider the initial state s_0 = 0, and many-to-many learning with loss function L = (s_2 − d_2)^2 + (s_1 − d_1)^2, where d_2 and d_1 are some constants. Find the gradient descent update equations for the weights of the network using backpropagation through time.
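For concreteness, the unrolled forward pass and the loss can be written out as below. Treating the state s_t as a scalar and picking a particular φ are illustrative assumptions; the problem leaves both general.

```python
import numpy as np

def rnn_forward(u, v, xs, phi=np.tanh):
    """States of s_t = phi(u^T x_t + v * s_{t-1}) with s_0 = 0,
    taking the state s_t to be scalar for illustration."""
    s, states = 0.0, []
    for x in xs:
        s = phi(np.dot(u, x) + v * s)
        states.append(s)
    return states

def loss(states, d1, d2):
    """L = (s_2 - d_2)^2 + (s_1 - d_1)^2 from the problem statement."""
    return (states[1] - d2) ** 2 + (states[0] - d1) ** 2

# With phi the identity, the recursion is easy to check by hand:
states = rnn_forward(np.array([1.0]), 0.5,
                     [np.array([1.0]), np.array([1.0])], phi=lambda z: z)
print(states)                   # s_1 = 1.0, s_2 = 1 + 0.5*1 = 1.5
print(loss(states, 0.0, 0.0))   # 1.5^2 + 1^2 = 3.25
```

The BPTT gradients you derive can be checked against this forward pass by finite differences in u and v.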
Q5 (15 pts): Consider the following CNN: Inputs are 10×10 images with 3 color channels. The first layer has 16 feature maps, each with 3×3 filters, stride = 1, and no zero padding. The second layer is 2×2 max pooling with a stride of 2. The last layer is a fully connected layer of 15 neurons. Find the number of free parameters and connections at each layer.
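The standard counting rules can be scripted as a sanity check. This sketch assumes the usual convention of one bias per feature map and one bias per fully connected neuron; the example numbers at the bottom are illustrative, not those of the question.

```python
def conv_layer(in_h, in_w, in_c, n_maps, k, stride=1, pad=0):
    """Conv layer: each of n_maps filters has k*k*in_c weights + 1 bias.
    Returns (free parameters, output shape)."""
    out_h = (in_h + 2 * pad - k) // stride + 1
    out_w = (in_w + 2 * pad - k) // stride + 1
    return n_maps * (k * k * in_c + 1), (out_h, out_w, n_maps)

def fc_layer(n_in, n_out):
    """Fully connected layer: n_in weights + 1 bias per output neuron."""
    return n_out * (n_in + 1)

# Illustrative example: 28x28 grayscale input, 8 feature maps of 5x5 filters.
params, shape = conv_layer(28, 28, 1, n_maps=8, k=5)
print(params, shape)  # 208 (24, 24, 8)
```

Max pooling contributes connections but no free parameters, which the question's "parameters and connections" phrasing is probing.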
Q6 (20 pts): Write your general thoughts or predictions on how AI will take form in the future, how it will affect our lives (e.g., in positive or negative directions), how transformative it will be, or any other non-technical AI-related issue that is of particular interest to you. I want a well-thought-out, detailed response that reflects the weight of this question.
