Laboratory Assignment N. 2.1

Solve the following assignments; completing them is required to access the oral examination. Send all the assignments together (once you have completed all the labs, not only this one) as a compressed folder including one subfolder for each laboratory (e.g. the subfolder should be named lab1 for the first laboratory, then lab2, etc.).

The subfolder for this lab should include all the Matlab scripts requested in the assignments below. You can organize the code as you wish, implementing all the helper functions that you need provided that these are included in the subfolder and are appropriately called in the scripts. To check that you have successfully completed the assignments I will only run the requested scripts (no debugging, no function chasing, the scripts should work like a charm when I run them).

Bonus track assignments are meant for those who finish early or are interested in deepening their knowledge of the models; they are not formally required for completing the Lab Assignment.

1) Simple Hebbian Learning

a) Correlation Rule

Write a Matlab script hebbA.m implementing Hebb's correlation rule for a single linear neuron receiving input from the data matrix (samples are on the rows, inputs on the columns) in the dataHebb.mat file compressed in this archive. You can use the steady-state equation $v = w \cdot u$. Implement a discrete-time version of the Hebb correlation rule by:

Starting from a weight vector w randomly initialized in $[-1,1]$
For each data sample in the data matrix update the synaptic weights using $w(t+1) = w(t) + \epsilon \frac{dw}{dt}$, where $\epsilon$ is a small positive constant (e.g. $\epsilon = 0.01$) and $\frac{dw}{dt}$ is computed by the Hebb correlation rule (assuming $\tau_w = 1$). I strongly suggest that you keep in mind the pseudo-code in Lecture 8.

To implement this process, at each time step, feed the neuron with an input $u$ from the data matrix. Once you have reached the last data point in the matrix data, shuffle (i.e. randomly reorder) the samples in data (e.g. consider using the function randperm()) and start again from the first element of the reordered matrix. Keep iterating this process until the change in $w$ between two consecutive sweeps through the whole data is negligible (i.e. the norm of the difference between the new and old vectors is smaller than an arbitrarily small positive threshold).
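The loop structure described above can be sketched as follows. This is only a minimal sketch under stated assumptions: it uses synthetic data in place of dataHebb.mat, it assumes the online update $\frac{dw}{dt} = v u$ (check this against the rule given in the lectures), and the names X, tol, maxEpochs and wHistory are hypothetical. Under this particular update the weight norm never decreases, so the threshold test may never trigger; the cap on the number of sweeps is a safeguard.

```matlab
% Synthetic stand-in for the data matrix (samples on rows, inputs on columns).
X = randn(200, 2) * [1.0 0.6; 0.6 0.5];
[nSamples, nInputs] = size(X);

epsilon   = 0.01;   % learning rate suggested in the assignment text
tol       = 1e-4;   % hypothetical convergence threshold
maxEpochs = 20;     % hypothetical cap on the number of sweeps

w = 2 * rand(nInputs, 1) - 1;      % uniform random initialization in [-1, 1]
wHistory = w';                     % one row per time step

for epoch = 1:maxEpochs
    wOld = w;
    order = randperm(nSamples);    % new random order at every sweep
    for k = order
        u = X(k, :)';              % current input as a column vector
        v = w' * u;                % output of the linear neuron
        w = w + epsilon * v * u;   % assumed Hebbian update with tau_w = 1
        wHistory(end + 1, :) = w'; % keep track of w(t)
    end
    if norm(w - wOld) < tol
        break;                     % change over a full sweep is negligible
    end
end
```

Storing every intermediate weight vector in wHistory costs memory but makes the later time-evolution plots straightforward.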

After training has converged, plot a figure displaying (on the same graph) the training data points (points in the two-dimensional space), the final weight vector $w$ resulting from the learning process and the first principal component of the zero-mean input correlation matrix (e.g. subtract the mean of the population from the data matrix, compute the correlation matrix, apply the eig() function and find the eigenvector associated with the maximum eigenvalue; or use the function princomp()).
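A possible way to obtain the first principal component with eig() is sketched here on synthetic data; the variable names are hypothetical. The sign of an eigenvector is arbitrary, so the plotted direction may point opposite to the learned weight vector. In recent Matlab releases princomp() may no longer be available, in which case pca() is the newer alternative.

```matlab
% Synthetic stand-in for the data matrix, stretched along the first axis.
X = randn(300, 2) * [2 0; 0 0.5];

% Subtract the population mean from every sample.
Xc = X - repmat(mean(X, 1), size(X, 1), 1);

% Correlation matrix of the zero-mean inputs.
Q = (Xc' * Xc) / size(Xc, 1);

% Eigenvector associated with the largest eigenvalue.
[V, D] = eig(Q);
[lambdaMax, idx] = max(diag(D));
pc1 = V(:, idx);

% Data points and principal direction on the same graph.
figure;
plot(Xc(:, 1), Xc(:, 2), '.');
hold on;
plot([0 pc1(1)], [0 pc1(2)], 'r-', 'LineWidth', 2);
axis equal;
hold off;
```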

Generate two figures plotting the evolution in time of the two components of the weight vector $w$ (for this you will need to keep track of the evolution of $w(t)$ during training). Each plot will have time on the $x$ axis and the weight value on the $y$ axis (provide a separate plot for each component of the weight vector). Also provide another plot of the evolution in time of the norm of the weight vector during learning.
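Assuming the weight vector has been recorded row by row in a matrix (here called wHistory, a hypothetical name, filled with a placeholder trajectory rather than real training results), the three plots could be produced along these lines:

```matlab
% Placeholder trajectory standing in for the weights recorded during training.
wHistory = cumsum(0.01 * randn(500, 2));
t = (0:size(wHistory, 1) - 1)';

% Norm of the weight vector at every time step (row-wise Euclidean norm).
wNorm = sqrt(sum(wHistory .^ 2, 2));

figure; plot(t, wHistory(:, 1)); xlabel('time'); ylabel('w_1');
figure; plot(t, wHistory(:, 2)); xlabel('time'); ylabel('w_2');
figure; plot(t, wNorm);          xlabel('time'); ylabel('||w||');
```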

b) Oja Rule

Write a Matlab script hebbB.m that first generates a new version of the dataset from the previous exercise by subtracting the mean of the data matrix (i.e. your population becomes zero-centered). Then repeat the script implemented in the previous exercise, this time using the Oja rule in place of the correlation rule to perform learning (you can assume $\alpha=1$, but I suggest that you play a little with this value to see how the behaviour of the algorithm changes).
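As a rough illustration, the sketch below applies an Oja-style update to zero-centered synthetic data. It assumes the rule is written as $\frac{dw}{dt} = v u - \alpha v^2 w$ (verify against the lecture notes); with this form the squared norm of $w$ is expected to settle near $1/\alpha$. The names Xc, tol and maxEpochs are hypothetical.

```matlab
% Synthetic stand-in for the data, then zero-centering.
X  = randn(500, 2) * [1.5 0; 0 0.4];
Xc = X - repmat(mean(X, 1), size(X, 1), 1);

epsilon   = 0.01;   % learning rate
alpha     = 1;      % Oja constant suggested in the assignment text
tol       = 1e-3;   % hypothetical convergence threshold
maxEpochs = 200;    % hypothetical cap on the number of sweeps

w = 2 * rand(2, 1) - 1;            % uniform random initialization in [-1, 1]

for epoch = 1:maxEpochs
    wOld = w;
    for k = randperm(size(Xc, 1))
        u = Xc(k, :)';
        v = w' * u;
        % Hebbian term minus the normalizing decay term (assumed Oja form).
        w = w + epsilon * (v * u - alpha * v^2 * w);
    end
    if norm(w - wOld) < tol
        break;
    end
end

fprintf('final norm of w: %.3f\n', norm(w));
```

Because the updates are stochastic, the weight vector keeps fluctuating slightly around its limit, so the threshold and the sweep cap interact; trying a few values of alpha and tol shows how.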

Generate the same plots as in the previous exercise.

Bonus Track

Repeat the experiments above, this time using the BCM rule.

2) Hopfield Networks

a) Synthetic Data

Write a Matlab script hopfieldA.m implementing an asynchronous binary Hopfield network model that stores the following 3 synthetic memories: \[p1 = [{-1} \ {-1} \ {+1} \ {-1} \ {+1} \ {-1} \ {-1} \ {+1}];\] \[p2 = [{-1} \ {-1} \ {-1} \ {-1} \ {-1} \ {+1} \ {-1} \ {-1}];\] \[p3 = [{-1} \ {+1} \ {+1} \ {-1} \ {-1} \ {+1} \ {-1} \ {+1}];\] Train the model using the (covariance/correlation) Hebbian rule assuming no bias weight. Check whether the network has actually stored the 3 patterns as fixed points (how do you do this?): in doing so, compute a measure of match between the retrieved patterns and the input stimuli (it is up to you to choose a suitable one) and print it to the Matlab console (or plot it if you prefer).
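One possible way to set this up is sketched below, assuming the weights are the sum of outer products of the patterns scaled by the number of units, with a zero diagonal. A state is a fixed point of the asynchronous dynamics when no single unit would change if updated, which can be tested for all units at once. The normalized overlap used as the match measure and the tie-breaking convention for a zero input are choices made for this sketch, not requirements.

```matlab
% The three memories from the assignment text, one per row.
P = [ -1 -1 +1 -1 +1 -1 -1 +1;
      -1 -1 -1 -1 -1 +1 -1 -1;
      -1 +1 +1 -1 -1 +1 -1 +1 ];
[nPatterns, N] = size(P);

% Hebbian storage (assumed normalization 1/N), no self-connections, no bias.
W = (P' * P) / N;
W(1:N+1:end) = 0;

for mu = 1:nPatterns
    x = P(mu, :)';
    h = W * x;                        % input to every unit in state x
    xNext = sign(h);
    xNext(h == 0) = x(h == 0);        % keep the current state on ties
    isFixed = isequal(xNext, x);      % true if no unit would flip
    overlap = (xNext' * x) / N;       % 1 means a perfect match
    fprintf('pattern %d: fixed point = %d, overlap = %.2f\n', ...
            mu, isFixed, overlap);
end
```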

Now consider the following distorted inputs: $$p1d = [{+1} \ {-1} \ {+1} \ {-1} \ {+1} \ {-1} \ {-1} \ {+1}];$$ $$p2d = [{+1} \ {+1} \ {-1} \ {-1} \ {-1} \ {+1} \ {-1} \ {-1}];$$ $$p3d = [{+1} \ {+1} \ {+1} \ {-1} \ {+1} \ {+1} \ {-1} \ {+1}];$$ Feed them to the network and apply the asynchronous activation update until convergence.

Did all the patterns converge to the appropriate stored memory? Assess the discrepancy using the measure defined above and print it for each pattern on the Matlab console (or plot it).

Store the activations of the network neurons (as a function of time) in an array. Once the network has converged, plot such activations (as a function of time) on the same figure (cf. hold on command in Matlab).
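The asynchronous update with activation tracking might look like the sketch below, shown here for the first distorted input. The random update order, the stopping test (a full sweep without any flip) and the names history and maxSweeps are assumptions of this sketch.

```matlab
% Memories and Hebbian weights (assumed normalization 1/N, zero diagonal).
P = [ -1 -1 +1 -1 +1 -1 -1 +1;
      -1 -1 -1 -1 -1 +1 -1 -1;
      -1 +1 +1 -1 -1 +1 -1 +1 ];
N = size(P, 2);
W = (P' * P) / N;
W(1:N+1:end) = 0;

x = [+1 -1 +1 -1 +1 -1 -1 +1]';     % distorted input p1d
history = x';                       % one row per single-unit update
maxSweeps = 300;                    % hypothetical safety cap

for sweep = 1:maxSweeps
    changed = false;
    for i = randperm(N)             % visit the units in random order
        h = W(i, :) * x;
        if h > 0
            s = 1;
        elseif h < 0
            s = -1;
        else
            s = x(i);               % keep the current state on ties
        end
        if s ~= x(i)
            x(i) = s;
            changed = true;
        end
        history(end + 1, :) = x';
    end
    if ~changed
        break;                      % a full sweep without flips: converged
    end
end

% Activation of every unit as a function of time, on the same figure.
figure;
plot(0:size(history, 1) - 1, history);
xlabel('update step'); ylabel('activation');
```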

b) Image Dataset

Download and decompress this archive file, containing a Matlab workspace digits.mat and the helper function distort_image().

Load the digits.mat file: it contains a cell array named dataset with 10 cells, each being a 32×32 binary image representing a single digit (e.g. dataset{1} contains an image of the digit zero, dataset{2} contains an image of the digit 1, etc.).

A couple of hints to play with an image matrix img:

You can visualize an image with command imagesc(img);
You can transform a 32×32 image matrix into a 1024-element vector by vec = img(:);
You can reshape a 1024 vector to a 32×32 matrix by img = reshape(vec,32,32);
The helper function img1 = distort_image(img, prop) distorts the image matrix supplied as first parameter by flipping a number of pixels proportional to the number provided as second argument (should be a number in $[0,1]$). E.g. img1 = distort_image(img, 0.05) flips $5\%$ of the pixels of the original matrix.
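The hints above can be tried out without the archive using the sketch below. It assumes the images are stored with pixel values 0 and 1 (check the actual contents of dataset); the flipping code is only a hypothetical stand-in written for illustration and is not the distort_image() helper shipped with the archive, which may behave differently.

```matlab
% Placeholder binary image standing in for one cell of the dataset.
img = double(rand(32, 32) > 0.5);

vec  = img(:);                      % 32x32 matrix -> 1024x1 column vector
back = reshape(vec, 32, 32);        % and back to a 32x32 matrix

% Hypothetical stand-in for distort_image(img, prop): flip a fraction of pixels.
prop  = 0.05;
nFlip = round(prop * numel(img));
idx   = randperm(numel(img));
idx   = idx(1:nFlip);
noisy = img;
noisy(idx) = 1 - noisy(idx);        % 0 becomes 1 and 1 becomes 0

% A binary Hopfield network with states in {-1,+1} needs a change of coding.
s = 2 * noisy(:) - 1;

figure; imagesc(noisy); colormap(gray); axis image;
```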

Testing reconstruction

Write down a Matlab script hopfieldB.m that learns the first three digit patterns in the dataset (i.e. dataset{1}, dataset{2}, dataset{3}). Use again an asynchronous binary Hopfield model with no bias.

Generate distorted versions of the 3 patterns using the helper function with three different proportions of distorted pixels, i.e. 0.05, 0.1, 0.3.

Feed the 9 images that you have generated to the trained network and for each distorted image:

Plot the energy of the network as a function of time.
Plot the overlap of the distorted image with the 3 memories as a function of time (all 3 overlap measures should be on the same figure, i.e. use hold on).
Plot the reconstructed image and compute a measure of discrepancy between the reconstructed image and the optimal memory (add this result as title caption to the figure, e.g. with command title).
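The quantities requested above can be tracked during the asynchronous updates as in this sketch. It uses small random patterns instead of the digit images, the energy $E = -\frac{1}{2} x^T W x$ for a network without bias, and the normalized overlap $\frac{1}{N} p^T x$ with each memory; the sizes and variable names are hypothetical.

```matlab
% Three random +/-1 memories standing in for the digit patterns.
N = 64;
P = sign(randn(3, N));
P(P == 0) = 1;
W = (P' * P) / N;
W(1:N+1:end) = 0;

% Distorted copy of the first memory: flip 6 randomly chosen units.
x = P(1, :)';
flipIdx = randperm(N);
flipIdx = flipIdx(1:6);
x(flipIdx) = -x(flipIdx);

energy   = -0.5 * x' * W * x;       % energy before any update
overlaps = (P * x)' / N;            % one column per memory

for sweep = 1:20
    xOld = x;
    for i = randperm(N)
        h = W(i, :) * x;
        if h ~= 0
            x(i) = sign(h);         % keep the current state on ties
        end
        energy(end + 1, 1)   = -0.5 * x' * W * x;
        overlaps(end + 1, :) = (P * x)' / N;
    end
    if isequal(x, xOld)
        break;                      % no unit changed during the sweep
    end
end

figure; plot(energy); xlabel('update step'); ylabel('energy');
figure; plot(overlaps); xlabel('update step'); ylabel('overlap');
legend('memory 1', 'memory 2', 'memory 3');
```

With symmetric weights and a zero diagonal the energy cannot increase under single-unit updates, which gives a convenient sanity check on the implementation.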

Testing memory capacity

Write down a Matlab script hopfieldC.m that learns the first three digit patterns in the dataset (i.e. dataset{1}, dataset{2}, dataset{3}). Generate distorted versions of the 3 patterns using a proportion of 0.05 distorted pixels.

Incrementally add new memories to the network, starting with dataset{4}, then dataset{5}, etc. Each time you add a new memory:

Also generate one corresponding distorted pattern (again with proportion 0.05).
Perform a test on all the distorted patterns generated so far by feeding them to the network and computing the discrepancy between the reconstructed image and the optimal memory.

Plot the average discrepancy (i.e. averaged with respect to the number of patterns) as a function of the number of stored memories: is there a drop in reconstruction performance at some point? Is it gradual or abrupt?
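The incremental procedure can be prototyped on random patterns before moving to the digits, as in the sketch below. It starts from a single memory rather than three, uses the fraction of wrong units as the discrepancy, and all sizes and names are hypothetical. Random patterns are far less correlated than digit images, so the curve obtained here is not a prediction of what the digit dataset will show.

```matlab
N = 100;             % number of units
maxPatterns = 20;    % hypothetical number of memories to add
prop = 0.05;         % proportion of flipped units in the test patterns

P = zeros(0, N);     % stored memories, one per row
D = zeros(0, N);     % corresponding distorted patterns
W = zeros(N);
avgDiscrepancy = zeros(maxPatterns, 1);

for m = 1:maxPatterns
    % New random memory and its distorted version.
    p = sign(randn(1, N));
    p(p == 0) = 1;
    d = p;
    idx = randperm(N);
    idx = idx(1:round(prop * N));
    d(idx) = -d(idx);
    P(end + 1, :) = p;
    D(end + 1, :) = d;

    % Incremental Hebbian update, no self-connections.
    W = W + (p' * p) / N;
    W(1:N+1:end) = 0;

    % Test every distorted pattern generated so far.
    err = zeros(m, 1);
    for mu = 1:m
        x = D(mu, :)';
        for sweep = 1:50
            xOld = x;
            for i = randperm(N)
                h = W(i, :) * x;
                if h ~= 0
                    x(i) = sign(h);
                end
            end
            if isequal(x, xOld)
                break;
            end
        end
        err(mu) = mean(x ~= P(mu, :)');   % fraction of wrong units
    end
    avgDiscrepancy(m) = mean(err);
end

figure;
plot(1:maxPatterns, avgDiscrepancy, '-o');
xlabel('number of stored memories'); ylabel('average discrepancy');
```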

Bonus Track

Extend the scripts hopfieldB.m and hopfieldC.m by introducing a bias for the units (hint: a bias is equivalent to an input that is constantly firing).
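One possible reading of the hint is sketched here on small random patterns: each memory is extended with an extra unit clamped to +1, the Hebbian rule is applied to the extended patterns, and the weights from the clamped unit to the others act as biases. The variable names and the pattern size are hypothetical, and other ways of setting the bias are equally legitimate.

```matlab
% Small random +/-1 memories standing in for the digit patterns.
P = sign(randn(3, 16));
P(P == 0) = 1;
[M, N] = size(P);

% Extend every pattern with a unit that is always +1.
Paug = [P, ones(M, 1)];
Waug = (Paug' * Paug) / N;

W = Waug(1:N, 1:N);
W(1:N+1:end) = 0;                   % no self-connections
b = Waug(1:N, N + 1);               % bias: weight from the clamped unit

% One asynchronous sweep including the bias term.
x = P(1, :)';
for i = randperm(N)
    h = W(i, :) * x + b(i);
    if h ~= 0
        x(i) = sign(h);             % keep the current state on ties
    end
end

% Energy with bias: the extra term accounts for the clamped unit.
E = -0.5 * x' * W * x - b' * x;
```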