refer to training on mnist for my first attempt with my "old" feedforward neural network implementation of simple perceptrons; now i have a (rather slow, currently) convolutional neural network implementation, so it's time to see the results:
load the dataset (replace train-data-file and test-data-file if you're not me):
(ql:quickload "cl-csv")
(defun parse-mylist (mylist)
  (let* ((digit (parse-integer (car mylist))) ;; digit is at the beginning of the list
         (mylist (cdr mylist)) ;; the rest of the list contains the pixels
         (size (floor (sqrt (length mylist))))
         (arr (make-array (list 1 size size))))
    (loop for i from 0 below size
          do (loop for j from 0 below size
                   do (setf (aref arr 0 i j)
                            (/ (parse-integer (elt mylist (+ (* size i) j))) 255))))
                   ;; the old, unnormalized version that wouldn't train:
                   ;; (setf (aref arr 0 i j) (parse-integer (elt mylist (+ (* size i) j))))
    (cons arr digit)))

(defun load-mnist ()
  (defparameter *mnist-train-data* (cl-csv:read-csv (pathname train-data-file)))
  (defparameter *mnist-test-data* (cl-csv:read-csv (pathname test-data-file)))
  ;; use vectors, access is O(1) unlike lists
  (defparameter *mnist-train-data* (map 'vector #'parse-mylist (cdr *mnist-train-data*)))
  (defparameter *mnist-test-data* (map 'vector #'parse-mylist (cdr *mnist-test-data*))))
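a quick sanity check after calling load-mnist, just to confirm the shapes came out right (the label you see depends on the first row of your csv):

(load-mnist)
(let ((sample (elt *mnist-train-data* 0)))
  (format t "label: ~A, image dims: ~A~%"
          (cdr sample)                    ;; the digit
          (array-dimensions (car sample)))) ;; should print (1 28 28)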
note that i had to normalize the pixel values from 0-255 down to 0-1, otherwise i couldn't train the network: all the deltas were turning into 0. the sketch below shows why.
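this is the usual saturation story: with raw pixel values the weighted sums get large, the sigmoid flattens out, and its derivative (and with it every delta) goes to 0. a minimal standalone illustration, using throwaway sigmoid and sigmoid-derivative definitions rather than the ones from my library:

;; throwaway definitions, just for this demo
(defun sigmoid (x)
  (/ 1 (+ 1 (exp (- x)))))

(defun sigmoid-derivative (x)
  (let ((s (sigmoid x)))
    (* s (- 1 s))))

(sigmoid-derivative 0.5)  ;; => ~0.235, a usable gradient
(sigmoid-derivative 50.0) ;; => ~0.0, the gradient vanishes

each image is of size 28x28, so we can use the following architecture: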
;; input meant to be of size 1x28x28
(defun construct-mnist-network ()
  (defparameter *mnist-network*
    (make-network
     :layers
     (list
      (make-3d-convolutional-layer-from-dims
       :dims '(32 1 5 5)) ;; size of image becomes 32x24x24
      (make-pooling-layer
       :rows 2
       :cols 2
       :pooling-function #'average-pooling-function
       :unpooling-function #'average-unpooling-function) ;; size becomes 32x12x12
      (make-3d-convolutional-layer-from-dims
       :dims '(16 32 5 5)) ;; size becomes 16x8x8
      (make-pooling-layer
       :rows 2
       :cols 2
       :pooling-function #'average-pooling-function
       :unpooling-function #'average-unpooling-function) ;; size becomes 6x4x4
      (make-flatten-layer) ;; flatten it, becomes 6x4x4=96
      (make-dense-layer
       :num-units 30
       :prev-layer-num-units 96
       :activation-function #'relu
       :activation-function-derivative #'relu-derivative)
      (make-dense-layer
       :num-units 10
       :prev-layer-num-units 30
       :activation-function #'sigmoid
       :activation-function-derivative #'sigmoid-derivative))
     :learning-rate 0.02)))
example usage:
(construct-mnist-network)
;; might wanna make weights closer to 0
(divide-network-weights *mnist-network* 5)
*mnist-network*
#<NETWORK
  #<3D-CONVOLUTIONAL-LAYER weights: 800, dimensions: (32 1 5 5)>
  #<POOLING-LAYER rows: 2, columns: 2>
  #<3D-CONVOLUTIONAL-LAYER weights: 12800, dimensions: (16 32 5 5)>
  #<POOLING-LAYER rows: 2, columns: 2>
  #<FLATTEN-LAYER {10185E1AC3}>
  #<DENSE-LAYER weights: 2880, dimensions: (30 96)>
  #<DENSE-LAYER weights: 300, dimensions: (10 30)>
  total network weights: 16780, learning rate: 0.02 {10186DB593}>
after running load-mnist, we can begin training.
because at the time, my training algorithm for cnns was slow, i wanted to measure the accuracy of the algorithm without waiting days for training to finish, so i tried training on a single image; the network should overfit and be able to classify that image correctly:
(defun train-on-mnist-single-image ()
  (let ((x (car (elt *mnist-train-data* 0)))
        (y (make-array '(10)))
        ;; the simplified network i used while debugging:
        ;; (nw (make-network
        ;;      :layers
        ;;      (list
        ;;       (make-3d-convolutional-layer-from-dims
        ;;        :dims '(16 1 3 3)
        ;;        :activation-function #'relu
        ;;        :activation-function-derivative #'relu-derivative)
        ;;       (make-flatten-layer)
        ;;       (make-3d-convolutional-layer-from-dims
        ;;        :dims '(10 144)
        ;;        :activation-function #'sigmoid
        ;;        :activation-function-derivative #'sigmoid-derivative))
        ;;      :learning-rate 0.02)))
        (nw *mnist-network*))
    (setf (aref y (cdr (elt *mnist-train-data* 0))) 1) ;; one-hot target
    (format t "~%out layer should be: ~A" y)
    (print "running 10000 epochs:")
    (loop for i from 0 below 10000
          do (network-train nw (list x) (list y))
             (format t "~%cost: ~A" (network-test nw (list x) (list y))))
    (format t "~%out layer: ~A"
            (car (car (network-feedforward *mnist-network*
                                           (car (elt *mnist-train-data* 0))))))))
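running it looks something like this (the exact cost values depend on the random weight initialization):

(construct-mnist-network)
(divide-network-weights *mnist-network* 5)
(train-on-mnist-single-image)
;; the printed cost should shrink toward 0 as the network overfits
;; the single sample, and the final out layer should end up close
;; to the one-hot target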
eventually, after a lot of debugging of my code, the simplified network did converge and overfit, so the next task was to train it on the actual dataset and not just a single image:
(defun train-on-mnist ()
  (network-train-distributed-cpu
   *mnist-network*
   (map 'list
        (lambda (data-entry)
          (let ((in-tensor (car data-entry))
                (digit (cdr data-entry))
                (out-tensor (make-array 10)))
            ;; one-hot encode the digit
            (setf (aref out-tensor digit) 1)
            (cons in-tensor out-tensor)))
        *mnist-train-data*)))
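once training finishes, what we actually care about is accuracy on the test set, not just the cost. a minimal sketch, assuming network-feedforward returns the output activations the same way it's used in train-on-mnist-single-image above (the mnist-accuracy name is mine, it isn't part of the library):

(defun mnist-accuracy (data)
  (let ((correct 0))
    (loop for (in-tensor . digit) across data
          do (let* ((out (car (car (network-feedforward *mnist-network* in-tensor))))
                    ;; argmax of the output layer is the predicted digit
                    (guess (position (reduce #'max out) out)))
               (when (= guess digit)
                 (incf correct))))
    (float (/ correct (length data)))))

;; (mnist-accuracy *mnist-test-data*)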