java - How can I improve this neural net over MNIST?
I wrote a vanilla neural net project and trained it on the famous MNIST data set. It gets about 85% accuracy, and I am wondering why this is lower than the 95.3% success rate achieved by a neural net described on the MNIST website that, on paper, should be similar to mine.
More details on the neural net:
-The input is the 784 greyscale values (0-255) of each pixel in a 28x28 image.
-It has layer sizes 784-256-10, plus a bias node with a fixed value of 1 that is connected to the hidden and output nodes.
-It uses the sigmoid activation function.
-The error of an output node is defined as the difference between its value and the expected value, squared.
-The neural network is trained via gradient descent across each of the 60,000 training cases once per iteration of the training algorithm.
-It usually converges after 15 iterations, with an accurate classification percentage across the test cases of 84%-88%.
-Despite the fact that the literature I've read suggests it is necessary, the neural net performs worse when I have the bias node than when I comment it out (I might be doing it wrong, lmk).
-The test and training cases are stored as objects that hold their values and expected output, in an array of cases.
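To make the activation and error definitions in the list above concrete, here is a minimal sketch in plain Java. The class and method names are hypothetical and not part of the project. It assumes the usual logistic sigmoid and drops the constant factor 2 from the derivative of the squared error, since that factor can be absorbed into the learning rate.

```java
/** Hypothetical helper class; not part of the original project. */
final class SigmoidSketch {

    /** Logistic sigmoid: squashes any real input into the range (0, 1). */
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Derivative of the sigmoid, written in terms of its output y = sigmoid(x). */
    static double sigmoidPrimeFromOutput(double y) {
        return y * (1.0 - y);
    }

    /** Squared error of a single output node. */
    static double squaredError(double value, double expected) {
        double diff = value - expected;
        return diff * diff;
    }

    /**
     * Gradient of the squared error with respect to the node's weighted input,
     * with the constant factor 2 dropped (assumed absorbed into the learning rate).
     */
    static double outputDelta(double value, double expected) {
        return sigmoidPrimeFromOutput(value) * (value - expected);
    }

    public static void main(String[] args) {
        double y = sigmoid(0.0);
        System.out.println(y);                     // 0.5
        System.out.println(squaredError(y, 1.0));  // 0.25
        System.out.println(outputDelta(y, 1.0));   // -0.125
    }
}
```

The `outputDelta` term has the same shape as the `value * (1 - value) * (value - expected)` factor that appears in the weight updates of the training code.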
Here is the main operative code of the program:
The initialization of the structure:
    /**
     * Method to initialize the structure of the neural net.
     * Adds a node for every input location needed, adds the hidden layer,
     * connects the hidden layer to the input layer, makes the output layer,
     * and connects the output layer to the hidden layer.
     */
    public void initializeStructure() {
        for (int ii = 0; ii < numInputs; ii++) {
            inputLayer.add(new Node());
        }

        // Add a bias node to the input layer. This node has a default value of one
        // and acts as a bias for every node it is connected to (the bias has the
        // magnitude of the weight). This means the only thing we need to edit is
        // the weights (including these ones), not the values.
        Node bias = new Node();
        bias.value = 1.0;
        inputLayer.add(bias);

        // Hidden layer: every hidden node is connected to every input node and the bias.
        for (int ii = 0; ii < layerSize; ii++) {
            Node currentNode = new Node();
            for (Node n : inputLayer) {
                currentNode.connections.add(new Connection(numInputs, n, currentNode, false));
            }
            hiddenLayer.add(currentNode);
        }

        // Output layer: every output node is connected to every hidden node and the bias.
        for (int ii = 0; ii < 10; ii++) {
            Node output = new Node();
            for (Node n : hiddenLayer) {
                output.connections.add(new Connection(numInputs, n, output, true));
            }
            output.connections.add(new Connection(numInputs, bias, output, true));
            outputLayer.add(output);
        }
    }

The training algorithm:
    public void train() {
        for (InputCase current : trainCases) {
            // Build the expected output vector: 1 for the correct digit, 0 elsewhere.
            int[] expected = new int[10];
            for (int jj = 0; jj < expected.length; jj++) {
                if (jj == current.expectedOutput)
                    expected[jj] = 1;
                else
                    expected[jj] = 0;
            }

            // Load the pixel values into the input layer and run the net forward.
            for (int jj = 0; jj < current.values.length; jj++) {
                inputLayer.get(jj).value = current.values[jj];
            }
            run();

            // Update the weights feeding the output layer.
            double sum = 0;
            for (int ii = 0; ii < outputLayer.size(); ii++) {
                Node n = outputLayer.get(ii);
                for (Connection c : n.connections) {
                    c.weight -= learningRate * c.origin.value * c.destination.value
                            * (1 - c.destination.value) * (c.destination.value - expected[ii]);
                    sum += c.destination.value * (1 - c.destination.value)
                            * (c.destination.value - expected[ii]) * c.weight;
                }
            }

            // Update the weights feeding the hidden layer.
            for (Node n : hiddenLayer) {
                for (Connection c : n.connections) {
                    c.weight -= learningRate * c.origin.value * c.destination.value
                            * (1 - c.destination.value) * sum;
                }
            }
        }
    }

If you have more questions, or want/need to see more code, just say!
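For reference when reading `train()`, here is a minimal sketch of the textbook back-propagation step for a net with one hidden layer, written on plain arrays. It is not the Node/Connection code from the project. The class name, method names, and the `w[to][from]` weight layout are hypothetical. The sketch assumes sigmoid units and the error E = 1/2 * sum((out - expected)^2), and it leaves out bias nodes for brevity.

```java
/** Hypothetical array-based sketch; not the Node/Connection code from the question. */
final class BackpropSketch {

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Computes sigmoid(w * in) for a layer whose weights are stored as w[to][from]. */
    static double[] forward(double[][] w, double[] in) {
        double[] out = new double[w.length];
        for (int j = 0; j < w.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < in.length; i++) {
                sum += w[j][i] * in[i];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    /**
     * One gradient-descent step on a single training case.
     * Returns the error measured before the weights are updated.
     */
    static double trainStep(double[][] wHidden, double[][] wOut,
                            double[] input, double[] expected, double learningRate) {
        double[] hidden = forward(wHidden, input);
        double[] out = forward(wOut, hidden);

        // Error term of each output node.
        double[] deltaOut = new double[out.length];
        double error = 0.0;
        for (int k = 0; k < out.length; k++) {
            double diff = out[k] - expected[k];
            deltaOut[k] = out[k] * (1.0 - out[k]) * diff;
            error += 0.5 * diff * diff;
        }

        // Each hidden node gets its own error term, summed over the output
        // nodes it feeds, using the output weights from before this update.
        double[] deltaHidden = new double[hidden.length];
        for (int j = 0; j < hidden.length; j++) {
            double sum = 0.0;
            for (int k = 0; k < out.length; k++) {
                sum += deltaOut[k] * wOut[k][j];
            }
            deltaHidden[j] = hidden[j] * (1.0 - hidden[j]) * sum;
        }

        // Apply the weight updates only after all error terms are known.
        for (int k = 0; k < out.length; k++) {
            for (int j = 0; j < hidden.length; j++) {
                wOut[k][j] -= learningRate * deltaOut[k] * hidden[j];
            }
        }
        for (int j = 0; j < hidden.length; j++) {
            for (int i = 0; i < input.length; i++) {
                wHidden[j][i] -= learningRate * deltaHidden[j] * input[i];
            }
        }
        return error;
    }
}
```

Two details differ from `train()` as posted: the sketch keeps a separate error term per hidden node rather than one `sum` shared by all of them, and it reads the output weights before they are updated. Whether either difference explains the accuracy gap is not something this sketch establishes.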