INTRODUCTION
CS112 - Fall 2024
Later this semester, you will create a working neural network in Java, using only your own code. In later classes, you will probably use neural network libraries developed by others to learn about many facets of Machine Learning. But in this class, you will learn that there is no magic in making a neural network—it is something you can build yourself...though the fact that neural networks perform so well does seem like magic.
What is a neuron?
A neuron is a nerve cell or brain cell. Neurons are found in any animal with a brain or some approximation of one. (That is almost every type of multicellular animal: people, insects, fish, even jellyfish...but not sea sponges!)
Nerve cells have multiple inputs and a single output. The inputs come from other nerve cells or from "sensors" such as the eye's retina or the ear, and the outputs go to other nerve cells or to "actuators" such as muscles or organs.
Ok, What is a Neuron in a Computer?
Researchers were intrigued by the ability of large networks of animal neurons – that is, "brains" – to store information, make decisions, and learn. They began experimenting with simple computer functions that mimicked the understood bioelectrical operation of neurons, and they got surprisingly good results on a variety of different tasks.
A popular computer function, the "perceptron", was introduced by Frank Rosenblatt in 1957. The basic operation of a perceptron is:

    output = activation( Σi (wi × inputi) + b )

where:
- wi are a set of weights
- inputi are the inputs to the perceptron
- b is an additive bias
- activation() is an "activation function", a nonlinear function
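This computation (a weighted sum of the inputs plus the bias, passed through the activation function) can be sketched in Java. The class and method names here are illustrative only, and ReLU is used as one common choice of nonlinear activation:

```java
// A minimal sketch of one perceptron step; names are illustrative,
// not part of the lab's requirements.
public class PerceptronSketch {
    // One common choice of nonlinear activation: ReLU.
    static double activation(double x) {
        return x > 0 ? x : 0.0;
    }

    // output = activation( sum over i of (wi * inputi) + b )
    static double perceptron(double[] w, double[] input, double b) {
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * input[i];
        }
        return activation(sum);
    }

    public static void main(String[] args) {
        double[] w = {0.5, -0.25};
        double[] input = {2.0, 4.0};
        System.out.println(perceptron(w, input, 0.1));  // approximately 0.1
    }
}
```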
Acting alone, a perceptron cannot do much. But if we connect many of them together, with the outputs from several feeding the inputs of another, we get a network that can be used to solve a wide variety of decision-making problems.
What Kind of Problems?
In Paul's last project before leaving the video industry, he learned about neural networks and used them to answer the following question: "If a TV operator wants to process N video channels, and each video channel has a spatial resolution of X pixels wide by Y pixels high, and some encoded bit-rate of B megabits per second, how many computer servers are needed to process the videos without overloading?"1 The solution was a neural network that took in the resolutions and bit-rates as inputs and returned a number of servers. This solution saved Paul's employer over $1M per year in cloud computing costs.
And of course, Large Language Models such as ChatGPT use Neural Networks to answer questions.
What is "Training"?
The output of a neuron of course depends on the values of the weights and bias (and the activation function). The output of a neural network—a network of neurons—depends on the weights and biases of all of the neurons in the network.
"Training" is a computational process that takes a large set of inputs, and corresponding known outputs, and adjusts every neuron's weights and bias, so that the output of the neural network gets closer and closer to the known output for every input. When training is complete, a new input can be fed to the neural network, even if the input was not in the set used for training, and it should produce a correct output.
How exactly does training work? It's complicated—there are whole classes on it!
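Still, the core idea can be illustrated with a single neuron and a single training example. The update rule below (a gradient-descent-style nudge on a simple linear neuron) is an assumption for this sketch, not the exact recipe we will use in class:

```java
// Illustration only: repeatedly nudge one neuron's weights and bias
// toward a known target output. This simplified update rule is an
// assumption for this sketch, not the recipe we will use in class.
public class TrainingSketch {
    static double trainAndEvaluate() {
        double[] w = {0.0, 0.0};
        double b = 0.0;
        double[] input = {1.0, 2.0};
        double target = 1.0;
        double rate = 0.1;  // learning rate: how big each nudge is

        for (int step = 0; step < 100; step++) {
            // current output of a simple linear neuron
            double out = b + w[0] * input[0] + w[1] * input[1];
            // the error tells us which direction, and how far, to nudge
            double error = target - out;
            w[0] += rate * error * input[0];
            w[1] += rate * error * input[1];
            b += rate * error;
        }
        return b + w[0] * input[0] + w[1] * input[1];
    }

    public static void main(String[] args) {
        System.out.println(trainAndEvaluate());  // very close to the target, 1.0
    }
}
```

Each pass shrinks the error, so after many passes the neuron's output lands very close to the target.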
Why do neural networks work? This is not really understood, just as we don't understand how brains work at any large scale. There are plenty of alternative computational models for making decisions, but this model seems to work quite well.
This Week's Lab
For this week, you will build and test a Neuron. In a file RELUNeuron.java please write a class RELUNeuron. For the activation function, use the "RELU function": 2
    double activation(double x) {
        x /= 20.0;
        return x > 0 ? x : 0.0;
    }
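Before wiring this into your class, it helps to check what the function does: it scales its input by 1/20, then clips anything non-positive to zero. A quick standalone check (the wrapper class name is just for illustration):

```java
// Quick sanity check of the given RELU function's behavior.
// The class name is illustrative, not required by the lab.
public class ActivationCheck {
    static double activation(double x) {
        x /= 20.0;                 // scale by 1/20
        return x > 0 ? x : 0.0;    // clip non-positive values to zero
    }

    public static void main(String[] args) {
        System.out.println(activation(40.0));  // 2.0
        System.out.println(activation(-5.0));  // 0.0
    }
}
```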
For class RELUNeuron:
- The constructor takes in the number of inputs for the Neuron, and initializes all weights and bias values to a random value between -1.0 and +1.0.
- an output() method takes in a double[] array of inputs and calculates the proper output value
- a write() method to write the neuron's weights and bias to a DataOutputStream (see DataOutputStream's writeDouble() method)
- a read() method to read the neuron's weights and bias from a DataInputStream (see readDouble() method).
You'll need several class variables, of course. Weights, bias, and maybe more.
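One possible shape for the class, following the spec above. This is a sketch, not a finished solution; your field names and details may differ:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Random;

// One possible skeleton for RELUNeuron -- a sketch of the required
// shape, not a finished solution.
public class RELUNeuron {
    private double[] weights;
    private double bias;

    // Initialize every weight and the bias to a random value in [-1.0, +1.0).
    public RELUNeuron(int numInputs) {
        Random rng = new Random();
        weights = new double[numInputs];
        for (int i = 0; i < numInputs; i++) {
            weights[i] = rng.nextDouble() * 2.0 - 1.0;
        }
        bias = rng.nextDouble() * 2.0 - 1.0;
    }

    // The RELU activation given in the handout.
    double activation(double x) {
        x /= 20.0;
        return x > 0 ? x : 0.0;
    }

    // Weighted sum of the inputs plus the bias, passed through the activation.
    public double output(double[] inputs) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return activation(sum);
    }

    // Save all weights first, then the bias, as raw binary doubles.
    public void write(DataOutputStream out) throws IOException {
        for (double w : weights) out.writeDouble(w);
        out.writeDouble(bias);
    }

    // Read back in exactly the same order write() used.
    public void read(DataInputStream in) throws IOException {
        for (int i = 0; i < weights.length; i++) weights[i] = in.readDouble();
        bias = in.readDouble();
    }
}
```

Note that read() must consume values in exactly the order write() produced them, or the weights and bias will be scrambled.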
To train your Neuron, I will give you a set of training data files, each of which contains 501 double values. These values are not saved as text—they are saved as raw binary double values (I used a DataOutputStream). For each file:
- the first 500 values are inputs to your Neuron,
- the last value is the expected output from your Neuron.
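Reading one such file might look like this; "train0.dbl" is a placeholder filename, not the real one:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch: read one training file of 501 raw binary doubles.
public class ReadTrainingFile {
    // Returns 501 doubles: 500 inputs followed by the expected output.
    public static double[] readTrainingFile(String path) throws IOException {
        double[] values = new double[501];
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // "train0.dbl" is a placeholder for one of the provided files.
        double[] v = readTrainingFile("train0.dbl");
        System.out.println("expected output: " + v[500]);
    }
}
```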
In class, we will discuss the details of how to train your Neuron. Can you improve this basic training recipe to reach a smaller error faster?
After you train your Neuron on the provided training data, please save your resulting weights and bias (using your write() method) to a file called weights.dbl .
A critical part of this week's lab is for you to design, execute, and document a set of tests for your Neuron. For this week, please write another Java file TestNeuron.java. This class of course tests your Neuron. You should think about how to do this!
You must test all of your Neuron's methods, to make sure they work properly.
- How do you test the constructor?
- How do you test write() and read()?
- How do you test output()?
- How do you test train()?
You must write up a 1-2 page document describing how you tested, how your TestNeuron class works, and your test results. (You do not need to talk about your Neuron class.)
Conclusion
In this project you learned how to build and test a computer "neuron" using nothing but ordinary arithmetic and Boolean logic. In a few weeks we will build, train, and test a real Neural Network.
Rubric
▪ 0-20 points for code quality and proper operation and training of RELUNeuron.java
▪ 10 points if your weights.dbl gives a reasonable error with test files
▪ 0-20 points for your TestNeuron.java
▪ 0-20 points on your test writeup
Further Reading
"How LLM's work." LINK