As I understand the situation (from one very lame starting exercise), Bias is just one more Input, what I call a "link", into the Perceptron.
The Perceptron takes all the links, multiplies each by its weight to get a Sum, and then that Sum is run through an "Activation Function" (in the example, just a Sign function that returns +1 if the Sum >= 0, else -1) to determine what the Perceptron will output for that set of links, in the example +1 or -1.
The Bias * BiasWeight was just one more link to add to the Sum; you might call it a Constant Link. In the example, the links were the x, y coordinates of a point.
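A minimal sketch of that forward pass as I understand it, with the Bias treated as one more link (the names here are mine, not the exercise's):

```python
def perceptron_output(links, weights, bias_weight, bias=1.0):
    """Weighted Sum of the links plus the Bias link, passed through a Sign activation."""
    total = bias * bias_weight
    for link, weight in zip(links, weights):
        total += link * weight
    # Sign activation: +1 if the Sum is non-negative, otherwise -1
    return 1 if total >= 0 else -1

# Example: a point (x, y) with two trained weights and a BiasWeight
print(perceptron_output(links=[2.0, 3.0], weights=[0.4, -0.2], bias_weight=0.1))  # -> 1
```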
Near as I can figure out, Bias is a fix you need because the links have to be numbers. Since we can't multiply a string (a link label) by a number, we use numbers to represent the links. A coordinate system is very convenient for labeling links, but at the origin (0, 0) the Sum comes out 0 no matter what the weights are, whether that point is important or not, so (0, 0) is not trainable on its own. The Bias is added so that the point (0, 0) can be trained like any other point; (0, y) and (x, 0) might also end up with distorted weights without the balancing effect of Bias * BiasWeight.
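A quick way to see the (0, 0) problem (my own check, not from the exercise): with no Bias link, the origin always produces the same output no matter how the weights change, so training can never move it.

```python
def output_no_bias(x, y, wx, wy):
    # Without a Bias link the Sum is just wx*x + wy*y,
    # which is always 0 at the origin regardless of the weights.
    s = wx * x + wy * y
    return 1 if s >= 0 else -1

# At (0, 0) the answer is +1 for every possible pair of weights:
print(output_no_bias(0, 0, wx=5.0, wy=-3.0))    # 1
print(output_no_bias(0, 0, wx=-99.0, wy=42.0))  # 1
```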
I am calling the inputs "links" based on the drawing of the Perceptron: two lines INTO it (the x, y coordinates in the lame example) and one line OUT, which is the Perceptron's evaluation of the Sum of the links. Each link has an importance for deciding the pattern; that importance is the weight the training tries to establish, so you have a value associated with each link's label.
I am calling the example lame because yesterday afternoon I started working on the problem of digit recognition. That requires training on the digits 0 to 9 and saving the weight data, then testing it against a pattern to identify; the best match wins.
I had some difficulty and confusion getting the digit images to fit in a 60 x 60 cell array. I thought I had messed something up because they were smaller than I expected, even though I was scaling 1 pixel to a 3 x 3 square of cells. I then drew frames around the digits (more +1/-1 cells to train on) to confirm I had it right. But with frames around the digits, the training couldn't get past 92% for digits 0 and 2, and probably 8 and 9 too... double and triple "enclosures" are hell for array "rays". You really need to assign a weight to every single (x, y) point, NOT one weight per line (x = 1 to cellW, y = 1 to cellH, which is 60 + 60 + 1 for Bias = 121 weights), which is how the lame example would do it.
You need 60 * 60 + 1 (for Bias) = 3601 weights, one per point (not per x, y line); that will get the training done, and in one pass!
and then you don't even have to train! ;P
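Here is a hedged sketch of what I mean by one weight per point, assuming 60 x 60 cells holding +1/-1 values and one Perceptron per digit; the names and the training rule are my own guess at it, not the exercise's code:

```python
import random

CELL_W, CELL_H = 60, 60
N_INPUTS = CELL_W * CELL_H  # one weight per (x, y) point, plus one BiasWeight

def predict(cells, weights, bias_weight):
    """cells is a flat list of +1/-1 values, one per (x, y) point."""
    total = bias_weight
    for value, weight in zip(cells, weights):
        total += value * weight
    return 1 if total >= 0 else -1

def train_one(cells, target, weights, bias_weight, rate=0.1):
    """Classic perceptron update: nudge the weights only when the guess is wrong."""
    guess = predict(cells, weights, bias_weight)
    error = target - guess  # 0, +2, or -2
    if error != 0:
        for i, value in enumerate(cells):
            weights[i] += rate * error * value
        bias_weight += rate * error
    return weights, bias_weight

# One Perceptron per digit: its weights start small and random
weights = [random.uniform(-0.01, 0.01) for _ in range(N_INPUTS)]
bias_weight = random.uniform(-0.01, 0.01)
```

With ten of these (one per digit), the "best match wins" step would just be picking whichever Perceptron reports the largest raw Sum for the test pattern.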
So as far as digit recognition goes, I would go about the task in a completely different manner, starting with the problem of getting the image you want to ID centered and aligned with your model of the thing you are trying to ID.
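Just as a sketch of one way to do that centering step (not a claim that it is the right way), you could shift the digit's "on" points so their bounding box sits in the middle of the 60 x 60 grid before training or testing:

```python
def center_in_grid(points, grid_w=60, grid_h=60):
    """Shift a set of 'on' (x, y) points so their bounding box is centered in the grid."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    dx = (grid_w - (max(xs) - min(xs) + 1)) // 2 - min(xs)
    dy = (grid_h - (max(ys) - min(ys) + 1)) // 2 - min(ys)
    return [(x + dx, y + dy) for x, y in points]
```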
Yeah, the pregnancy test was a humorous attempt to overcome this dry and boring example of Y >= X. ;-))