Hi

I'm working on implementing a neural network, but I'm having trouble calculating the error gradient. The problem is that I don't know much calculus, so I can't understand exactly what to do.

I found this Web page that explains it quite well, but I still just can't get it.
http://www.willamette.edu/~gorr/classes/cs449/linear2.html

Basically the part I'm trying to implement is the last function in that table.
delta w_i = u * (t_o - y_o) * y_i
I know that u is the learning rate, that t_o is the target, and that y_o is the actual output. I don't understand why it's multiplied by y_i (which I presume is the input). Is it the input received by the node in question, or is it something else?

Any help is greatly appreciated

ps. I posted this in C++ because my implementation is in C++

Last Post by bitRAKE

Yes, in $$\Delta w_i = \mu(t_o - y_o)y_i$$, the term $$y_i$$ is an individual input and $$w_i$$ is the weight that is applied to that particular input.

The notation in the tutorial that you linked to is confusing because they use $$y_o$$ to represent the output, and also use $$y$$ to represent the input vector (with $$y_i$$ representing each individual input).

You seem to understand the first part of the derivative (equation 4).

To see why the partial derivative in equation (5) equals $$y_i$$, remember that $$w$$ is a vector of weights, $$y$$ is a vector of inputs, and the output $$y_o$$ is the dot product $$w \cdot y$$. So

$$y_o = w_1y_1 + w_2y_2 + ... + w_iy_i + ... + w_ny_n$$

But when you take a partial derivative of $$y_o$$ with respect to $$w_i$$, all of the other $$w_j$$ terms are constants, and all of the $$y_j$$ terms are constants. The derivative of $$w_1y_1$$ with respect to $$w_i$$ equals 0, and similarly for all the other terms except for $$w_iy_i$$, so all that's left is
$$\frac{\partial y_o}{\partial w_i} = \frac{\partial}{\partial w_i}(w_iy_i) = y_i$$

Hope that helps.