Backpropagation: Understanding How to Update ANNs Weights Step-by-Step
Ahmed Fawzy Gad, [email protected], Menoufia University, Faculty of Computers and Information, Information Technology

Page 1

Backpropagation: Understanding How to Update ANNs Weights Step-by-Step

Ahmed Fawzy Gad
[email protected]

MENOUFIA UNIVERSITY
FACULTY OF COMPUTERS AND INFORMATION
INFORMATION TECHNOLOGY

Page 2

Train then Update

• The backpropagation algorithm is used to update the NN weights when they are not able to make the correct predictions. Hence, we should train the NN before applying backpropagation.

[Diagram: Initial Weights → Training → Prediction]

Page 3

Train then Update

• The backpropagation algorithm is used to update the NN weights when they are not able to make the correct predictions. Hence, we should train the NN before applying backpropagation.

[Diagram: Initial Weights → Training → Prediction → Backpropagation → Update]

Page 4

Neural Network Training Example

Training Data:
X1 = 0.1, X2 = 0.3, Output = 0.03

Initial Weights:
W1 = 0.5, W2 = 0.2, b = 1.83

[Network diagram: inputs X1 = 0.1 and X2 = 0.3 feed one neuron (In/Out) through W1 = 0.5 and W2 = 0.2, with a bias input +1 weighted by b = 1.83.]

Page 5

Network Training

• Steps to train our network:
1. Prepare the activation function input (the sum of products between inputs and weights).
2. Calculate the activation function output.

Page 6

Network Training: Sum of Products

• After calculating the sum of products (sop) between inputs and weights, next is to use this sop as the input to the activation function.

s = X1*W1 + X2*W2 + b
s = 0.1*0.5 + 0.3*0.2 + 1.83
s = 1.94

Page 7

Network Training: Activation Function

• In this example, the sigmoid activation function is used.
• Based on the sop calculated previously, the output is as follows:

f(s) = 1/(1 + e^(-s))
f(s) = 1/(1 + e^(-1.94)) = 1/(1 + 0.144) = 1/1.144
f(s) = 0.874

Page 8

Network Training: Prediction Error

• After getting the predicted output, next is to measure the prediction error of the network.
• We can use the squared error function defined as follows:
• Based on the predicted output, the prediction error is:

E = 1/2 (desired - predicted)^2
E = 1/2 (0.03 - 0.874)^2 = 1/2 (-0.844)^2 = 1/2 (0.713) = 0.357
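The forward pass and error on Pages 6-8 can be reproduced in a few lines. This is a minimal sketch, not code from the slides; the variable names are my own:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# Inputs, weights, and target from the example
X1, X2 = 0.1, 0.3
W1, W2, b = 0.5, 0.2, 1.83
desired = 0.03

s = X1 * W1 + X2 * W2 + b              # sum of products (sop)
predicted = sigmoid(s)                  # activation function output
E = 0.5 * (desired - predicted) ** 2    # squared error

print(round(s, 2), round(predicted, 3), round(E, 3))  # → 1.94 0.874 0.356
```

The slides round the error to 0.357; the unrounded computation gives 0.3565.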

Page 9

How to Minimize Prediction Error?

• There is a prediction error, and it should be minimized until reaching an acceptable error.

What should we do in order to minimize the error?
• There must be something to change in order to minimize the error. In our example, the only parameters to change are the weights.

How to update the weights?
• We can use the weights update equation:

Wnew = Wold + η(d - Y)X

Page 10

Weights Update Equation

• We can use the weights update equation: Wnew = Wold + η(d - Y)X

Wnew: new updated weights.
Wold: current weights. [1.83, 0.5, 0.2]
η: network learning rate. 0.01
d: desired output. 0.03
Y: predicted output. 0.874
X: current input at which the network made a false prediction. [+1, 0.1, 0.3]

Page 11

Weights Update Equation

Wnew = Wold + η(d - Y)X
     = [1.83, 0.5, 0.2] + 0.01(0.03 - 0.874)[+1, 0.1, 0.3]
     = [1.83, 0.5, 0.2] + (-0.0084)[+1, 0.1, 0.3]
     = [1.83, 0.5, 0.2] + [-0.0084, -0.00084, -0.0025]
     = [1.822, 0.499, 0.198]
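The vector update above can be sketched as follows. A minimal sketch, not the slides' code; the weight vector is ordered [b, W1, W2] to match the slides:

```python
# Perceptron-style update rule: Wnew = Wold + eta * (d - Y) * X
W_old = [1.83, 0.5, 0.2]     # [b, W1, W2]
X = [1.0, 0.1, 0.3]          # +1 is the bias input
eta, d, Y = 0.01, 0.03, 0.874

W_new = [w + eta * (d - Y) * x for w, x in zip(W_old, X)]
print([round(w, 4) for w in W_new])  # → [1.8216, 0.4992, 0.1975]
```

The slides round this result to [1.822, 0.499, 0.198].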

Page 12

Weights Update Equation

• The new weights are: W1new = 0.499, W2new = 0.198, bnew = 1.822
• Based on the new weights, the network will be re-trained.

[Diagram: the network, still shown with the old weights W1 = 0.5, W2 = 0.2, b = 1.83]

Page 13

Weights Update Equation

• The new weights are: W1new = 0.499, W2new = 0.198, bnew = 1.822
• Based on the new weights, the network will be re-trained.
• Continue these operations until the prediction error reaches an acceptable value:
1. Updating weights.
2. Retraining network.
3. Calculating prediction error.

[Diagram: the network with W1 = 0.499, W2 = 0.198, b = 1.822]

Page 14

Why is the Backpropagation Algorithm Important?

• The backpropagation algorithm is used to answer these questions and understand the effect of each weight on the prediction error.

[Diagram: Old Weights → New Weights!]

Page 15

Forward Vs. Backward Passes

• When training a neural network, there are two passes: forward and backward.
• The goal of the backward pass is to know how each weight affects the total error. In other words, how does changing the weights change the prediction error?

Page 16

Backward Pass

• Let us work with a simpler example: Y = X^2 Z + H
• How to answer this question: What is the effect on the output Y given a change in variable X?
• This question is answered using derivatives. The derivative of Y with respect to X (∂Y/∂X) will tell us the effect of changing the variable X on the output Y.

Page 17

Calculating Derivatives

• The derivative ∂Y/∂X can be calculated as follows:

Y = X^2 Z + H
∂Y/∂X = ∂/∂X (X^2 Z + H)

• Based on these two derivative rules:

Square:   ∂/∂X (X^2) = 2X
Constant: ∂/∂X (C) = 0

• The result will be:

∂Y/∂X = 2XZ + 0 = 2XZ
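The derivative rule above can be checked numerically. A minimal sketch; the sample values for X, Z, and H are my own, not from the slides:

```python
# Y = X^2 * Z + H, with analytic derivative dY/dX = 2XZ
def Y(X, Z, H):
    return X ** 2 * Z + H

X, Z, H = 1.5, 2.0, 0.7
analytic = 2 * X * Z
eps = 1e-6
numeric = (Y(X + eps, Z, H) - Y(X - eps, Z, H)) / (2 * eps)  # central difference

print(analytic, round(numeric, 6))  # → 6.0 6.0
```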

Page 18

Prediction Error – Weight Derivative

E ← W?

E = 1/2 (desired - predicted)^2

Just as ∂Y/∂X gives the change in Y wrt X, ∂E/∂W gives the change in E wrt W.

Page 19

Prediction Error – Weight Derivative

E = 1/2 (desired - predicted)^2


Page 21

Prediction Error – Weight Derivative

E = 1/2 (desired - predicted)^2
desired = 0.03 (constant)

Page 22

Prediction Error – Weight Derivative

E = 1/2 (desired - predicted)^2
desired = 0.03 (constant)
Predicted = f(s) = 1/(1 + e^(-s))

Page 23

Prediction Error – Weight Derivative

E = 1/2 (desired - predicted)^2
desired = 0.03 (constant)
Predicted = f(s) = 1/(1 + e^(-s))

E = 1/2 (desired - 1/(1 + e^(-s)))^2


Page 25

Prediction Error – Weight Derivative

E = 1/2 (desired - 1/(1 + e^(-s)))^2
s = X1*W1 + X2*W2 + b

Page 26

Prediction Error – Weight Derivative

Substituting s = X1*W1 + X2*W2 + b:

E = 1/2 (desired - 1/(1 + e^(-(X1*W1 + X2*W2 + b))))^2

Page 27

Multivariate Chain Rule

Prediction Error ← Predicted Output ← sop ← Weights
E = 1/2 (desired - predicted)^2 ; Predicted = f(s) = 1/(1 + e^(-s)) ; s = X1*W1 + X2*W2 + b ; W1, W2

∂E/∂W = ∂/∂W [1/2 (desired - 1/(1 + e^(-(X1*W1 + X2*W2 + b))))^2]

This derivative is calculated using the chain rule.

Page 28

Multivariate Chain Rule

Prediction Error ← Predicted Output ← sop ← Weights

The chain of partial derivatives: ∂E/∂Predicted, ∂Predicted/∂s, ∂s/∂W1, ∂s/∂W2

∂E/∂W1 = ∂E/∂Predicted * ∂Predicted/∂s * ∂s/∂W1
∂E/∂W2 = ∂E/∂Predicted * ∂Predicted/∂s * ∂s/∂W2

Let's calculate these individual partial derivatives.

Page 29

Error-Predicted (∂E/∂Predicted) Partial Derivative

E = 1/2 (desired - predicted)^2

∂E/∂Predicted = ∂/∂Predicted [1/2 (desired - predicted)^2]
              = 2 * 1/2 (desired - predicted)^(2-1) * (0 - 1)
              = (desired - predicted) * (-1)
              = predicted - desired

Substitution:
∂E/∂Predicted = predicted - desired = 0.874 - 0.03
∂E/∂Predicted = 0.844

Page 30

Predicted-sop (∂Predicted/∂s) Partial Derivative

Predicted = 1/(1 + e^(-s))

∂Predicted/∂s = ∂/∂s [1/(1 + e^(-s))]
∂Predicted/∂s = 1/(1 + e^(-s)) * (1 - 1/(1 + e^(-s)))

Substitution:
∂Predicted/∂s = 1/(1 + e^(-1.94)) * (1 - 1/(1 + e^(-1.94)))
              = 1/(1 + 0.144) * (1 - 1/(1 + 0.144))
              = 1/1.144 * (1 - 1/1.144)
              = 0.874 * (1 - 0.874) = 0.874 * 0.126
∂Predicted/∂s = 0.11
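The identity used above, that the sigmoid's derivative is f(s)(1 - f(s)), can be verified numerically at s = 1.94. A minimal sketch, not from the slides:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

s = 1.94
analytic = sigmoid(s) * (1.0 - sigmoid(s))        # f(s) * (1 - f(s))
eps = 1e-6
numeric = (sigmoid(s + eps) - sigmoid(s - eps)) / (2 * eps)  # central difference

print(round(analytic, 4), round(numeric, 4))  # → 0.1099 0.1099
```

The slides round this value to 0.11.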

Page 31

Sop-W1 (∂s/∂W1) Partial Derivative

s = X1*W1 + X2*W2 + b

∂s/∂W1 = ∂/∂W1 (X1*W1 + X2*W2 + b)
       = 1 * X1 * W1^(1-1) + 0 + 0
       = X1 * W1^0
       = X1 * (1)
∂s/∂W1 = X1

Substitution:
∂s/∂W1 = X1 = 0.1

Page 32

Sop-W2 (∂s/∂W2) Partial Derivative

s = X1*W1 + X2*W2 + b

∂s/∂W2 = ∂/∂W2 (X1*W1 + X2*W2 + b)
       = 0 + 1 * X2 * W2^(1-1) + 0
       = X2 * W2^0
       = X2 * (1)
∂s/∂W2 = X2

Substitution:
∂s/∂W2 = X2 = 0.3

Page 33

Error-W1 (∂E/∂W1) Partial Derivative

• After calculating each individual derivative, we can multiply all of them to get the desired relationship between the prediction error and each weight.

∂E/∂W1 = ∂E/∂Predicted * ∂Predicted/∂s * ∂s/∂W1

Calculated Derivatives:
∂E/∂Predicted = 0.844
∂Predicted/∂s = 0.11
∂s/∂W1 = 0.1

∂E/∂W1 = 0.844 * 0.11 * 0.1
∂E/∂W1 = 0.009

Page 34

Error-W2 (∂E/∂W2) Partial Derivative

∂E/∂W2 = ∂E/∂Predicted * ∂Predicted/∂s * ∂s/∂W2

Calculated Derivatives:
∂E/∂Predicted = 0.844
∂Predicted/∂s = 0.11
∂s/∂W2 = 0.3

∂E/∂W2 = 0.844 * 0.11 * 0.3
∂E/∂W2 = 0.028
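Putting the three factors together, the chain-rule gradients for both weights can be sketched as follows (a minimal sketch; the variable names are my own):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

X1, X2 = 0.1, 0.3
W1, W2, b = 0.5, 0.2, 1.83
desired = 0.03

s = X1 * W1 + X2 * W2 + b
predicted = sigmoid(s)

dE_dpred = predicted - desired          # ≈ 0.844
dpred_ds = predicted * (1 - predicted)  # ≈ 0.11
dE_dW1 = dE_dpred * dpred_ds * X1       # ∂s/∂W1 = X1
dE_dW2 = dE_dpred * dpred_ds * X2       # ∂s/∂W2 = X2

print(round(dE_dW1, 3), round(dE_dW2, 3))  # → 0.009 0.028
```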

Page 35

Interpreting Derivatives

• There are two useful pieces of information in the derivatives calculated previously: the sign and the magnitude (MAG).

Derivative Sign:
Positive: increasing/decreasing the weight increases/decreases the error.
Negative: increasing/decreasing the weight decreases/increases the error.

Derivative Magnitude:
Positive sign: increasing/decreasing the weight by P increases/decreases the error by MAG*P.
Negative sign: increasing/decreasing the weight by P decreases/increases the error by MAG*P.

In our example, because both ∂E/∂W1 = 0.009 and ∂E/∂W2 = 0.028 are positive, we would like to decrease the weights in order to decrease the prediction error.

Page 36

Updating Weights

• Each weight will be updated based on its derivative according to this equation:

Wi_new = Wi_old - η * ∂E/∂Wi

Updating W1:
W1new = W1 - η * ∂E/∂W1 = 0.5 - 0.01 * 0.009
W1new = 0.49991

Updating W2:
W2new = W2 - η * ∂E/∂W2 = 0.2 - 0.01 * 0.028
W2new = 0.19972

Continue updating weights according to derivatives and re-train the network until reaching an acceptable error.
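The update-retrain-recheck cycle can be sketched as a training loop. A minimal sketch, not the author's code; the 0.001 error threshold and the iteration cap are arbitrary choices of mine:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

X = [0.1, 0.3]
W = [0.5, 0.2]
b = 1.83
desired, eta = 0.03, 0.01

for step in range(200000):
    s = X[0] * W[0] + X[1] * W[1] + b
    predicted = sigmoid(s)
    E = 0.5 * (desired - predicted) ** 2
    if E < 0.001:                      # acceptable error reached
        break
    # dE/dWi = (predicted - desired) * predicted * (1 - predicted) * Xi
    g = (predicted - desired) * predicted * (1 - predicted)
    W = [w - eta * g * x for w, x in zip(W, X)]
    b -= eta * g                       # bias input is +1

print(round(predicted, 3), round(E, 5))
```

The prediction moves from 0.874 toward the desired 0.03 as the loop repeats the forward pass, gradient computation, and weight update.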

Page 37

Second Example: Backpropagation for an NN with a Hidden Layer

Page 38

ANN with Hidden Layer

Training Data:
X1 = 0.1, X2 = 0.3, Output = 0.03

Initial Weights:
W1 = 0.5, W2 = 0.1, W3 = 0.62, W4 = 0.2, W5 = -0.2, W6 = 0.3, b1 = 0.4, b2 = -0.1, b3 = 1.83

Page 39

ANN with Hidden Layer

[Diagram: Initial Weights → Training → Prediction]

Page 40

ANN with Hidden Layer

[Diagram: Initial Weights → Training → Prediction → Backpropagation → Update]

Page 41

Forward Pass – Hidden Layer Neurons

h1_in = X1*W1 + X2*W2 + b1
      = 0.1*0.5 + 0.3*0.1 + 0.4
h1_in = 0.48

h1_out = 1/(1 + e^(-h1_in)) = 1/(1 + e^(-0.48))
h1_out = 0.618

Page 42

Forward Pass – Hidden Layer Neurons

h2_in = X1*W3 + X2*W4 + b2
      = 0.1*0.62 + 0.3*0.2 - 0.1
h2_in = 0.022

h2_out = 1/(1 + e^(-h2_in)) = 1/(1 + e^(-0.022))
h2_out = 0.506

Page 43

Forward Pass – Output Layer Neuron

out_in = h1_out*W5 + h2_out*W6 + b3
       = 0.618*(-0.2) + 0.506*0.3 + 1.83
out_in = 1.858

out_out = 1/(1 + e^(-out_in)) = 1/(1 + e^(-1.858))
out_out = 0.865

Page 44

Forward Pass – Prediction Error

desired = 0.03
Predicted = out_out = 0.865

E = 1/2 (desired - out_out)^2 = 1/2 (0.03 - 0.865)^2
E = 0.349

The backward pass will calculate: ∂E/∂W1, ∂E/∂W2, ∂E/∂W3, ∂E/∂W4, ∂E/∂W5, ∂E/∂W6
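This forward pass can be sketched end to end (a minimal sketch, not from the slides; the variable names are mine):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# Inputs, target, and initial weights from the example
X1, X2, desired = 0.1, 0.3, 0.03
W1, W2, W3, W4, W5, W6 = 0.5, 0.1, 0.62, 0.2, -0.2, 0.3
b1, b2, b3 = 0.4, -0.1, 1.83

h1_in = X1 * W1 + X2 * W2 + b1
h1_out = sigmoid(h1_in)
h2_in = X1 * W3 + X2 * W4 + b2
h2_out = sigmoid(h2_in)
out_in = h1_out * W5 + h2_out * W6 + b3
out_out = sigmoid(out_in)
E = 0.5 * (desired - out_out) ** 2

print(h1_out, h2_out, out_out, E)
```

The printed values match the slides' 0.618, 0.506, 0.865, and 0.349 up to rounding.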

Page 45

Partial Derivatives Calculation

Page 46

E-W5 (∂E/∂W5) Partial Derivative

∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5

Page 47

E-W5 (∂E/∂W5) Partial Derivative

∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5

Partial Derivative:
∂E/∂out_out = ∂/∂out_out [1/2 (desired - out_out)^2]
            = 2 * 1/2 (desired - out_out)^(2-1) * (0 - 1)
            = (desired - out_out) * (-1)
∂E/∂out_out = out_out - desired

Substitution:
∂E/∂out_out = out_out - desired = 0.865 - 0.03
∂E/∂out_out = 0.835

Page 48

E-W5 (∂E/∂W5) Partial Derivative

∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5

Partial Derivative:
∂out_out/∂out_in = ∂/∂out_in [1/(1 + e^(-out_in))]
∂out_out/∂out_in = 1/(1 + e^(-out_in)) * (1 - 1/(1 + e^(-out_in)))

Substitution:
∂out_out/∂out_in = 1/(1 + e^(-1.858)) * (1 - 1/(1 + e^(-1.858)))
                 = 1/1.156 * (1 - 1/1.156)
                 = 0.865 * (1 - 0.865) = 0.865 * 0.135
∂out_out/∂out_in = 0.117

Page 49

E-W5 (∂E/∂W5) Partial Derivative

∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5

Partial Derivative:
∂out_in/∂W5 = ∂/∂W5 (h1_out*W5 + h2_out*W6 + b3)
            = 1 * h1_out * W5^(1-1) + 0 + 0
∂out_in/∂W5 = h1_out

Substitution:
∂out_in/∂W5 = h1_out = 0.618

Page 50

E-W5 (∂E/∂W5) Partial Derivative

∂E/∂W5 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W5

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂W5 = 0.618

∂E/∂W5 = 0.835 * 0.117 * 0.618
∂E/∂W5 = 0.060

Page 51

E-W6 (∂E/∂W6) Partial Derivative

∂E/∂W6 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W6

Page 52

E-W6 (∂E/∂W6) Partial Derivative

∂E/∂W6 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W6

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117

Page 53

E-W6 (∂E/∂W6) Partial Derivative

∂E/∂W6 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W6

Partial Derivative:
∂out_in/∂W6 = ∂/∂W6 (h1_out*W5 + h2_out*W6 + b3)
            = 0 + 1 * h2_out * W6^(1-1) + 0
∂out_in/∂W6 = h2_out

Substitution:
∂out_in/∂W6 = h2_out = 0.506

Page 54

E-W6 (∂E/∂W6) Partial Derivative

∂E/∂W6 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂W6

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂W6 = 0.506

∂E/∂W6 = 0.835 * 0.117 * 0.506
∂E/∂W6 = 0.049

Page 55

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

Page 56

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117

Page 57

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

Partial Derivative:
∂out_in/∂h1_out = ∂/∂h1_out (h1_out*W5 + h2_out*W6 + b3)
                = 1 * (h1_out)^(1-1) * W5 + 0 + 0
∂out_in/∂h1_out = W5

Substitution:
∂out_in/∂h1_out = W5 = -0.2

Page 58

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

Partial Derivative:
∂h1_out/∂h1_in = ∂/∂h1_in [1/(1 + e^(-h1_in))]
∂h1_out/∂h1_in = 1/(1 + e^(-h1_in)) * (1 - 1/(1 + e^(-h1_in)))

Substitution:
∂h1_out/∂h1_in = 1/(1 + e^(-0.48)) * (1 - 1/(1 + e^(-0.48)))
               = 0.618 * (1 - 0.618)
∂h1_out/∂h1_in = 0.236

Page 59

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

Partial Derivative:
∂h1_in/∂W1 = ∂/∂W1 (X1*W1 + X2*W2 + b1)
           = 1 * X1 * W1^(1-1) + 0 + 0
∂h1_in/∂W1 = X1

Substitution:
∂h1_in/∂W1 = X1 = 0.1

Page 60

E-W1 (∂E/∂W1) Partial Derivative

∂E/∂W1 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W1

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h1_out = -0.2
∂h1_out/∂h1_in = 0.236
∂h1_in/∂W1 = 0.1

∂E/∂W1 = 0.835 * 0.117 * (-0.2) * 0.236 * 0.1
∂E/∂W1 = -0.0005

Page 61

E-W2 (∂E/∂W2) Partial Derivative

∂E/∂W2 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W2

Page 62

E-W2 (∂E/∂W2) Partial Derivative

∂E/∂W2 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W2

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h1_out = -0.2
∂h1_out/∂h1_in = 0.236

Page 63

E-W2 (∂E/∂W2) Partial Derivative

∂E/∂W2 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W2

Partial Derivative:
∂h1_in/∂W2 = ∂/∂W2 (X1*W1 + X2*W2 + b1)
           = 0 + 1 * X2 * W2^(1-1) + 0
∂h1_in/∂W2 = X2

Substitution:
∂h1_in/∂W2 = X2 = 0.3

Page 64

E-W2 (∂E/∂W2) Partial Derivative

∂E/∂W2 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h1_out * ∂h1_out/∂h1_in * ∂h1_in/∂W2

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h1_out = -0.2
∂h1_out/∂h1_in = 0.236
∂h1_in/∂W2 = 0.3

∂E/∂W2 = 0.835 * 0.117 * (-0.2) * 0.236 * 0.3
∂E/∂W2 = -0.0014

Page 65

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

Page 66

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117

Page 67

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

Partial Derivative:
∂out_in/∂h2_out = ∂/∂h2_out (h1_out*W5 + h2_out*W6 + b3)
                = 0 + (h2_out)^(1-1) * W6 + 0
∂out_in/∂h2_out = W6

Substitution:
∂out_in/∂h2_out = W6 = 0.3

Page 68

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

Partial Derivative:
∂h2_out/∂h2_in = ∂/∂h2_in [1/(1 + e^(-h2_in))]
∂h2_out/∂h2_in = 1/(1 + e^(-h2_in)) * (1 - 1/(1 + e^(-h2_in)))

Substitution:
∂h2_out/∂h2_in = 1/(1 + e^(-0.022)) * (1 - 1/(1 + e^(-0.022)))
               = 0.506 * (1 - 0.506)
∂h2_out/∂h2_in = 0.25

Page 69

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

Partial Derivative:
∂h2_in/∂W3 = ∂/∂W3 (X1*W3 + X2*W4 + b2)
           = 1 * X1 * W3^(1-1) + 0 + 0
∂h2_in/∂W3 = X1

Substitution:
∂h2_in/∂W3 = X1 = 0.1

Page 70

E-W3 (∂E/∂W3) Partial Derivative

∂E/∂W3 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W3

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h2_out = 0.3
∂h2_out/∂h2_in = 0.25
∂h2_in/∂W3 = 0.1

∂E/∂W3 = 0.835 * 0.117 * 0.3 * 0.25 * 0.1
∂E/∂W3 = 0.0007

Page 71

E-W4 (∂E/∂W4) Partial Derivative

∂E/∂W4 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W4

Page 72

E-W4 (∂E/∂W4) Partial Derivative

∂E/∂W4 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W4

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h2_out = 0.3
∂h2_out/∂h2_in = 0.25

Page 73

E-W4 (∂E/∂W4) Partial Derivative

∂E/∂W4 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W4

Partial Derivative:
∂h2_in/∂W4 = ∂/∂W4 (X1*W3 + X2*W4 + b2)
           = 0 + 1 * X2 * W4^(1-1) + 0
∂h2_in/∂W4 = X2

Substitution:
∂h2_in/∂W4 = X2 = 0.3

Page 74

E-W4 (∂E/∂W4) Partial Derivative

∂E/∂W4 = ∂E/∂out_out * ∂out_out/∂out_in * ∂out_in/∂h2_out * ∂h2_out/∂h2_in * ∂h2_in/∂W4

∂E/∂out_out = 0.835
∂out_out/∂out_in = 0.117
∂out_in/∂h2_out = 0.3
∂h2_out/∂h2_in = 0.25
∂h2_in/∂W4 = 0.3

∂E/∂W4 = 0.835 * 0.117 * 0.3 * 0.25 * 0.3
∂E/∂W4 = 0.0022

Page 75

All Error-Weights Partial Derivatives

∂E/∂W1 = -0.0005
∂E/∂W2 = -0.0014
∂E/∂W3 = 0.0007
∂E/∂W4 = 0.0022
∂E/∂W5 = 0.060
∂E/∂W6 = 0.049
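The whole backward pass can be collected into a short script. A minimal sketch, not the author's code; each chain-rule gradient is verified against a central finite difference of the error:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

X1, X2, desired = 0.1, 0.3, 0.03
W = {"W1": 0.5, "W2": 0.1, "W3": 0.62, "W4": 0.2, "W5": -0.2, "W6": 0.3}
b1, b2, b3 = 0.4, -0.1, 1.83

def forward(W):
    h1 = sigmoid(X1 * W["W1"] + X2 * W["W2"] + b1)
    h2 = sigmoid(X1 * W["W3"] + X2 * W["W4"] + b2)
    out = sigmoid(h1 * W["W5"] + h2 * W["W6"] + b3)
    return h1, h2, out

def error(W):
    return 0.5 * (desired - forward(W)[2]) ** 2

h1, h2, out = forward(W)
dE_dout = out - desired            # ∂E/∂out_out
dout_din = out * (1 - out)         # ∂out_out/∂out_in
grads = {
    "W5": dE_dout * dout_din * h1,
    "W6": dE_dout * dout_din * h2,
    "W1": dE_dout * dout_din * W["W5"] * h1 * (1 - h1) * X1,
    "W2": dE_dout * dout_din * W["W5"] * h1 * (1 - h1) * X2,
    "W3": dE_dout * dout_din * W["W6"] * h2 * (1 - h2) * X1,
    "W4": dE_dout * dout_din * W["W6"] * h2 * (1 - h2) * X2,
}

# Check every chain-rule gradient against a central finite difference
eps = 1e-6
for name, g in grads.items():
    Wp = dict(W); Wp[name] += eps
    Wm = dict(W); Wm[name] -= eps
    numeric = (error(Wp) - error(Wm)) / (2 * eps)
    assert abs(numeric - g) < 1e-7, name

print({k: round(v, 4) for k, v in grads.items()})
```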

Page 76

Updated Weights

W1new = W1 - η * ∂E/∂W1 = 0.5 - 0.01 * (-0.0005) = 0.500005
W2new = W2 - η * ∂E/∂W2 = 0.1 - 0.01 * (-0.0014) = 0.100014
W3new = W3 - η * ∂E/∂W3 = 0.62 - 0.01 * 0.0007 = 0.619993
W4new = W4 - η * ∂E/∂W4 = 0.2 - 0.01 * 0.0022 = 0.199978
W5new = W5 - η * ∂E/∂W5 = -0.2 - 0.01 * 0.060 = -0.2006
W6new = W6 - η * ∂E/∂W6 = 0.3 - 0.01 * 0.049 = 0.29951

Continue updating weights according to derivatives and re-train the network until reaching an acceptable error.
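The full train-then-update cycle for the hidden-layer network can be sketched as a loop. A minimal sketch, not the author's code; to keep the run short I use a larger learning rate (0.5) than the slides' 0.01, keep the biases fixed as the slides do, and the 0.001 error threshold is my own choice:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

X1, X2, desired, eta = 0.1, 0.3, 0.03, 0.5
W1, W2, W3, W4, W5, W6 = 0.5, 0.1, 0.62, 0.2, -0.2, 0.3
b1, b2, b3 = 0.4, -0.1, 1.83   # biases kept fixed, as in the slides

for step in range(100000):
    # forward pass
    h1 = sigmoid(X1 * W1 + X2 * W2 + b1)
    h2 = sigmoid(X1 * W3 + X2 * W4 + b2)
    out = sigmoid(h1 * W5 + h2 * W6 + b3)
    E = 0.5 * (desired - out) ** 2
    if E < 0.001:                                 # acceptable error reached
        break
    # backward pass (chain rule, using the weights from before the update)
    delta = (out - desired) * out * (1 - out)     # shared upstream term
    d1 = delta * W5 * h1 * (1 - h1)               # error signal at h1
    d2 = delta * W6 * h2 * (1 - h2)               # error signal at h2
    W1, W2 = W1 - eta * d1 * X1, W2 - eta * d1 * X2
    W3, W4 = W3 - eta * d2 * X1, W4 - eta * d2 * X2
    W5, W6 = W5 - eta * delta * h1, W6 - eta * delta * h2

print(step, round(out, 3), round(E, 5))
```

The prediction moves from the initial 0.865 toward the desired 0.03 as the forward pass, backward pass, and weight update repeat.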