deep-learning pytorch gradient tensor derivative

Tensor operations and gradients


I was going through some tutorials on YouTube where the code sample below was used to explain derivatives.

Create tensors.

    x = torch.tensor(3.)

    w = torch.tensor(4., requires_grad=True)

    b = torch.tensor(5., requires_grad=True)

    x, w, b

Arithmetic operations

    y = w * x + b

    y

Compute derivatives

    y.backward()

Display gradients

    print('dy/dx:', x.grad)

    print('dy/dw:', w.grad)

    print('dy/db:', b.grad)

OUTPUT

dy/dx: None

dy/dw: tensor(3.)

dy/db: tensor(1.)

Could anyone please explain how we are getting tensor(3.) and tensor(1.) as the gradient outputs? I need to understand how PyTorch performs this operation behind the scenes.

Any help would be appreciated.


Solution

  • You have y = w*x + b, then

    dy/dx = w
    dy/dw = x
    dy/db = 1
    

    Since you've not set requires_grad=True for x, PyTorch won't calculate the derivative with respect to it.
    Hence, dy/dx is None.

    The rest are the values of the corresponding tensors evaluated at the current point. Thus, the final output is

    dy/dx: None
    dy/dw: tensor(3.)
    dy/db: tensor(1.)
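To see this concretely, here is the same snippet with requires_grad=True set on x as well, so autograd also fills in dy/dx. The gradients are just the symbolic derivatives above evaluated at the current tensor values: dy/dx = w = 4, dy/dw = x = 3, dy/db = 1.

```python
import torch

# Leaf tensors; requires_grad=True tells autograd to track operations on them
x = torch.tensor(3., requires_grad=True)  # unlike the original, x is now tracked
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

y = w * x + b  # autograd records the graph: mul(w, x) followed by add(., b)

y.backward()   # applies the chain rule backward from y to every tracked leaf

print('dy/dx:', x.grad)  # dy/dx = w -> tensor(4.)
print('dy/dw:', w.grad)  # dy/dw = x -> tensor(3.)
print('dy/db:', b.grad)  # dy/db = 1 -> tensor(1.)
```

Behind the scenes, each operation (mul, add) stores a backward rule; calling y.backward() walks this graph in reverse, multiplying local derivatives via the chain rule and accumulating the result into each leaf's .grad attribute.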