import torch
from torch.autograd import Variable

grads = {}

def save_grad(name):
    def hook(grad):
        grads[name] = grad
    return hook

# On recent PyTorch versions, torch.randn(1, 1, requires_grad=True) works without Variable
x = Variable(torch.randn(1, 1), requires_grad=True)
y = 3 * x
z = y ** 2

# Here, save_grad('y') returns a hook (a function) that captures the name 'y'
y.register_hook(save_grad('y'))
z.register_hook(save_grad('z'))

z.backward()
print(grads['y'])  # dz/dy = 2 * y
print(grads['z'])  # dz/dz = 1
Thanks to Adam Paszke’s post on the PyTorch discussion forum.
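As a side note (this is not from Adam’s post): on newer PyTorch versions you can get a similar effect without hooks by calling retain_grad() on the intermediate tensors, so that backward() populates their .grad fields. A minimal sketch, assuming a current PyTorch release:

import torch

x = torch.randn(1, 1, requires_grad=True)
y = 3 * x
z = y ** 2

# Ask autograd to keep gradients for these non-leaf tensors
y.retain_grad()
z.retain_grad()

z.backward()
print(y.grad)  # dz/dy = 2 * y
print(z.grad)  # dz/dz = 1

Hooks remain the more general tool (for example, a hook can return a modified gradient), so the snippet above is just a convenience when you only want to inspect intermediate gradients.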
I struggled with a problem today: my parameter “b” was not updating in the following code:
b = nn.Parameter(torch.zeros(batch_size, 1))  # initialized with all zeros
a = torch.norm(b, dim=1)
There’s nothing wrong with the gradient of “a”. So what’s the problem?
The problem is that I used the wrong initialization for “b”: I initialized it with all zeros, and the gradient of the norm at an all-zero vector comes out as zero, so “b” never receives a non-zero update.
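To make the failure mode concrete, here is a minimal sketch (batch_size is just an example value, not from the original code) comparing a zero init with a nonzero one. It assumes the behavior described above: on the PyTorch versions I have seen, the 2-norm’s backward returns an all-zero gradient at the origin rather than dividing by zero.

import torch
import torch.nn as nn

batch_size = 4  # example value, not from the original code

# Zero init: the 2-norm's gradient at the origin comes back as all zeros,
# so an optimizer step leaves 'b' unchanged.
b = nn.Parameter(torch.zeros(batch_size, 1))
torch.norm(b, dim=1).sum().backward()
print(b.grad)        # all zeros -> 'b' never moves

# Nonzero init: the gradient b / ||b|| is well defined and nonzero.
b_fixed = nn.Parameter(torch.ones(batch_size, 1))
torch.norm(b_fixed, dim=1).sum().backward()
print(b_fixed.grad)  # nonzero entries -> 'b_fixed' can actually train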