BYR Achieve · Mirror Forum

[Question] Can PyTorch compute a total loss like TensorFlow and then run ba

2020/11/26 · Mirror synced · 3 replies

In TensorFlow, you can do something like this:

```
loss = criterion(output, target)
tf.losses.add_loss(loss)
total_loss = tf.losses.get_total_loss()
optimizer = tf.train.MomentumOptimizer(learning_rate=init_lr, momentum=0.9).minimize(total_loss, global_step=global_step)
```

In PyTorch, however, I could not find an API similar to total loss, so I wrote a custom loss module:

```
import torch.nn as nn

# `criterion` is defined elsewhere in my script.
class Loss_func(nn.Module):
    def __init__(self):
        super(Loss_func, self).__init__()
        self.totalLoss = 0
        return

    def forward(self, output, target):
        temp_loss = criterion(output, target)
        # Accumulates the loss tensor across calls and returns the running sum.
        self.totalLoss = self.totalLoss + temp_loss
        return self.totalLoss
```

Running the computation raised the following error:

> RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

I changed the backward call:

```
loss.backward(retain_graph=True)
```

which produced a new error:

> Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
> File "/vol/research/Jigsaw/jigsaw_torch.py", line 231, in train
> features, outputs = model(batchImg.cuda())
> File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
> result = self.forward(*input, **kwargs)
> File "/vol/research/Jigsaw/jigsaw_torch.py", line 86, in forward
> x = self.fc(x)
> File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
> result = self.forward(*input, **kwargs)
> File "/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
> return F.linear(input, self.weight, self.bias)
> File "/lib/python3.8/site-packages/torch/nn/functional.py", line 1690, in linear
> ret = torch.addmm(bias, input, weight.t())
> (function _print_stack)
> Traceback (most recent call last):
> File "/vol/research/Jigsaw/jigsaw_torch.py", line 439, in <module>
> main()
> File "/vol/research/Jigsaw/jigsaw_torch.py", line 435, in main
> train(trainSet, testSet, tripletSet, model, opt, lossFunc, epoch)
> File "/vol/research/sketch/Jigsaw/jigsaw_torch.py", line 234, in trainAndValidate
> loss.backward(retain_graph=True)
> File "/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
> torch.autograd.backward(self, gradient, retain_graph, create_graph)
> File "/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
> Variable._execution_engine.run_backward(
> RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 81]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The model I am using is GoogLeNet. Does this error mean that GoogLeNet's fc layer does not support running backward more than once? How should I compute a total loss in PyTorch and run backward on it? Any help would be much appreciated.

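A plausible reading of both errors, offered as a sketch and not a definitive answer: `self.totalLoss` is never reset, so every new loss still points at the graphs built for earlier batches. The first `backward()` frees those graphs, which explains the first error. With `retain_graph=True` the old graphs survive, but `optimizer.step()` has since updated the `fc` weights in place, which matches the version mismatch in the second error. That would make this a property of the accumulation, not of GoogLeNet. Note also that `tf.losses.get_total_loss()` sums the loss terms registered for one step, not losses across steps. The PyTorch counterpart is to add this step's terms together, call `backward()` once, and keep only a detached number for logging. In the sketch below, the `nn.Linear` model, the random data, the `aux_loss` term, and `extra_weight` are all hypothetical stand-ins, not taken from the post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the poster's model, loss, and optimizer.
model = nn.Linear(8, 3)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)


def train_step(inputs, target, extra_weight=0.1):
    """Sum this step's loss terms, run backward once, and return a float."""
    optimizer.zero_grad()
    output = model(inputs)
    main_loss = criterion(output, target)
    # Hypothetical second term (a penalty on the outputs), only to show summing.
    aux_loss = output.pow(2).mean()
    # The total is rebuilt from scratch each step, so its graph covers this step only.
    total_loss = main_loss + extra_weight * aux_loss
    total_loss.backward()  # no retain_graph needed
    optimizer.step()
    # .item() yields a plain Python float, so no graph leaks into the next step.
    return total_loss.item()


running_loss = 0.0
num_steps = 5
for _ in range(num_steps):
    inputs = torch.randn(4, 8)           # random batch, for illustration only
    target = torch.randint(0, 3, (4,))   # random class labels
    running_loss += train_step(inputs, target)

print(f"mean loss over {num_steps} steps: {running_loss / num_steps:.4f}")
```

If a running total is wanted purely for reporting, accumulating `loss.item()` (or `loss.detach()`) as above keeps the statistic without holding on to old graphs. If the goal is instead to accumulate gradients over several batches, calling `backward()` on each batch's own loss and stepping the optimizer every few batches is the usual pattern.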