BBYR Achieve
This is a mirrored post. Source: 北邮人论坛 (BYR Forum) / ml-dm / #37243, synced on 2020/11/26
Posted by the ML_DM bot

[Question] Can PyTorch compute a total loss like TensorFlow and then run ba

WinKawaks
2020/11/26 · mirror sync · 3 replies
In TensorFlow you can do this:

```
loss = criterion(output, target)
tf.add_loss(loss)
total_loss = tf.losses.get_total_loss()
optimizer = tf.train.MomentumOptimizer(learning_rate=init_lr, momentum=0.9).minimize(total_loss, global_step=global_step)
```

But I could not find an API like total loss in PyTorch, so I wrote a custom loss:

```
class Loss_func(nn.Module):
    def __init__(self):
        super(Loss_func, self).__init__()
        self.totalLoss = 0
        return

    def forward(self, output, target):
        temp_loss = criterion(output, target)
        self.totalLoss = self.totalLoss + temp_loss
        return self.totalLoss
```

Running it raised the following error:

> RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

I changed the backward call:

```
loss.backward(retain_graph=True)
```

That produced a new error:

> Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
>   File "/vol/research/Jigsaw/jigsaw_torch.py", line 231, in train
>     features, outputs = model(batchImg.cuda())
>   File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
>     result = self.forward(*input, **kwargs)
>   File "/vol/research/Jigsaw/jigsaw_torch.py", line 86, in forward
>     x = self.fc(x)
>   File "/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
>     result = self.forward(*input, **kwargs)
>   File "/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
>     return F.linear(input, self.weight, self.bias)
>   File "/lib/python3.8/site-packages/torch/nn/functional.py", line 1690, in linear
>     ret = torch.addmm(bias, input, weight.t())
>  (function _print_stack)
> Traceback (most recent call last):
>   File "/vol/research/Jigsaw/jigsaw_torch.py", line 439, in <module>
>     main()
>   File "/vol/research/Jigsaw/jigsaw_torch.py", line 435, in main
>     train(trainSet, testSet, tripletSet, model, opt, lossFunc, epoch)
>   File "/vol/research/sketch/Jigsaw/jigsaw_torch.py", line 234, in trainAndValidate
>     loss.backward(retain_graph=True)
>   File "/lib/python3.8/site-packages/torch/tensor.py", line 221, in backward
>     torch.autograd.backward(self, gradient, retain_graph, create_graph)
>   File "/lib/python3.8/site-packages/torch/autograd/__init__.py", line 130, in backward
>     Variable._execution_engine.run_backward(
> RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1024, 81]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The model I am using is GoogleNet. Does this error mean that GoogleNet's fc layer does not support calling backward more than once? How should I compute a total loss in PyTorch and run backward on it? Any help would be much appreciated.
3 replies
jiang1995 (bot) #1 · 2020/11/27
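By total loss, do you mean that you have several loss functions and want to sum them into one overall loss and backpropagate that? If so, just call each loss function and add the results together directly.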
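A minimal sketch of what reply #1 suggests, assuming two loss terms computed on the same model output. The tiny `nn.Linear` model, the random data, and the `0.5` weight are illustrative assumptions, not part of the original post:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model and data, chosen only to keep the sketch small.
model = nn.Linear(4, 3)
ce = nn.CrossEntropyLoss()
mse = nn.MSELoss()

x = torch.randn(8, 4)
labels = torch.randint(0, 3, (8,))
targets = torch.randn(8, 3)

out = model(x)

# Sum the individual loss tensors; autograd tracks the sum like any other op.
# The 0.5 weight is an arbitrary illustrative choice.
total_loss = ce(out, labels) + 0.5 * mse(out, targets)

# A single backward call propagates gradients from both terms.
total_loss.backward()
```

Because both terms come from the same forward pass, a single `backward()` is enough and `retain_graph=True` is not needed.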
WinKawaks (bot) #2 · 2020/11/27
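I am not sure whether the total loss in tf sums the loss of every batch and then runs backward once, or runs backward several times...

[ jiang1995 wrote: ]
: By total loss, do you mean that you have several loss functions and want to sum them into one overall loss and backpropagate that? If so, just call each loss function and add the results together directly.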
c654528593 (bot) #3 · 2020/11/28
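So you mean accumulating over several batches and then doing one backward. In that case, when you write the code, accumulate the loss for each batch and do not call zero_grad between them; then you can compute it.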
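One common way to get the effect reply #3 describes is gradient accumulation. This is a sketch under assumptions, not the replier's exact code: the small `nn.Linear` model, the random batches, and `accum_steps = 4` are all hypothetical. It calls `backward()` once per batch so that gradients add up in each parameter's `.grad`, and only steps the optimizer (and clears gradients) every `accum_steps` batches. Keeping a running sum of loss tensors, as in the `Loss_func` from the question, would instead tie every later `backward()` to the graphs of earlier batches.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model, data, and accumulation window.
model = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
accum_steps = 4
batches = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(8)]

running_loss = 0.0  # plain Python float, used for logging only
num_updates = 0

optimizer.zero_grad()
for step, (x, y) in enumerate(batches, start=1):
    # Scale so the accumulated gradient is the average over the window.
    loss = criterion(model(x), y) / accum_steps

    # Each backward adds this batch's gradients to .grad and frees its graph.
    loss.backward()

    # .item() detaches the value, so no graph is kept across iterations.
    running_loss += loss.item()

    if step % accum_steps == 0:
        optimizer.step()       # one update per accum_steps batches
        optimizer.zero_grad()  # clear only after the update
        num_updates += 1
```

With 8 batches and a window of 4, the optimizer updates the weights twice. Since every graph is used for exactly one `backward()`, `retain_graph=True` is unnecessary here.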