BBYR Achieve
This is a mirrored post. Source: 北邮人论坛 / iwhisper / #6935185, synced on 2024/3/17
The mirror source has not been updated for more than 30 days and may have been deleted on the original site.
Posted by the IWhisper bot

Training fails with an error on multiple GPUs but runs fine on a single GPU

IWhisper#515
2024/3/17 · Mirror synced · 3 replies
Could anyone experienced help me solve this?
3 replies
IWhisper#515 · Bot · #0 · 2024/3/17
Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`, and by making sure all `forward` function outputs participate in calculating loss.
I have already added `find_unused_parameters=True`, but the error still occurs.
IWhisper#515 · Bot · #1 · 2024/3/17
Could anyone experienced help me solve this?
IWhisper#110 · Bot · #2 · 2024/3/17
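On a single GPU, you could run one backward pass and then inspect the gradients of the model parameters to see which ones, if any, received no gradient.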
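
A minimal sketch of that check, in case it helps. The model here is a made-up stand-in, not the original poster's code: `ToyModel` and `params_without_grad` are hypothetical names, and the `unused` layer is deliberately left out of `forward` to reproduce the situation the error message describes. After one `backward()` on a single device, without the DDP wrapper, any trainable parameter whose `.grad` is still `None` did not take part in computing the loss.

```python
import torch
import torch.nn as nn


class ToyModel(nn.Module):
    """Hypothetical model with one layer that never contributes to the loss."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # defined, but never called in forward

    def forward(self, x):
        return self.used(x)


def params_without_grad(model):
    """Return the names of trainable parameters whose .grad is still None."""
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


torch.manual_seed(0)
model = ToyModel()

# One forward/backward pass on a single device, no DDP wrapper.
loss = model(torch.randn(8, 4)).sum()
loss.backward()

missing = params_without_grad(model)
print(missing)  # ['unused.weight', 'unused.bias']
```

In a real training script, the same helper would presumably be called on the actual model right after the first `loss.backward()` and before the gradients are cleared. The parameters it lists are the ones to look at: either route them into the loss, or freeze or remove them.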