Found dtype Double but expected Float
When computing the loss, the two values passed to the loss function must have the same dtype.
y_pred = model(train_x, batch_size)
# Forward pass during training: generate the network output and compute the loss against the targets
# The two tensors passed to the loss function must have the same dtype
y_pred = y_pred.cpu().float()
train_y = train_y.float()
single_loss = loss_function(y_pred, train_y)
single_loss.backward()   # backward() computes the gradients automatically
optimizer.step()         # optimizer.step() runs the optimizer and applies the gradients to the network parameters
Error: ValueError: Shapes (6, 1) and (6, 20) are incompatible
Analysis: the loss function does not match the label format. Here the labels are integers of shape (6, 1) while the model outputs (6, 20) class probabilities, so a loss that expects one-hot targets fails.
If y is one-hot encoded, use categorical_crossentropy:
[1,0,0]
[0,1,0]
[0,0,1]
If y is integer labels (not one-hot encoded), use sparse_categorical_crossentropy:
1
2
3
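A minimal, self-contained sketch of the two cases (the tiny Dense model and the random data are placeholders for illustration; only the loss/label pairing matters):
import numpy as np
from tensorflow import keras

num_classes = 20
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(num_classes, activation="softmax"),
])
x = np.random.rand(6, 8).astype("float32")
int_labels = np.array([3, 7, 0, 12, 5, 19])                           # shape (6,): integer labels
onehot_labels = keras.utils.to_categorical(int_labels, num_classes)   # shape (6, 20): one-hot labels

# Integer labels -> sparse_categorical_crossentropy
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x, int_labels, epochs=1, verbose=0)

# One-hot labels -> categorical_crossentropy
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x, onehot_labels, epochs=1, verbose=0)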
Error:
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node SegNet/block1_conv2/Relu (defined at media/ac/ubuntu train/Semantic-Segmentation-main/train.py:337) ]]
[[confusion_matrix/assert_less_1/Assert/AssertGuard/pivot_f/_31/_77]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node SegNet/block1_conv2/Relu (defined at media/ac/ubuntu train/Semantic-Segmentation-main/train.py:337) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_6169]
Cause analysis: the GPU memory is probably too small or already fully occupied, so cuDNN cannot initialize.
Fix: allocate GPU memory on demand.
The difference from earlier versions is that in TF 2.x the session/config API lives under tensorflow.compat.v1:
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
sess = tf.compat.v1.Session(config=config)
Example of allocating GPU memory on demand:
import tensorflow as tf
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True      # grow GPU memory as needed instead of reserving it all
session = InteractiveSession(config=config)
# Or with an explicit session:
# with tf.compat.v1.Session(config=config) as sess:
#     ...
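In TF 2.x the same on-demand allocation can also be enabled without the compat session, via tf.config; a minimal sketch:
import tensorflow as tf

# Must run before any op allocates GPU memory
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)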
from tensorflow import keras

# Input images are 224x224x3, 20 classes
shape, classes = (224, 224, 3), 20
# Use Keras' built-in ResNet50 model
model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# Train the model (sparse_categorical_crossentropy because train_y holds integer labels; use categorical_crossentropy for one-hot labels)
# training = model.fit(train_x, train_y, epochs=50, batch_size=10)
model.fit(train_x, train_y, validation_data=(test_x, test_y), epochs=20, batch_size=6, verbose=2)
# Save the trained model to a file
model.save('resnet_model_dog_n_face.h5')
Expected object of scalar type Long but got scalar type Float for argument
To avoid this error, cast the tensor explicitly to the dtype the operation expects:
x = torch.Tensor(x.numpy()).float().to(device)   # explicit .float() cast
rather than
x = torch.Tensor(x.numpy()).to(device)
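This error usually means the operation expects integer (Long) class indices, e.g. nn.CrossEntropyLoss requires Long targets; a minimal sketch (shapes and names are illustrative):
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()                 # expects Long (int64) class indices as targets
logits = torch.randn(6, 20)                     # (batch, num_classes), float32
labels = torch.tensor([3.0, 7.0, 0.0, 12.0, 5.0, 19.0])  # targets accidentally stored as float

loss = loss_fn(logits, labels.long())           # cast the targets to Long before the loss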
If the model will not run on the GPU, a few things to check. First: every tensor created inside the model must be placed on the GPU with .to(device).
Second: the model, the loss function, etc. must also be on the GPU:
model = LSTM().to(device)
loss_function = nn.MSELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # create the optimizer instance
print(model)
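The input and target batches must be moved to the same device as well; a minimal training-step sketch (train_loader, device, and a single-argument model forward are assumptions for illustration):
for train_x, train_y in train_loader:
    train_x = train_x.to(device)          # move inputs to the same device as the model
    train_y = train_y.to(device).float()  # move targets too, and match the loss dtype

    optimizer.zero_grad()
    y_pred = model(train_x)
    single_loss = loss_function(y_pred, train_y)
    single_loss.backward()
    optimizer.step()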
PyCharm error: Process finished with exit code -1073741819 (0xC0000005), and how to fix it
Inside the model, do not repeatedly wrap tensors with torch.Tensor(arr) or repeatedly call .to(device).
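A minimal sketch of what to avoid inside forward() (the module and shapes are illustrative):
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 1)

    def forward(self, x):
        # Avoid re-wrapping the input on every call, e.g.
        #   x = torch.Tensor(x.cpu().numpy()).to(device)   # extra copy + CPU/GPU round trip
        # If a new tensor is needed here, create it directly on x's device instead:
        mask = torch.ones(x.shape[0], 1, device=x.device)
        return self.fc(x) * mask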
Errors when saving and loading the model: