Found dtype Double but expected Float

The two values passed to the loss function must have the same dtype.

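y_pred = model(train_x, batch_size)
# During training, the forward pass produces the network output, and the loss between the output and the target is computed.
# The two values passed to the loss function must have the same dtype.
y_pred = y_pred.cpu().float()
train_y = train_y.float()
single_loss = loss_function(y_pred, train_y)
single_loss.backward()  # backward() computes the gradients automatically
optimizer.step()  # optimizer.step() updates the network parameters using those gradients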
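One common way to end up with a Double target is building it from a NumPy array, since NumPy defaults to float64. The following is a minimal sketch of that situation and the cast that resolves it; the tensors here are made-up stand-ins for `y_pred` and `train_y`, not the data from the snippet above.

```python
import numpy as np
import torch
import torch.nn as nn

# NumPy creates float64 arrays by default, so a tensor built from one is Double.
train_y = torch.from_numpy(np.array([[1.0], [2.0], [3.0]]))
# Stand-in for a model output: float32, which is the PyTorch default.
y_pred = torch.zeros(3, 1, requires_grad=True)

print(train_y.dtype, y_pred.dtype)

# Cast the target to float32 so both loss arguments share one dtype.
loss_function = nn.MSELoss()
single_loss = loss_function(y_pred, train_y.float())
single_loss.backward()
```

Casting the target once when the data is loaded, rather than inside the training loop, avoids repeating the conversion every step.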

Error: ValueError: Shapes (6, 1) and (6, 20) are incompatible

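Analysis:

This error appears when integer labels of shape (6, 1) are passed to categorical_crossentropy while the model outputs 20 classes.

If y is one-hot encoded, use categorical_crossentropy:

[1,0,0]
[0,1,0]
[0,0,1]

If y holds integer class labels (not one-hot encoded), use sparse_categorical_crossentropy:

1
2
3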
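The shape pair in the error above, (6, 1) against (6, 20), is what integer labels look like next to a 20-class output. Below is a minimal NumPy sketch of the two label formats and how to convert between them; the label values are invented for illustration.

```python
import numpy as np

num_classes = 20
# Integer class labels with shape (6, 1), one index per sample.
y_int = np.array([[3], [0], [19], [7], [7], [1]])

# One-hot version with shape (6, 20): row i has a 1 in column y_int[i].
y_onehot = np.eye(num_classes, dtype=np.float32)[y_int.ravel()]

print(y_int.shape)     # (6, 1)
print(y_onehot.shape)  # (6, 20)

# Converting back from one-hot rows to integer labels.
y_back = y_onehot.argmax(axis=1).reshape(-1, 1)
```

Either format works as long as the loss function matches it, so it is usually simpler to switch the loss than to convert the labels.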

Error: errors_impl.UnknownError: 2 root error(s) found.

tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.

  (0) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

     [[node SegNet/block1_conv2/Relu (defined at media/ac/ubuntu train/Semantic-Segmentation-main/train.py:337) ]]

     [[confusion_matrix/assert_less_1/Assert/AssertGuard/pivot_f/_31/_77]]

  (1) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

     [[node SegNet/block1_conv2/Relu (defined at media/ac/ubuntu train/Semantic-Segmentation-main/train.py:337) ]]

0 successful operations.

0 derived errors ignored. [Op:__inference_train_function_6169]

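Likely cause: the GPU may not have enough free memory.

Fix: allocate GPU memory on demand.

The difference between TF 2.x and earlier versions is that in TF 2.x these APIs are accessed through tensorflow.compat.v1:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
sess = tf.compat.v1.Session(config=config)

Example of on-demand GPU memory allocation: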

from tensorflow import keras
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

# Let TensorFlow grow GPU memory as needed instead of reserving all of it up front
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

# Input images are 224x224 with 3 channels; 20 classes
shape, classes = (224, 224, 3), 20
# Build the Keras ResNet50 model
model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Train the model (loss options: categorical_crossentropy, sparse_categorical_crossentropy)
# train_x, train_y, test_x, test_y are assumed to have been loaded earlier
# training = model.fit(train_x, train_y, epochs=50, batch_size=10)
model.fit(train_x, train_y, validation_data=(test_x, test_y), epochs=20, batch_size=6, verbose=2)
# Save the trained model to a file
model.save('resnet_model_dog_n_face.h5')
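As an alternative to the compat.v1 session, TF 2.x also exposes memory growth through `tf.config`. This is a minimal sketch, assuming a TF 2.x install where `tf.config.list_physical_devices` is available. It is not taken from the original post, and it has to run before any GPU has been initialized.

```python
import tensorflow as tf

# Ask TensorFlow to grow GPU memory as needed instead of reserving it all.
# Call this at program start, before any op has touched the GPU.
gpus = tf.config.list_physical_devices("GPU")
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

print(f"memory growth enabled on {len(gpus)} GPU(s)")
```

On a machine without a GPU the loop simply does nothing, so the same script runs unchanged on CPU.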

Expected object of scalar type Long but got scalar type Float for argument 

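This avoids the error: x = torch.Tensor(x.numpy()).float().to(device)  # uses .numpy() and .float()

By default torch.Tensor(...) already returns a float32 tensor, so the shorter form is equivalent: x = torch.Tensor(x.numpy()).to(device)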

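Reasons a model may fail to run on the GPU:

First: every variable created inside the model must be a tensor placed on the GPU with .to(device).

Second: the model, the loss function, and so on must also be on the GPU.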

model = LSTM().to(device)
loss_function = nn.MSELoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # create the optimizer instance
print(model)
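The two points above can be sketched together as follows. `TinyNet` is a hypothetical stand-in for the LSTM model, and the sketch falls back to the CPU when no GPU is present. The key detail is that tensors created inside `forward` take their device from the input rather than defaulting to the CPU.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        # Create internal tensors on the same device as the input.
        offset = torch.ones(x.size(0), 1, device=x.device)
        return self.linear(x) + offset

model = TinyNet().to(device)             # parameters moved to the device
loss_function = nn.MSELoss().to(device)  # loss module moved as well
x = torch.randn(8, 4).to(device)         # input data moved to the device
target = torch.zeros(8, 1, device=device)

out = model(x)
loss = loss_function(out, target)
```

Passing `device=x.device` at creation time also avoids an extra CPU-to-GPU copy for each internal tensor.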

PyCharm error: Process finished with exit code -1073741819 (0xC0000005), and how to fix it

Inside the model, do not repeatedly apply .to(device) and torch.Tensor(arr) to the same tensors.

Errors when saving and loading the model: