Solving data type conversion problems between numpy and torch

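In real computations the float type is used most often, so this post focuses on the problems that come up when converting numpy and torch data to float. Other types work the same way.

numpy data type conversion

numpy converts data types with astype. Converting to float gives 64 bits by default; use np.float32 to request 32 bits.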

    # Convert a numpy array to float
    import numpy as np

    a = np.array([1, 2, 3])
    a = a.astype(np.float64)
    print(a)
    print(a.dtype)

[1. 2. 3.]

float64

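Do not set the type by assigning to a.dtype. That reinterprets the existing bytes as the new type instead of converting the values, so the data is lost.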

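    # Assigning to dtype reinterprets the raw bytes; it does not convert values
    import numpy as np

    b = np.array([1, 2, 3])  # int32 on the platform used for this output
    b.dtype = np.float32
    print(b)
    print(b.dtype)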

[1.e-45 3.e-45 4.e-45]

float32
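To make the difference concrete, here is a minimal sketch that contrasts astype with a byte-level reinterpretation. It assumes that assigning to dtype behaves like ndarray.view on the same buffer, and it fixes the input to int32 so the result does not depend on the platform. The names converted and reinterpreted are only illustrative.

```python
import numpy as np

# Explicit int32 so the byte layout is the same on every platform.
b = np.array([1, 2, 3], dtype=np.int32)

converted = b.astype(np.float32)    # converts each value: 1 -> 1.0
reinterpreted = b.view(np.float32)  # reads the same bytes as float32

print(converted)       # the values 1.0, 2.0, 3.0
print(reinterpreted)   # tiny denormal numbers, as in the output above

# Viewing the bytes as int32 again recovers the original integers,
# which shows that no value conversion ever took place.
print(reinterpreted.view(np.int32))
```

The values are not destroyed at the byte level, but they are meaningless as floats, which is why astype is the right tool for conversion.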

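Write np.float64 or np.float32 rather than np.float. np.float was only an alias for Python's built-in float; it was deprecated in NumPy 1.20 and removed in NumPy 1.24, so code that uses it now raises an error.

Converting from np.float64 to np.float32 with astype does not raise an error, but the values are rounded to 32-bit precision.

Multiplying np.float64 by np.float32 gives np.float64.

In practice you can pass the built-in float, which means 64-bit, or give the width explicitly, such as np.float32. Giving the width explicitly is clearer.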
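The points above can be checked with a short sketch. The array names x64 and x32 are made up for illustration.

```python
import numpy as np

x64 = np.array([0.1, 0.2, 0.3], dtype=np.float64)
x32 = x64.astype(np.float32)   # allowed; values are rounded to float32

product = x64 * x32            # mixed widths are promoted to the wider type

print(x32.dtype)       # float32
print(product.dtype)   # float64
# 0.1 is rounded differently in 32 and 64 bits, so the two values differ.
print(x64[0] == x32[0])
```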

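torch data type conversion

torch converts data types with tensor methods. Tensor.float() converts to 32-bit float. There is no float64() method; use Tensor.double() or Tensor.to(torch.float64) to get 64 bits.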

    # Convert a torch tensor to float
    import torch

    b = torch.tensor([4, 5, 6])
    b = b.float()
    print(b.dtype)

torch.float32
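For completeness, a small sketch of the common float conversions on a tensor. The variable names are illustrative only.

```python
import torch

t = torch.tensor([4, 5, 6])      # integer input gives torch.int64

t32 = t.float()                  # shorthand for t.to(torch.float32)
t64 = t.double()                 # shorthand for t.to(torch.float64)
t64_alt = t.to(torch.float64)    # explicit form, works for any dtype

print(t.dtype, t32.dtype, t64.dtype, t64_alt.dtype)
```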

An np.float64 array converted with torch.from_numpy is still 64-bit in torch.

    # a is the float64 array from the numpy example above
    print(a.dtype)
    c = torch.from_numpy(a)
    print(c.dtype)

float64

torch.float64
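If a 32-bit tensor is wanted, the cast can be done on either side of the conversion. The sketch below shows both options; the variable names are assumptions for illustration.

```python
import numpy as np
import torch

a = np.array([1, 2, 3], dtype=np.float64)

c64 = torch.from_numpy(a)                             # stays torch.float64
c32_after = torch.from_numpy(a).float()               # cast on the torch side
c32_before = torch.from_numpy(a.astype(np.float32))   # cast on the numpy side

print(c64.dtype, c32_after.dtype, c32_before.dtype)
```

Both casts copy the data, so the resulting float32 tensors no longer share memory with a.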

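Do not use Python's float in place of torch.float, or you may get unexpected errors.

Multiplying torch.float32 data with torch.float64 data can fail. Matrix multiplication such as torch.matmul raises a RuntimeError when the dtypes differ, although recent PyTorch versions promote element-wise multiplication to torch.float64. When multiplying, make sure the float types are specified or converted so that they match.

numpy and torch handle type conversion in broadly the same way. The main difference shows up in multiplication: numpy multiplies mismatched float types and returns np.float64, whereas torch matrix multiplication requires matching float dtypes.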
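The following sketch illustrates the multiplication behaviour, assuming a reasonably recent PyTorch release with type promotion (older releases may also reject the element-wise product). The names x32, x64 and matmul_failed are illustrative.

```python
import torch

x32 = torch.ones(2, 2, dtype=torch.float32)
x64 = torch.ones(2, 2, dtype=torch.float64)

# Element-wise multiplication: promoted to the wider type.
elementwise = x32 * x64
print(elementwise.dtype)

# Matrix multiplication: mixed dtypes are rejected.
try:
    torch.matmul(x32, x64)
    matmul_failed = False
except RuntimeError as err:
    matmul_failed = True
    print("matmul failed:", err)

# Casting one operand so the dtypes agree fixes the problem.
result = torch.matmul(x32, x64.float())
print(result.dtype)
```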

Converting between numpy and tensor

Tensor to numpy

    import torch

    b = torch.tensor([4, 6])  # integer input, so the default integer type is used
    # b = b.float()
    print(b.dtype)
    c = b.numpy()
    print(c.dtype)

torch.int64

int64

numpy to tensor

    import torch
    import numpy as np

    b = np.array([1, 2, 3])
    # b = b.astype(np.float64)
    print(b.dtype)
    c = torch.from_numpy(b)
    print(c.dtype)

int32

torch.int32

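As shown, torch's default integer type is 64-bit, while numpy's default integer type was 32-bit in this run. numpy's default integer follows the platform: it is int32 on Windows with NumPy versions before 2.0, and int64 on Linux and macOS.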
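Because the default integer width can differ between platforms, one way to get the same result everywhere is to state the dtype explicitly. This is a small sketch with illustrative variable names.

```python
import numpy as np
import torch

# Explicit dtype: the result no longer depends on the platform default.
b = np.array([1, 2, 3], dtype=np.int64)
c = torch.from_numpy(b)

# Going the other way, numpy() keeps the tensor's dtype.
d = torch.tensor([1, 2, 3], dtype=torch.int32).numpy()

print(b.dtype, c.dtype, d.dtype)
```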

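Addendum: torch.from_numpy vs torch.Tensor

While building a dataset recently, I noticed that when converting input images to tensors I could either cast a numpy array to a tensor directly with torch.Tensor, or convert it with torch.from_numpy. So what is the actual difference between the two? If torch.Tensor can do the job, isn't torch.from_numpy redundant?

Answer

There is a difference. torch.from_numpy is safer; torch.Tensor produces results that differ from what you would expect when the source is not a float type.

Explanation

In fact the two differ a great deal. To use a not entirely accurate analogy, torch.Tensor is like a C-style (int) cast, while torch.from_numpy is like C++'s static_cast. As we all know, forcing an int64 into an int32, or any wider type into a narrower one, risks discarding the high bits: precision can be lost and the sign can even flip.

torch.Tensor and torch.from_numpy have the same kind of problem.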
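The narrowing hazard mentioned in the analogy can be reproduced in numpy itself. This is only an illustration of integer truncation, not of what torch.Tensor does internally; the names wide and narrow are made up.

```python
import numpy as np

# Two values that do not fit into a signed 32-bit integer.
wide = np.array([2**31, 2**32 + 5], dtype=np.int64)

# astype keeps only the low 32 bits of each value.
narrow = wide.astype(np.int32)

print(wide)
print(narrow)   # the first value turns negative, the second loses its high bit
```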

The documentation for torch.Tensor states clearly:

torch.Tensor is an alias for the default tensor type (torch.FloatTensor).

The documentation for torch.from_numpy, on the other hand, says:

The returned tensor and ndarray share the same memory. Modifications to the tensor will be reflected in the ndarray and vice versa. The returned tensor is not resizable.

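In other words:

1. When the source is a float type (float32 in the example below), torch.Tensor and torch.from_numpy share the same memory, and the resulting type is torch.float32.

2. When the source is not a float type, torch.Tensor produces torch.float32, while torch.from_numpy keeps the source type.

Surprising, isn't it? Here is a simple example: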

    import torch
    import numpy as np

    s1 = np.arange(10, dtype=np.float32)
    s2 = np.arange(10)  # default dtype is int64 on this platform

    # Example 1: float32 source
    o11 = torch.Tensor(s1)
    o12 = torch.from_numpy(s1)
    o11.dtype  # torch.float32
    o12.dtype  # torch.float32
    # Modify a value through o11 and read it back through o12
    o11[0] = 12
    o12[0]  # tensor(12.)

    # Example 2: int64 source
    o21 = torch.Tensor(s2)
    o22 = torch.from_numpy(s2)
    o21.dtype  # torch.float32
    o22.dtype  # torch.int64
    # Modify a value through o21 and read it back through o22
    o21[0] = 12
    o22[0]  # tensor(0)
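As a complement to the example above, the sketch below also includes torch.tensor (lowercase), which keeps the source dtype but copies the data. The names shared, copied and legacy are illustrative, and the source dtype is fixed to int64 so the result does not depend on the platform.

```python
import numpy as np
import torch

src = np.arange(5, dtype=np.int64)

shared = torch.from_numpy(src)   # keeps int64, shares memory with src
copied = torch.tensor(src)       # keeps int64, but copies the data
legacy = torch.Tensor(src)       # converted to the default float32

# Change the numpy array and see which tensors follow.
src[0] = 100

print(shared[0].item())   # 100: same memory as src
print(copied[0].item())   # 0: independent copy
print(legacy.dtype)       # torch.float32
```

If sharing memory is not wanted, torch.tensor is a reasonable choice; if it is, torch.from_numpy makes the sharing explicit.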