Mining scrap P104-100 modded to 8GB: re-employed for machine learning
2020-04-20 20:30:00 · 17 likes · 55 favorites · 28 comments
Why I bought it
I recently started tinkering with TensorFlow. Training on the CPU took far too long, and the only card I had was an AMD one, which is a hassle to set up for this, so I decided to buy an NVIDIA card. I looked at compute cards like the K40 and K80 and checked NVIDIA's GPU Compute Capability list, but in the end, true to my scavenger instincts, I picked up a P104-100 mining card on Xianyu (billed as the mining version of the 1070, Compute Capability 6.1). The card ships with 4GB of VRAM, but a BIOS flash unlocks 8GB, and at 690 CNY the price felt acceptable. I ordered a Gigabyte version with a fan, but what arrived was an MSI low-hashrate card; the shady seller got me. Not wanting the hassle of a return, I tearfully kept it.
Appearance
Detailed P104 specs:
GPU-Z confirms that after flashing the BIOS the card does report 8GB of VRAM. This P104 can reportedly also be modded to run games, like the P106; I bought it for machine learning, so I didn't try, and since these mining cards are limited to PCIe x1, bandwidth would probably be a problem for gaming anyway. The card BIOS is attached for fellow deal hunters who need it (extraction code: 8wx6). Flash at your own risk.
Installing the tensorflow-gpu environment
I installed tensorflow-gpu through Anaconda; here is a quick walkthrough of the steps.
Download and install Anaconda; during installation, check "Add Anaconda to my PATH environment variable".
Open cmd and enter the following command:
conda create -n tensorflow pip python=3.7
(answer y to any y/n prompts)
Then activate the environment:
activate tensorflow
To use a domestic (China) mirror, install with pip via the following command:
pip install --default-timeout=100 --ignore-installed --upgrade tensorflow-gpu==2.0.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
Download and install CUDA 10.0 plus cuDNN. Unzip cuDNN and copy the files from its three folders into the matching folders under the CUDA install directory. (If you install CUDA 10.1 instead, some files need to be renamed.)
Then, under "This PC - Manage - Advanced system settings - Environment Variables", find Path and add the following entries (assuming CUDA's default install path):
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\include
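To confirm the PATH entries took effect (my addition, not part of the original steps), open a new cmd window and run nvcc --version; it should report Cuda compilation tools, release 10.0.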
Verifying the installation
Open cmd and enter:
activate tensorflow
Then run:
python
and at the Python prompt enter:
import tensorflow
If no exception is thrown, the installation succeeded.
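A slightly stronger check than a bare import (my addition, not part of the original steps) is to ask TensorFlow whether it can actually see the GPU:

import tensorflow as tf
print(tf.__version__)
# Should print at least one GPU entry if CUDA and cuDNN are wired up correctly
print(tf.config.experimental.list_physical_devices('GPU'))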
Performance testing
My machine has no integrated graphics, so I normally need another card plugged in besides the P104 anyway. I therefore also bought a 2070 SUPER (NVIDIA's site lists its compute capability as 7.5). Since I bought it, let's pit it against the compute-capability-6.1 mining card P104!
Detailed 2070 SUPER specs:
Now for the head-to-head tests.
Common environment: Windows 10, CUDA 10.0, tensorflow-gpu 2.1, Anaconda3-2020.02-Windows, Python 3.7.7
1. First, the "Hello World" example from the TensorFlow website.
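For reference, the script behind the logs below appears to be the TF2 beginner quickstart; the post only shows the output, so this is my reconstruction of it:

import tensorflow as tf

# Load and normalize MNIST
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# The quickstart's small dense network
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# "Train on 60000 samples" for 5 epochs, as in the logs below
model.fit(x_train, y_train, epochs=5)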
2070 SUPER
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6283 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 SUPER, pci bus id: 0000:65:00.0, compute capability: 7.5)
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 7s 117us/sample - loss: 0.2996 - accuracy: 0.9123
Epoch 2/5
60000/60000 [==============================] - 6s 99us/sample - loss: 0.1448 - accuracy: 0.9569
Epoch 3/5
60000/60000 [==============================] - 5s 85us/sample - loss: 0.1068 - accuracy: 0.9682
Epoch 4/5
60000/60000 [==============================] - 6s 101us/sample - loss: 0.0867 - accuracy: 0.9727
Epoch 5/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0731 - accuracy: 0.9766
P104-100
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7482 MB memory) -> physical GPU (device: 0, name: P104-100, pci bus id: 0000:07:00.0, compute capability: 6.1)
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 4s 68us/sample - loss: 0.2957 - accuracy: 0.9143
Epoch 2/5
60000/60000 [==============================] - 3s 56us/sample - loss: 0.1445 - accuracy: 0.9569
Epoch 3/5
60000/60000 [==============================] - 3s 58us/sample - loss: 0.1087 - accuracy: 0.9668
Epoch 4/5
60000/60000 [==============================] - 3s 57us/sample - loss: 0.0898 - accuracy: 0.9720
Epoch 5/5
60000/60000 [==============================] - 3s 58us/sample - loss: 0.0751 - accuracy: 0.9764
At runtime the P104 gets 7482 MB of memory and the 2070 SUPER only 6283 MB, even though both are 8GB cards; presumably the 2070 SUPER also has to drive the display, so some VRAM is held back for that.
The first comparison made me want to cry. The P104 settled at about 3s per epoch (58us/sample), while the 2070 SUPER needed 5-7s; the mining card was faster by roughly 2s per epoch. My 2070 SUPER, bought for nothing. I want to return it.
2. Next, the 1D CNN for text classification from the official Keras examples.
The documentation gives reference timings for this test:
90s/epoch on Intel i5 2.4GHz CPU.
10s/epoch on Tesla K40 GPU.
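For readers who want to reproduce this, the example is roughly the following (my condensed version of the official Keras imdb_cnn script; note the original trains for 2 epochs, while the logs below show 5):

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Dropout, Conv1D,
                                     GlobalMaxPooling1D, Dense)

max_features = 5000   # vocabulary size
maxlen = 400          # cut reviews off after this many words

# Load the IMDB sentiment dataset and pad reviews to a fixed length
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)

model = Sequential([
    Embedding(max_features, 50, input_length=maxlen),
    Dropout(0.2),
    Conv1D(250, 3, padding='valid', activation='relu', strides=1),
    GlobalMaxPooling1D(),
    Dense(250, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5,
          validation_data=(x_test, y_test))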
2070 SUPER
Epoch 1/5
25000/25000 [==============================] - 10s 418us/step - loss: 0.4080 - accuracy: 0.7949 - val_loss: 0.3058 - val_accuracy: 0.8718
Epoch 2/5
25000/25000 [==============================] - 8s 338us/step - loss: 0.2318 - accuracy: 0.9061 - val_loss: 0.2809 - val_accuracy: 0.8816
Epoch 3/5
25000/25000 [==============================] - 9s 349us/step - loss: 0.1663 - accuracy: 0.9359 - val_loss: 0.2596 - val_accuracy: 0.8936
Epoch 4/5
25000/25000 [==============================] - 9s 341us/step - loss: 0.1094 - accuracy: 0.9607 - val_loss: 0.3009 - val_accuracy: 0.8897
Epoch 5/5
25000/25000 [==============================] - 9s 341us/step - loss: 0.0752 - accuracy: 0.9736 - val_loss: 0.3365 - val_accuracy: 0.8871
P104-100
Epoch 1/5
25000/25000 [==============================] - 8s 338us/step - loss: 0.4059 - accuracy: 0.7972 - val_loss: 0.2898 - val_accuracy: 0.8772
Epoch 2/5
25000/25000 [==============================] - 7s 285us/step - loss: 0.2372 - accuracy: 0.9038 - val_loss: 0.2625 - val_accuracy: 0.8896
Epoch 3/5
25000/25000 [==============================] - 7s 286us/step - loss: 0.1665 - accuracy: 0.9357 - val_loss: 0.3274 - val_accuracy: 0.8701
Epoch 4/5
25000/25000 [==============================] - 7s 286us/step - loss: 0.1142 - accuracy: 0.9591 - val_loss: 0.3090 - val_accuracy: 0.8854
Epoch 5/5
25000/25000 [==============================] - 7s 286us/step - loss: 0.0728 - accuracy: 0.9747 - val_loss: 0.3560 - val_accuracy: 0.8843
The mining card P104 is still the fastest, and both cards beat the Tesla K40 reference time. (Presumably a model this small doesn't come close to saturating either GPU.)
3. Finally, the "Train an Auxiliary Classifier GAN (ACGAN) on the MNIST dataset" example.
The example page lists the time per epoch on reference hardware:
Hardware          | Backend     | Time/Epoch
CPU               | TF          | 3 hrs
Titan X (Maxwell) | TF          | 4 min
Titan X (Maxwell) | TH (Theano) | 7 min
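For orientation, the two loss columns in the logs below (generation_loss and auxiliary_loss) come from the ACGAN discriminator having two output heads: one judging real vs. fake, one classifying the digit. A minimal sketch of that idea (my simplified illustration, not the example's actual architecture):

from tensorflow.keras import layers, models

num_classes = 10

# Discriminator: image in, two heads out
img = layers.Input(shape=(28, 28, 1))
x = layers.Conv2D(32, 3, strides=2, padding='same')(img)
x = layers.LeakyReLU(0.2)(x)
x = layers.Conv2D(64, 3, strides=2, padding='same')(x)
x = layers.LeakyReLU(0.2)(x)
x = layers.Flatten()(x)
validity = layers.Dense(1, activation='sigmoid', name='generation')(x)        # real or fake
label = layers.Dense(num_classes, activation='softmax', name='auxiliary')(x)  # which digit

discriminator = models.Model(img, [validity, label])
discriminator.compile(optimizer='adam',
                      loss=['binary_crossentropy', 'sparse_categorical_crossentropy'])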
I ran 5 epochs; the results are as follows:
2070 SUPER
Epoch 1/5
600/600 [==============================] - 45s 75ms/step
Testing for epoch 1:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.76 | 0.4153          | 0.3464
generator (test)      | 1.16 | 1.0505          | 0.1067
discriminator (train) | 0.68 | 0.2566          | 0.4189
discriminator (test)  | 0.74 | 0.5961          | 0.1414
Epoch 2/5
600/600 [==============================] - 37s 62ms/step
Testing for epoch 2:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 1.05 | 0.9965          | 0.0501
generator (test)      | 0.73 | 0.7147          | 0.0117
discriminator (train) | 0.85 | 0.6851          | 0.1644
discriminator (test)  | 0.75 | 0.6933          | 0.0553
Epoch 3/5
600/600 [==============================] - 38s 64ms/step
Testing for epoch 3:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.84 | 0.8246          | 0.0174
generator (test)      | 0.67 | 0.6645          | 0.0030
discriminator (train) | 0.82 | 0.7042          | 0.1158
discriminator (test)  | 0.77 | 0.7279          | 0.0374
Epoch 4/5
600/600 [==============================] - 38s 63ms/step
Testing for epoch 4:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.81 | 0.7989          | 0.0107
generator (test)      | 0.66 | 0.6604          | 0.0026
discriminator (train) | 0.80 | 0.7068          | 0.0938
discriminator (test)  | 0.74 | 0.7047          | 0.0303
Epoch 5/5
600/600 [==============================] - 38s 64ms/step
Testing for epoch 5:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.80 | 0.7890          | 0.0083
generator (test)      | 0.64 | 0.6388          | 0.0021
discriminator (train) | 0.79 | 0.7049          | 0.0807
discriminator (test)  | 0.73 | 0.7056          | 0.0266
P104-100
Epoch 1/5
600/600 [==============================] - 63s 105ms/step
Testing for epoch 1:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.79 | 0.4320          | 0.3590
generator (test)      | 0.88 | 0.8000          | 0.0802
discriminator (train) | 0.68 | 0.2604          | 0.4182
discriminator (test)  | 0.72 | 0.5822          | 0.1380
Epoch 2/5
600/600 [==============================] - 59s 98ms/step
Testing for epoch 2:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 1.02 | 0.9747          | 0.0450
generator (test)      | 0.79 | 0.7753          | 0.0165
discriminator (train) | 0.85 | 0.6859          | 0.1629
discriminator (test)  | 0.77 | 0.7168          | 0.0576
Epoch 3/5
600/600 [==============================] - 59s 98ms/step
Testing for epoch 3:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.84 | 0.8263          | 0.0170
generator (test)      | 0.64 | 0.6360          | 0.0042
discriminator (train) | 0.82 | 0.7062          | 0.1157
discriminator (test)  | 0.77 | 0.7353          | 0.0384
Epoch 4/5
600/600 [==============================] - 58s 97ms/step
Testing for epoch 4:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.82 | 0.8036          | 0.0115
generator (test)      | 0.69 | 0.6850          | 0.0019
discriminator (train) | 0.80 | 0.7054          | 0.0933
discriminator (test)  | 0.75 | 0.7165          | 0.0301
Epoch 5/5
600/600 [==============================] - 58s 97ms/step
Testing for epoch 5:
component             | loss | generation_loss | auxiliary_loss
-----------------------------------------------------------------
generator (train)     | 0.80 | 0.7904          | 0.0087
generator (test)      | 0.64 | 0.6400          | 0.0028
discriminator (train) | 0.79 | 0.7046          | 0.0806
discriminator (test)  | 0.74 | 0.7152          | 0.0272
This time the 2070 SUPER finally took less time per epoch than the P104-100. I guess the new card doesn't need to go back after all.
Summary
For personal study and experimentation, the 8GB-modded P104-100 mining card is good value for money and worth buying.