[LLM Training] It turns out CUDA 12 isn't something ordinary people can use...

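Recently I came across some LLA composite large language models that I wanted to try out.

It turned out they even require CUDA 12.

So I searched for articles on the subject: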

https://pixinsight.com/forum/index.php?threads/cuda-v11-8-vs-cuda-v12-2.21431/

The key point is this sentence:

Tensorflow stopped native support for Windows with a version that was built on CUDA 11.2, even the latest version for linux/macos/wsl was built on CUDA 11.8.

In other words, as long as I develop on Windows I am stuck with this limitation.

But I really can only develop on my company's Windows machine...
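To see which CUDA version the installed frameworks were actually built against, something like the following could be used. This is only a minimal sketch: `framework_cuda_builds` is a helper name I made up, and it assumes PyTorch and/or TensorFlow may or may not be installed on the machine.

```python
def framework_cuda_builds():
    """Report the CUDA version each installed framework was compiled against."""
    builds = {}

    try:
        import torch
        # torch.version.cuda is a string such as "11.8", or None on CPU-only builds.
        builds["torch"] = torch.version.cuda
    except ImportError:
        builds["torch"] = "not installed"

    try:
        import tensorflow as tf
        # get_build_info() returns a dict; CPU-only builds may lack "cuda_version".
        builds["tensorflow"] = tf.sysconfig.get_build_info().get("cuda_version")
    except ImportError:
        builds["tensorflow"] = "not installed"

    return builds


if __name__ == "__main__":
    for name, cuda in framework_cuda_builds().items():
        print(f"{name}: built with CUDA {cuda}")
```

A value of `None` here most likely means a CPU-only build, which is what I would expect from a recent native Windows install of TensorFlow.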

Further up in the same thread there is also this:

Nvidia recommends staying on 11.8 anyway. Unless you have the $30K H100 GPU of course

So the more broadly usable version is still CUDA 11.

For example, the project I am currently trying out uses flash-attention, so I can't run it:

https://github.com/Dao-AILab/flash-attention

Requirements: H100 / H800 GPU, CUDA >= 12.3.
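A requirement like this can be checked against the local toolkit before installing anything. Below is a rough sketch that parses the output of `nvcc --version` and compares it with the quoted minimum of 12.3. The helper names (`parse_nvcc_release`, `installed_cuda_toolkit`, `meets_minimum`) are my own, and it assumes `nvcc` is on the PATH when a toolkit is installed.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_toolkit():
    """Return the local toolkit version via nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


def meets_minimum(version, minimum=(12, 3)):
    """True only if a version was found and it is at least `minimum`."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    version = installed_cuda_toolkit()
    print("CUDA toolkit:", version, "| meets >= 12.3:", meets_minimum(version))
```

This only covers the toolkit version. The GPU half of the requirement (H100 / H800) would need a separate check.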

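It's not really a question of money.
It's that there is simply no need for it.

I don't necessarily have to train my own language model, after all.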

2024/10/25
Programming

Zoearth Joomla Site
A-Ren from Taichung's Joomla site
