fix exponential

John 2024-03-17 12:01:30 +01:00
parent 1a5e9f0a4e
commit d1d5d4d6ae
1 changed file with 5 additions and 4 deletions

@@ -250,12 +250,13 @@ Model capability with size has driven the development of larger and larger model
## Computational challenges of training LLMs
Most common issue: OutOfMemoryError: CUDA out of memory.
CUDA = Compute Unified Device Architecture
Weights:
- 1 parameter = 4 bytes (32-bit float)
- - 1B parameters = 4 x 10^9 bytes = 4GB
+ - 1B parameters = 4 x $10^9$ bytes = 4GB
In general: GPU memory needed to train 1B parameters is 6 times the model size = 24GB
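A quick back-of-the-envelope check of this rule (a minimal sketch; the 6x factor covering gradients, Adam optimizer states, and activations is the rule of thumb above, and the helper name is hypothetical):

```python
# Minimal sketch of the memory rule of thumb above.
# Assumption: FP32 weights (4 bytes/param) and ~6x overhead for
# gradients, Adam optimizer states, and activations during training.
BYTES_PER_PARAM_FP32 = 4
TRAINING_OVERHEAD = 6

def training_memory_gb(n_params: float) -> float:
    """Approximate GPU memory (GB) to train a model with n_params parameters."""
    weights_gb = n_params * BYTES_PER_PARAM_FP32 / 1e9
    return TRAINING_OVERHEAD * weights_gb

print(training_memory_gb(1e9))  # 1B parameters -> 24.0 GB
```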
@@ -265,13 +266,13 @@ In general: GPU memory needed to train 1B parameters is 6 times the model size = 24
The downside is that BF16 is not well suited for integer calculations, but these are relatively rare in deep learning.
- ![quantization Summary](images/quantizationSummary.png.png)
+ ![Quantization Summary](images/quantizationSummary.png)
So a full-precision model of 4GB @ 32-bit -> 2GB quantized model @ 16-bit half precision -> 1GB quantized model @ 8-bit precision.
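A minimal PyTorch sketch of how each precision step halves memory (assuming PyTorch is installed; real INT8 quantization also stores scale factors, which this simple cast ignores):

```python
import torch

# Bytes per element for the precisions discussed above:
# FP32 = 4, FP16/BF16 = 2, INT8 = 1.
x = torch.randn(1_000_000)  # defaults to FP32
for dtype in (torch.float32, torch.bfloat16, torch.int8):
    t = x.to(dtype)
    mb = t.element_size() * t.nelement() / 1e6
    print(f"{dtype}: {mb:.1f} MB")
# ~4.0 MB -> ~2.0 MB -> ~1.0 MB: each step halves memory,
# which is why a 4GB FP32 model fits in 2GB at 16-bit and 1GB at 8-bit.
```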
![GPU RAM needed for larger models](images/GPURAMbneeded.png)
- [video Computational challenges of training LLMs](images/ComputationalChallengesOfTrainingLLMs.mp4)
+ [Video Computational challenges of training LLMs](images/ComputationalChallengesOfTrainingLLMs.mp4)
- [Efficient multi-GPU compute strategies](images/Efficientmulti-GPUcomputestrategies.mp4)
+ [Video Efficient multi-GPU compute strategies](images/Efficientmulti-GPUcomputestrategies.mp4)