diff --git a/AI/Generative AI & LLM's/Introduction LLM & lifecycle.md b/AI/Generative AI & LLM's/Introduction LLM & lifecycle.md
new file mode 100644
index 0000000..c92ec75
--- /dev/null
+++ b/AI/Generative AI & LLM's/Introduction LLM & lifecycle.md
@@ -0,0 +1,100 @@
+# Generative AI & LLMs
+
+## Introduction
+
+Topics: large language models and their use cases, how the models work, prompt engineering, how to generate creative text outputs, and a project lifecycle for generative AI projects.
+
+Generative AI is a subset of traditional machine learning. The machine learning models that underpin generative AI learn their abilities by finding statistical patterns in massive datasets of content originally generated by humans.
+
+These large models are known as **foundation models**, sometimes called base models. Examples are GPT, BERT, LLaMA, BLOOM, FLAN-T5 and PaLM.
+
+The more **parameters** a model has, the more memory it requires and, as it turns out, the more sophisticated the tasks it can perform.
+
+![Prompt and completion](images/2024-03-02-17-51-00-image.png)
+
+The text that you pass to an LLM is known as a **prompt**. The space or memory that is available to the prompt is called the **context window**; it is typically large enough for a few thousand words, but differs from model to model. The output of the model is called a **completion**, and the act of using the model to generate text is known as **inference**.
+
+## Capabilities of LLMs
+
+- next word prediction
+
+- translation tasks
+
+- program code generation
+
+- information retrieval: ask the model to identify all of the people and places mentioned in a news article => **named entity recognition**, a word classification task
+
+## Transformer architecture
+
+This novel approach unlocked the progress in generative AI that we see today. It can be **scaled efficiently** to use multi-core GPUs, it can **process input data in parallel**, making use of much larger training datasets, and crucially, it is able to learn **to pay attention to the meaning of the words it's processing**.
+
+[Paper: Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
+
+The power of the transformer architecture lies in its ability to learn the relevance and context of all of the words in a sentence, and to apply attention weights to those relationships so that the model learns the relevance of each word to every other word, no matter where they appear in the input.
+
+An **attention map** can be used to illustrate the attention weights between each word and every other word.
+
+![](images/2024-03-02-19-11-59-Screenshot%20from%202024-03-02%2019-11-15.png)
+
+Some words are strongly connected to other words (orange lines). This is called **self-attention**, and the ability to learn attention in this way across the whole input significantly improves the model's ability to encode language.
+
+![](images/2024-03-02-19-16-03-Screenshot%20from%202024-03-02%2019-15-56.png)
+
+The transformer architecture is split into two distinct parts, the **encoder** and the **decoder**. These components work in conjunction with each other and share a number of similarities.
+
+![](images/2024-03-02-19-37-12-Screenshot%20from%202024-03-02%2019-37-04.png)
+
+The first step is to tokenize the words.
+
+![](images/2024-03-02-19-20-33-Screenshot%20from%202024-03-02%2019-20-17.png)
+
+There are multiple tokenization methods, for example (see the sketch after this list):
+
+- token IDs matching complete words,
+
+- token IDs representing parts of words.
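+
+As an illustration of the second method (sub-word tokenization), here is a minimal sketch using the Hugging Face `transformers` library, which is not part of these notes and is used purely for illustration; the `bert-base-uncased` checkpoint is an arbitrary example, not a model discussed above:
+
+```python
+from transformers import AutoTokenizer
+
+# Load the tokenizer that belongs to a pretrained checkpoint (example choice).
+tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
+
+text = "Tokenization splits uncommon words into sub-words"
+token_ids = tokenizer.encode(text)
+
+# Common words map to a single token ID; rarer words are split into
+# word pieces such as "token" + "##ization".
+print(tokenizer.convert_ids_to_tokens(token_ids))
+print(token_ids)
+```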
+
+_Important: once you've selected a tokenizer to train the model, you must use the same tokenizer when you generate text._
+
+### Embedding layer
+
+This layer is a **trainable vector embedding space**, a high-dimensional space where each token is represented as a vector and occupies a unique location within that space. Each token ID in the vocabulary is matched to a multi-dimensional vector, and the intuition is that these vectors learn to encode the meaning and context of individual tokens in the input sequence. Word2vec uses this concept.
+
+![](images/2024-03-02-19-29-55-Screenshot%20from%202024-03-02%2019-29-45.png)
+
+Each word has been matched to a token ID, and each token is mapped into a vector.
+
+As the token vectors enter the base of the encoder or the decoder, **positional encoding** is added to them. The model processes all of the input tokens in parallel, so by adding the positional encoding you preserve the information about the word order and don't lose the **relevance of the position of the word in the sentence**. Once you've summed the input tokens and the positional encodings, you pass the resulting vectors to the **self-attention layer**.
+
+![](images/2024-03-02-19-34-05-Screenshot%20from%202024-03-02%2019-33-50.png)
+
+The transformer architecture actually has **multi-headed self-attention**. This means that multiple sets of self-attention weights, or heads, are learned in parallel, independently of each other. The number of attention heads included in the attention layer varies from model to model, but numbers in the range of 12-100 are common. The intuition here is that each self-attention head will learn a different aspect of language.
+
+It's important to note that you don't dictate ahead of time what aspects of language the attention heads will learn. The weights of each head are randomly initialized and, given sufficient training data and time, each will learn a different aspect of language.
+
+Once all of the attention weights have been applied to the input data, the output is processed through a fully-connected feed-forward network. The output of this layer is a vector of logits proportional to the probability score for each and every token in the tokenizer dictionary. You can then pass these logits to a final softmax layer, where they are normalized into a probability score for each word.
+
+One single token will have a score higher than the rest, but there are a number of methods that you can use to vary the final selection from this vector of probabilities (see the sketch below).
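+
+A minimal sketch in plain PyTorch (toy logits over a made-up five-token vocabulary, not tied to any particular model) of how logits become probabilities and how greedy selection differs from random sampling:
+
+```python
+import torch
+
+# Toy logits over a tiny five-token vocabulary (values invented for illustration).
+logits = torch.tensor([2.1, 0.3, -1.0, 4.0, 0.8])
+
+# Softmax normalizes the logits into a probability score per token.
+probs = torch.softmax(logits, dim=-1)
+
+# Greedy decoding: always pick the single highest-probability token.
+greedy_token = torch.argmax(probs).item()
+
+# Random sampling: draw the next token according to its probability,
+# one way to vary the final selection from this vector of probabilities.
+sampled_token = torch.multinomial(probs, num_samples=1).item()
+
+print(probs, greedy_token, sampled_token)
+```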
+
+[Video: Transformers architecture](images/Transformers%20architecture.mp4)
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-17-51-00-image.png b/AI/Generative AI & LLM's/images/2024-03-02-17-51-00-image.png
new file mode 100644
index 0000000..772fa1f
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-17-51-00-image.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-10-15-image.png b/AI/Generative AI & LLM's/images/2024-03-02-19-10-15-image.png
new file mode 100644
index 0000000..4368d3d
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-10-15-image.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-11-59-Screenshot from 2024-03-02 19-11-15.png b/AI/Generative AI & LLM's/images/2024-03-02-19-11-59-Screenshot from 2024-03-02 19-11-15.png
new file mode 100644
index 0000000..4015101
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-11-59-Screenshot from 2024-03-02 19-11-15.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-16-03-Screenshot from 2024-03-02 19-15-56.png b/AI/Generative AI & LLM's/images/2024-03-02-19-16-03-Screenshot from 2024-03-02 19-15-56.png
new file mode 100644
index 0000000..6b7ab6b
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-16-03-Screenshot from 2024-03-02 19-15-56.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-17-28-Screenshot from 2024-03-02 19-17-01.png b/AI/Generative AI & LLM's/images/2024-03-02-19-17-28-Screenshot from 2024-03-02 19-17-01.png
new file mode 100644
index 0000000..a5e302d
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-17-28-Screenshot from 2024-03-02 19-17-01.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-20-33-Screenshot from 2024-03-02 19-20-17.png b/AI/Generative AI & LLM's/images/2024-03-02-19-20-33-Screenshot from 2024-03-02 19-20-17.png
new file mode 100644
index 0000000..7b31c10
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-20-33-Screenshot from 2024-03-02 19-20-17.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-29-55-Screenshot from 2024-03-02 19-29-45.png b/AI/Generative AI & LLM's/images/2024-03-02-19-29-55-Screenshot from 2024-03-02 19-29-45.png
new file mode 100644
index 0000000..9c3c1f6
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-29-55-Screenshot from 2024-03-02 19-29-45.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-34-05-Screenshot from 2024-03-02 19-33-50.png b/AI/Generative AI & LLM's/images/2024-03-02-19-34-05-Screenshot from 2024-03-02 19-33-50.png
new file mode 100644
index 0000000..44ea574
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-34-05-Screenshot from 2024-03-02 19-33-50.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-36-00-Screenshot from 2024-03-02 19-35-25.png b/AI/Generative AI & LLM's/images/2024-03-02-19-36-00-Screenshot from 2024-03-02 19-35-25.png
new file mode 100644
index 0000000..94248c7
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-36-00-Screenshot from 2024-03-02 19-35-25.png differ
diff --git a/AI/Generative AI & LLM's/images/2024-03-02-19-37-12-Screenshot from 2024-03-02 19-37-04.png b/AI/Generative AI & LLM's/images/2024-03-02-19-37-12-Screenshot from 2024-03-02 19-37-04.png
new file mode 100644
index 0000000..8a4b1bd
Binary files /dev/null and b/AI/Generative AI & LLM's/images/2024-03-02-19-37-12-Screenshot from 2024-03-02 19-37-04.png differ
diff --git a/AI/Generative AI & LLM's/images/Transformers architecture.mp4 b/AI/Generative AI & LLM's/images/Transformers architecture.mp4
new file mode 100644
index 0000000..bad1ff2
Binary files /dev/null and b/AI/Generative AI & LLM's/images/Transformers architecture.mp4 differ
diff --git a/AI/Pytorch/CodeLines.md b/AI/Pytorch/CodeLines.md
index 7b448cf..4e52b4c 100644
--- a/AI/Pytorch/CodeLines.md
+++ b/AI/Pytorch/CodeLines.md
@@ -8,6 +8,18 @@ This container runs in background
 docker run -d -it --gpus all --name pytorch-container pytorch/pytorch:latest
 ```
+## Jupyter container with pytorch for GPU
+```bash
+docker run -e JUPYTER_ENABLE_LAB=yes -v /home/john/Work/pytorch/:/workspace/dev --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --rm -p 8888:8888 jw/pytorch:0.1
+```
+
+## Test GPU
+```python
+import torch
+print(torch.cuda.is_available())
+```
+
+
 ## Connect to running container
 ```bash
diff --git a/Overig/Docker/commands.md b/Overig/Docker/commands.md
index 6d472ce..f0ff246 100644
--- a/Overig/Docker/commands.md
+++ b/Overig/Docker/commands.md
@@ -7,6 +7,13 @@ docker run -d -it --gpus all --name pytorch-container pytorch/pytorch:latest
 docker run -d -it --rm -v $(pwd):/src --gpus all --name pytorch-container pytorch/pytorch:latest
 ```
+## Test GPU
+
+```python
+>>> import torch
+>>> print(torch.cuda.is_available())
+```
+
 The latter removes the docker instance when stopped and also has a volume
 ## Connect to running container
diff --git a/Overig/Linux/Keychron Keyboard.md b/Overig/Linux/Keychron Keyboard.md
index 59bc5af..c3acb61 100644
--- a/Overig/Linux/Keychron Keyboard.md
+++ b/Overig/Linux/Keychron Keyboard.md
@@ -4,4 +4,6 @@ updated: 2022-04-27 17:37:12Z
 created: 2022-04-27 17:36:57Z
 ---
-https://gist.github.com/andrebrait/961cefe730f4a2c41f57911e6195e444
\ No newline at end of file
+https://gist.github.com/andrebrait/961cefe730f4a2c41f57911e6195e444
+
+[/dev/schnouki – How to use a Keychron K2/K4 USB keyboard on Linux](https://schnouki.net/post/2019/how-to-use-a-keychron-k2-usb-keyboard-on-linux/)
\ No newline at end of file