Transformers, the tech behind LLMs | Deep Learning Chapter 5

Breaking down how Large Language Models work, visualizing how data flows through.

Instead of sponsored ad reads, these lessons are funded directly by viewers: https://3b1b.co/support

---

Here are a few other relevant resources:

Build a GPT from scratch, by Andrej Karpathy
https://youtu.be/kCc8FmEb1nY

If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic:
https://youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX

If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources.
https://transformer-circuits.pub/2021/framework/index.html

History of language models by Brit Cruise, @ArtOfTheProblem
https://youtu.be/OFS90-FX6pg

An early paper on how directions in embedding spaces have meaning:
https://arxiv.org/pdf/1301.3781.pdf

Russian audio track: Влад Бурмистров.

---

Timestamps
0:00 - Predict, sample, repeat
3:03 - Inside a transformer
6:36 - Chapter layout
7:20 - The premise of Deep Learning
12:27 - Word embeddings
18:25 - Embeddings beyond words
20:22 - Unembedding
22:22 - Softmax with temperature
26:03 - Up next

7.8M views

194.8K likes


Transcript


The initials GPT stand for Generative Pretrained Transformer. The first word is simple enough: these are bots that generate new text. "Pretrained" refers to the fact that the model has gone through a process of learning from an enormous amount of data, and the prefix indicates that there is more room to fine-tune it on specific tasks with additional training. But the last word is the real key piece. A transformer is a specific kind of neural network, a machine learning model, and it is the core invention behind the current AI boom.

The goal of this video and the following chapters is to explain visually what happens inside a transformer. We will follow the data that flows through it and go step by step.

There are many different kinds of models you can build using transformers. Some models take in audio and produce a transcript. This sentence comes from a model that goes the other way, producing synthetic speech from text alone. All those tools that took the world by storm in 2022, like DALL-E and Midjourney, which take in a text description and produce an image, are based on transformers. Even if I can't quite get it to understand what a pi creature is supposed to be, I'm still amazed that this kind of thing is even remotely possible. And the original transformer, introduced in 2017 by Google, was invented for the specific use case of translating text from one language into another.

But the variant that you and I will focus on, which is the one underlying tools like ChatGPT, will be a model trained to take in a piece of text, perhaps with some surrounding images or sounds accompanying it, and to produce a prediction of what comes next in the text. That prediction takes the form of a probability distribution over the different chunks of text that might follow....
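To make that last idea concrete, here is a minimal sketch of what "a probability distribution over chunks of text" can look like in code. It is not taken from the video: the candidate chunks, their scores, and the helper names `softmax` and `sample_next` are all hypothetical, chosen only for illustration. The sketch assumes the model has already produced one raw score per candidate chunk, and uses softmax (named in the timestamps) to turn those scores into probabilities before sampling one chunk.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(candidates, probs, rng):
    """Draw one candidate chunk at random, weighted by its probability."""
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical candidate chunks and made-up scores (not real model output).
candidates = [" mat", " sofa", " roof", " moon"]
scores = [3.0, 2.0, 1.0, -1.0]

probs = softmax(scores)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
next_chunk = sample_next(candidates, probs, rng)

for chunk, p in zip(candidates, probs):
    print(f"{chunk!r}: {p:.3f}")
print("sampled:", repr(next_chunk))
```

Appending the sampled chunk to the text and running the prediction again is the "predict, sample, repeat" loop named in the first timestamp.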


3Blue1Brown
