Take a deep dive into Transformers and Large Language Models—the foundations of generative AI!
Generative AI has set up shop in almost every aspect of business and society. Transformers and Large Language Models (LLMs) now power everything from code creation tools like Copilot and Cursor to AI agents, live language translators, smart chatbots, text generators, and much more.
In Transformers and LLMs in Action you’ll discover:
- How transformers and LLMs work under the hood
- Adapting AI models to new tasks
- Optimizing LLM performance
- Text generation with reinforcement learning
- Multi-modal AI models
- Encoder-only, decoder-only, encoder-decoder, and small language models
This practical book gives you the background, mental models, and hands-on skills you need to put Gen AI to work.
What is a transformer?
A “transformer” is a neural network model that finds relationships in sequences of words or other data using a mathematical technique called attention. Because the attention mechanism allows transformers to focus on the most relevant parts of a sequence, transformers can learn context and meaning from even large bodies of text. LLMs like GPT, Gemini, and Claude are transformer-based models that have been trained on massive data sets, which gives them the uncanny ability to generate natural, coherent responses across a wide range of knowledge domains.
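To make the idea concrete, here is a minimal sketch of the scaled dot-product attention at the heart of a transformer, written in plain NumPy. The function name, shapes, and toy inputs are illustrative, not taken from any particular library: each query vector is compared against every key, the similarity scores are turned into weights with a softmax, and the output is a weighted sum of the value vectors.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to all keys; output is a weighted sum of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax -> attention weights
    return weights @ V                                 # weighted sum of values

# Toy example: 3 tokens with 4-dimensional embeddings (random, for illustration)
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)            # self-attention: Q = K = V
print(out.shape)                                       # one output vector per token
```

In self-attention, as used inside a transformer layer, the queries, keys, and values all come from the same input sequence, so every token can draw context from every other token.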