Understand the architecture that underpins today’s most powerful AI models.
Transformers are the superpower behind large language models (LLMs) like ChatGPT, Gemini, and Claude.
Transformers in Action gives you the insights, practical techniques, and extensive code samples you need to adapt pretrained transformer models to new and exciting tasks.
Inside Transformers in Action you'll learn:
- How transformers and LLMs work
- Modeling families and architecture variants
- Efficient and specialized large language models
- Adapting HuggingFace models to new tasks
- Automating hyperparameter search with Ray Tune and Optuna
- Optimizing LLM performance
- Advanced prompting and zero/few-shot learning
- Text generation with reinforcement learning
- Responsible LLMs
Transformers in Action takes you from the origins of transformers all the way to fine-tuning an LLM for your own projects. Author Nicole Koenigstein demonstrates the vital mathematical and theoretical background of the transformer architecture hands-on, through executable Jupyter notebooks. You'll discover practical advice on prompt engineering, along with tried-and-tested methods for optimizing and tuning large language models. Plus, you'll find unique coverage of AI ethics, specialized smaller models, and the encoder-decoder architecture.