Manning Early Access Program (MEAP): read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it’s in bookstores.

8 of 12 chapters available

Generative AI in Computer Vision

GANs, diffusion models, and transformers

Vladimir Bok

MEAP began September 2024
Publication in Summer 2025 (estimated)

ISBN 9781633437449
350 pages (estimated)

Included with a Manning Online subscription

printed in black & white

Python
Data

eBook

$26.39 (regular price $47.99)

you save $21.60 (45%)

Master the essential models, algorithms, tools, and techniques for interpreting and generating images using AI.

From digital special effects to medical image augmentation and analysis, generative AI is revolutionizing the way we create and interpret visual information. Innovations such as diffusion models, text-to-image generators, and GANs help you create photorealistic graphics, turn text prompts into images, and accurately recognize and describe visual elements for applications like image search. Generative AI in Computer Vision will teach you the foundations of modern computer vision and equip you with the practical techniques you need to bring your ideas to life.

In Generative AI in Computer Vision you’ll learn about:

Variational autoencoders (VAEs) and generative adversarial networks (GANs)
Diffusion models for high-quality image generation
Evaluating models with metrics such as Inception Score (IS) and Fréchet Inception Distance (FID)
Conditional and guided generation techniques
Bridging language and vision using transformers and models like CLIP
Implementing text-to-image models

Generative AI in Computer Vision guides you from core concepts of digital image creation to the cutting edge of AI-powered visual computing. You’ll unpack tools like DALL-E and Stable Diffusion and learn to build your own by following the detailed code samples and practical tutorials.

about the book

Generative AI in Computer Vision explores the inner workings of the generative AI models behind modern computer vision. You’ll start by developing a simple autoencoder and extending it into a variational autoencoder for image generation. Next, you’ll dive deep into GANs and discover how to improve their performance with techniques like Wasserstein GAN. You’ll create your own denoising diffusion probabilistic models that can generate original imagery, and even implement a simplified text-to-image model. You’ll also explore hybrid models that combine the strengths of multiple approaches, as well as video-based generative AI. Throughout, real-world case studies demonstrate how these models can be put into action.

about the reader

For AI enthusiasts, developers, and data scientists familiar with machine learning basics and Python programming.

about the author

Vladimir Bok is a founding Applied AI Researcher at Mirage Security, a VC-backed startup developing AI simulations for enterprise security training and awareness. Prior to Mirage, Vladimir led AI/ML initiatives at tech industry giants including Meta and Microsoft, as well as various startups in ad tech, fintech, and biotech. He is a coauthor of GANs in Action. Vladimir holds a Computer Science degree from Harvard University.
