Generative AI in Computer Vision

GANs, diffusion models, and transformers
Vladimir Bok
  • MEAP began September 2024
  • Publication in Spring 2025 (estimated)
  • ISBN 9781633437449
  • 350 pages (estimated)
  • printed in black & white

Master the essential models, algorithms, tools, and techniques for interpreting and generating images using AI.

From digital special effects to medical image augmentation and analysis, generative AI is revolutionizing the way we create and interpret visual information. Innovations including diffusion models, text-to-image generators, GANs, and more help you create photorealistic graphics, empower creative expression with text-to-image tools, and accurately recognize and describe visual elements for applications like image search. Generative AI in Computer Vision will teach you the foundations of modern computer vision and equip you with the practical techniques you need to bring your ideas to life.

In Generative AI in Computer Vision you’ll learn about:

  • Variational autoencoders (VAEs) and generative adversarial networks (GANs)
  • Diffusion models for high-quality image generation
  • Evaluating models with metrics such as inception score and Fréchet inception distance
  • Conditional and guided generation techniques
  • Bridging language and vision using transformers and models like CLIP
  • Implementing text-to-image models

Generative AI in Computer Vision guides you from core concepts of digital image creation to the cutting edge of AI-powered visual computing. You’ll unpack tools like DALL-E and Stable Diffusion and learn to build your own by following the detailed code samples and practical tutorials.

about the book

Generative AI in Computer Vision explores the inner workings of the generative AI models behind modern computer vision. You’ll start by developing a simple autoencoder and extending it into a variational autoencoder for image generation. Next, you’ll dive deep into GANs and discover how to upgrade their performance with next-generation techniques like Wasserstein GAN. You’ll create your own denoising diffusion probabilistic models that can generate original imagery, and implement a simplified text-to-image model. You’ll also explore hybrid models that benefit from the strengths of multiple approaches, as well as video-based generative AI. Throughout, real-world case studies demonstrate how these models can be put into action.

about the reader

For AI enthusiasts, developers, and data scientists familiar with machine learning basics and Python programming.

about the author

Vladimir Bok is a founding Applied AI Researcher at Mirage Security, a VC-backed startup developing AI simulations for enterprise security training and awareness. Prior to Mirage, Vladimir led AI/ML initiatives at tech industry giants including Meta and Microsoft, as well as various startups in ad tech, fintech, and biotech. He is a coauthor of GANs in Action. Vladimir holds a Computer Science degree from Harvard University.
