Using Undersampling

prerequisites
basic Python • basic pandas • basic scikit-learn • basics of machine learning
skills learned
run ClusterCentroids using the imblearn library • run CondensedNearestNeighbor using the imblearn library • run NearMiss using the imblearn library
Stylianos Kampakis and Shreesha Jagadeesh
1 week · 5-8 hours per week · INTERMEDIATE

In this liveProject, you’ll use undersampling techniques to balance a seismic activity dataset. You will apply the ClusterCentroids, NearMiss, and CondensedNearestNeighbor algorithms to downsample the majority class, then compare how random forest, logistic regression, and Naive Bayes binary classifiers perform on each resampled dataset.
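To make the undersampling idea concrete, here is a minimal pure-Python sketch of the NearMiss-1 selection rule: keep the majority-class samples whose average distance to their nearest minority-class samples is smallest, until both classes are the same size. The function name `near_miss_1`, the toy data, and the two-class assumption are illustrative only and are not taken from the project; in the liveProject itself you would use the samplers from the imblearn library rather than hand-written code.

```python
import math
from collections import Counter


def near_miss_1(X, y, n_neighbors=3):
    """Undersample the majority class with a NearMiss-1 style rule.

    Keeps the majority samples whose mean distance to their n_neighbors
    nearest minority samples is smallest, until both classes are equal in size.
    Hypothetical helper for illustration; assumes exactly two classes.
    """
    counts = Counter(y)
    # The most common label is treated as the majority class.
    (majority, _), (minority, n_minority) = counts.most_common(2)

    minority_points = [x for x, label in zip(X, y) if label == minority]
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label == majority]

    def mean_dist_to_nearest_minority(point):
        dists = sorted(math.dist(point, m) for m in minority_points)
        nearest = dists[:n_neighbors]
        return sum(nearest) / len(nearest)

    # Rank majority samples by closeness to the minority class; keep the closest.
    ranked = sorted(majority_idx, key=lambda i: mean_dist_to_nearest_minority(X[i]))
    kept = sorted(ranked[:n_minority] + minority_idx)
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data (made up for illustration): 6 majority points, 2 minority points.
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 9.0), (9.5, 9.0),
     (5.0, 5.2), (5.2, 5.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = near_miss_1(X, y, n_neighbors=2)
print(Counter(y_res))  # class counts after undersampling
```

With imblearn installed, the corresponding library call would be along the lines of `NearMiss(version=1).fit_resample(X, y)`. Note that imblearn spells the third sampler's class name `CondensedNearestNeighbour`.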

This project is designed for learning purposes and is not a complete, production-ready application or solution.

project authors

Stylianos Kampakis
Dr. Stylianos (Stelios) Kampakis is a data scientist with more than 10 years of experience. He has worked with decision-makers from companies of all sizes, from startups to organizations like the US Navy, Vodafone, and British Land. He has also helped many people pursue careers in data science and technology. He is a member of the Royal Statistical Society, an honorary research fellow at the UCL Centre for Blockchain Technologies, a data science advisor for London Business School, and CEO of The Tesseract Academy. A natural polymath with a PhD in machine learning and degrees in artificial intelligence, statistics, psychology, and economics, he loves using his broad skillset to solve difficult problems and help companies improve their efficiency.
Shreesha Jagadeesh
Shreesha Jagadeesh is a product manager at Amazon, creating data science-driven HR products for talent retention, career growth, and internal mobility. He previously worked as a manager at Ernst & Young, where he led a large global team of 25+ data scientists and engineers applying data science to the digital transformation of the firm’s tax business units. Aside from his day job, he is a startup advisor helping young companies build out their data science functions. He has a master’s in electrical and computer engineering from the University of Toronto. He has been teaching for more than a decade and has written data science articles on Medium, reviewed other Manning courses, and developed a popular Udemy course on Agile data science.

prerequisites

This liveProject is for Python programmers who are interested in exploring machine learning. To begin this liveProject, you will need to be familiar with the following:


TOOLS
  • Basic Python
  • Basic pandas
  • Basic scikit-learn
TECHNIQUES
  • Basics of machine learning (supervised learning)

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
Book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.
