Learn how to identify the unusual, interesting, extreme, or inaccurate parts of your data.
Data scientists have two main tasks: finding patterns in data and finding the exceptions. These outliers are often the most informative parts of data, revealing hidden insights, novel patterns, and potential problems.
Outlier Detection in Python is a practical guide to spotting the parts of a dataset that deviate from the norm, even when they're hidden or intertwined among the expected data points.
In Outlier Detection in Python, you'll learn how to:
- Use standard Python libraries to identify outliers
- Select the most appropriate detection methods
- Combine multiple outlier detection methods for improved results
- Interpret your results effectively
- Work with numeric, categorical, time series, and text data
Outlier detection is a vital tool for modern business, whether it's discovering new products, expanding markets, or flagging fraud and other suspicious activities. This guide presents the core tools for outlier detection, as well as techniques utilizing the Python data stack familiar to data scientists. To get started, you'll only need a basic understanding of statistics and the Python data ecosystem.
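To give a flavor of what identifying outliers can look like in practice, here is a minimal sketch that assumes the common interquartile range (IQR) rule. It is not taken from the book; the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample data are illustrative choices only, and it uses Python's built-in `statistics` module rather than the wider data stack.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is a conventional
    default, not a recommendation from the book.
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up sample: six similar readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)
print(outliers)
```

A simple rule like this only examines one numeric column at a time. Outliers that are hidden among otherwise ordinary values, or that appear in categorical, time series, or text data, generally call for other methods.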