Finative, the environmental, social, and governance (ESG) analytics company you work for, analyzes a high volume of data using advanced natural language processing (NLP) techniques to provide its clients with valuable insights about their sustainability. Your CEO has concerns that some of the companies Finative analyzes may be greenwashing: spreading disinformation about their sustainability in order to appear more environmentally conscious than they actually are.
As a data scientist for Finative, your task is to create and analyze sustainability reports in order to validate them. You’ll compute conditional probabilities by hand with Bayes’ theorem to better understand your model’s performance through metrics such as recall and precision. You’ll learn an efficient way to prepare data from different sources and merge it into one dataset, which you’ll use to prepare tweets for classification. To classify the tweets, you’ll fine-tune a pre-trained large language model using the Hugging Face ecosystem together with hyperopt and Ray Tune. You’ll use TensorBoard and Weights & Biases to track and analyze your experiments. Finally, you’ll analyze the tweets to determine whether enough negative sentiment exists to indicate that the company you analyzed has been greenwashing.
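To give a feel for the by-hand Bayes’ theorem step, here is a minimal Python sketch of how precision follows from recall, the false positive rate, and the share of truly positive examples. The function name and all numbers are illustrative assumptions, not values or code from the liveProject.

```python
def precision_from_bayes(recall, false_positive_rate, prevalence):
    """Return P(actually positive | predicted positive) using Bayes' theorem.

    recall              = P(predicted positive | actually positive)
    false_positive_rate = P(predicted positive | actually negative)
    prevalence          = P(actually positive)
    """
    # Law of total probability: overall chance of a positive prediction.
    p_predicted_positive = (
        recall * prevalence + false_positive_rate * (1 - prevalence)
    )
    # Bayes' theorem: posterior = likelihood * prior / evidence.
    return recall * prevalence / p_predicted_positive


# Hypothetical classifier: 80% recall, 10% false positive rate,
# and 20% of all tweets truly belong to the positive class.
precision = precision_from_bayes(
    recall=0.8, false_positive_rate=0.1, prevalence=0.2
)

# Cross-check with confusion-matrix counts for 1,000 such tweets:
# 200 positives -> 160 true positives; 800 negatives -> 80 false positives.
precision_from_counts = 160 / (160 + 80)

print(f"precision via Bayes:  {precision:.3f}")
print(f"precision via counts: {precision_from_counts:.3f}")
```

Both routes give the same value, which shows that precision is the conditional probability obtained by applying Bayes’ theorem to recall and the class prior.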
This liveProject is for ML engineers, intermediate-level Python programmers, and early-stage data scientists who are familiar with the basics of probability. To begin this liveProject, you’ll need to be familiar with the following:
TOOLS