Machine Learning

This lesson will teach you how to collect and clean data to prepare it for machine learning.

What is Data Treatment?

raw data

clean data

dataset treatment

dataset cleaning

data pre-processing

Step 1: Collecting Data

Kaggle - The world's largest data science community. It hosts many useful resources, including a large and detailed dataset repository.
Google Dataset Search - This has no datasets of its own, but it is a great resource for finding datasets hosted on other websites.
UCI Machine Learning Repository - Another website, which maintains over 490 datasets.

Step 2: Data Profiling

data profiling
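
Data profiling means examining a dataset and summarizing what it contains before any cleaning is done. The following is a minimal sketch in Python, assuming the data is already loaded as a list of dictionaries. The sample records and the profile helper are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"city": "Paris", "price": "$120"},
    {"city": "Rome", "price": None},
    {"city": "Paris", "price": "95 USD"},
]


def profile(records):
    """Summarize each column: missing count, distinct count, most common value."""
    columns = sorted({key for record in records for key in record})
    report = {}
    for column in columns:
        values = [record.get(column) for record in records]
        present = [value for value in values if value is not None]
        report[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1),
        }
    return report


report = profile(rows)
for column, stats in report.items():
    print(column, stats)
```

A report like this tends to surface the problems the later steps deal with, such as missing values and the same quantity written in more than one format.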

Step 3: Formatting Data

$

USD
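
One way to read the $ and USD notations above is as two spellings of the same currency that should be stored in a single format. The following is a minimal sketch, assuming prices arrive as strings. The normalize_price helper is hypothetical and not part of any library.

```python
def normalize_price(text):
    """Convert strings such as '$120' or '120 usd' to an (amount, 'USD') pair."""
    cleaned = (
        text.strip()
        .upper()
        .replace("USD", "")   # drop the currency code
        .replace("$", "")     # drop the currency symbol
        .replace(",", "")     # drop thousands separators
        .strip()
    )
    return float(cleaned), "USD"


raw_prices = ["$120", "120 usd", "USD 95.50", "$1,200.50"]
formatted = [normalize_price(price) for price in raw_prices]
print(formatted)
```

Storing the amount as a number and the currency as a separate, consistent label keeps the column usable for arithmetic later on.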

Step 4: Feature Engineering

feature engineering

feature extraction

stay duration
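
As an illustration of extracting a feature such as stay duration, the sketch below derives it from two date fields. The check_in and check_out field names and the sample bookings are assumptions, since the lesson text here does not name the dataset.

```python
from datetime import date


def stay_duration(check_in, check_out):
    """Return the number of nights between two ISO-formatted dates."""
    return (date.fromisoformat(check_out) - date.fromisoformat(check_in)).days


# Hypothetical bookings with ISO date strings (YYYY-MM-DD).
bookings = [
    {"check_in": "2023-03-01", "check_out": "2023-03-05"},
    {"check_in": "2023-12-30", "check_out": "2024-01-02"},
]

# Add the derived feature to every record.
for booking in bookings:
    booking["stay_duration"] = stay_duration(booking["check_in"], booking["check_out"])

print([booking["stay_duration"] for booking in bookings])
```

The new column carries information that neither date holds on its own, which is the point of engineering a feature.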

Step 5: Splitting Data

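Training Set: The portion of the data used to fit the model.

Testing Set: The portion held back to evaluate the model on data it has not seen during training.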
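
A split into these two sets can be sketched as follows, using only the Python standard library. The split_records helper, the 80/20 ratio, and the seed value are assumptions chosen for illustration, not something the lesson prescribes.

```python
import random


def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (training, testing) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(10))  # stand-in for ten cleaned data rows
train_set, test_set = split_records(records)
print(len(train_set), len(test_set))
```

Shuffling before splitting avoids a testing set made up only of the last rows in the file, which may differ systematically from the rest.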

Activity: Explore Data