TAMU Datathon: Learner Track
Lecture 1: Data Basics
 The basic vocabulary of data:
 Tabular Data vs Unstructured Data
 Features and Target Columns
 Samples (data rows)
 The basics of reading/loading data using Pandas
 How to deal with missing values
 Plots: probability distribution and correlation (heatmap) plots
 How to visualize data and understand it better:
 Exploratory analysis
 Communicate effectively
 How to make recommendations to stakeholders
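As a rough illustration of the Lecture 1 topics, the sketch below loads a small table with Pandas, separates feature columns from the target column, and tries two common ways of handling missing values. The inline CSV and its column names (sqft, bedrooms, price) are made up for this example and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file (illustrative values only).
csv_text = """sqft,bedrooms,price
1400,3,240000
1600,,275000
1100,2,
2000,4,390000
"""

# Load tabular data: each row is a sample, each column a feature or the target.
df = pd.read_csv(io.StringIO(csv_text))

# Assumption for this example: "price" is the target, the rest are features.
features = df[["sqft", "bedrooms"]]
target = df["price"]

# Count missing values per column.
missing_counts = df.isna().sum()

# Two common ways to deal with missing values:
dropped = df.dropna()                             # discard incomplete samples
filled = df.fillna(df.median(numeric_only=True))  # impute with column medians

# Pairwise correlations: the numbers a correlation heatmap would display.
corr = filled.corr()
```

Dropping rows is simple but discards samples; imputing keeps every sample at the cost of inventing values, so which one is appropriate depends on the dataset.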
Lecture 2: Linear Regression
 What is a model? What makes a model linear?
 Fit your own model by hand!
 What does it mean to fit a model? (What is machine learning?)
 A visual introduction to optimization: "What is a Loss Function?"
 How to train a linear model using the Scikit Learn library
 How to interpret a linear regression model
 Understand the interpretability vs. accuracy tradeoff
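To make "fitting a model" and "loss function" concrete, here is a minimal sketch that fits a line y = w*x + b by gradient descent on the mean squared error, using plain Python. The toy data, the learning rate, and the step count are assumptions chosen for illustration; this is not necessarily how the lecture presents it, and a library such as Scikit Learn would normally do this work for you.

```python
# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse_loss(w, b):
    """Mean squared error of the line y = w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit(learning_rate=0.05, steps=2000):
    """Fit w and b by plain gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step downhill on the loss surface.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit()
print(f"w = {w:.3f}, b = {b:.3f}, loss = {mse_loss(w, b):.6f}")
```

The fitted w can be read as the change in the prediction per unit change in x, and b as the prediction at x = 0. That direct reading is what makes a linear model easy to interpret.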
Lecture 3: Logistic Regression
 What is classification?
 What is logistic regression?
 A visual introduction to nonconvex optimization
 Fit a logistic regression model yourself!
 Fit a logistic regression model using the Scikit Learn library
 Fake News!! What are accuracy, true/false positives/negatives, precision, recall?
 Generalize logistic to multiple classes
 Visualize 2D decision boundaries
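The classification metrics named above can be sketched in a few lines of plain Python. The labels below are invented for illustration (1 = fake news, 0 = real news), and the helper names confusion_counts and metrics are hypothetical, not part of any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of everything flagged as positive, how much really was positive?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of everything that really was positive, how much was flagged?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels: 1 = fake news, 0 = real news.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

result = metrics(y_true, y_pred)
```

Accuracy alone can mislead when one class is rare, which is why precision and recall are usually reported alongside it.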
15 hours of hacking time to work on a data science project that solves a
real-life challenge.