TAMU Datathon - Learner Track
Lecture 1: Data Basics
- The basic vocabulary of data:
- Tabular Data vs Unstructured Data
- Features and Target Columns
- Samples (data rows)
- The basics of reading/loading data using Pandas
- How to deal with missing values
- Plots: probability distributions and correlation heatmaps
- How to visualize data and understand it better:
- Exploratory analysis
- Communicate effectively
- How to make recommendations to stakeholders
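
As a rough illustration of the Lecture 1 topics, the sketch below loads a small table with Pandas, separates features from the target, counts missing values, and computes the correlation matrix that a heatmap would display. The CSV contents, the column names (sqft, bedrooms, price), and the median-fill strategy are assumptions made up for this example and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical CSV contents, invented purely for illustration.
csv_text = """sqft,bedrooms,price
1000,2,200000
1500,3,
2000,,400000
2500,4,500000
"""

# Each row of the resulting DataFrame is one sample.
df = pd.read_csv(io.StringIO(csv_text))

# Assumed split: two feature columns and one target column.
feature_cols = ["sqft", "bedrooms"]
target_col = "price"

# Count missing values in every column.
missing_per_column = df.isna().sum()

# One simple strategy (an assumption, not the only option):
# drop samples with no target, then fill missing features with the column median.
df_clean = df.dropna(subset=[target_col]).copy()
df_clean[feature_cols] = df_clean[feature_cols].fillna(df_clean[feature_cols].median())

# Pairwise correlations; this is the matrix a correlation heatmap visualizes.
corr = df_clean.corr()

print(missing_per_column)
print(df_clean)
print(corr)
```

Whether to drop or fill missing values depends on the dataset; the median fill here is only one common default.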
Lecture 2: Linear Regression
- What is a model? What makes a model linear?
- Fit your own model by hand!
- What does it mean to fit a model? (What is machine learning?)
- A visual introduction to optimization - "What is a Loss Function?"
- How to train a linear model using the scikit-learn library
- How to interpret a linear regression model
- Understand the interpretability vs. accuracy tradeoff
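
To make the Lecture 2 topics concrete, here is a minimal sketch that fits a one-feature linear model twice: once "by hand" with gradient descent on a mean squared error loss, and once with scikit-learn's LinearRegression. The synthetic data (true slope 3, intercept 2), the learning rate, and the step count are assumptions chosen for illustration, and mean squared error is assumed as the loss function.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 3x + 2 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.5, size=100)


def mse_loss(w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    pred = w * X[:, 0] + b
    return np.mean((pred - y) ** 2)


# Fit by hand: repeatedly step w and b against the gradient of the loss.
w, b = 0.0, 0.0
learning_rate = 0.01  # assumed value, small enough for this data to converge
for _ in range(5000):
    err = w * X[:, 0] + b - y
    w -= learning_rate * 2 * np.mean(err * X[:, 0])
    b -= learning_rate * 2 * np.mean(err)

# Fit the same model with scikit-learn.
model = LinearRegression().fit(X, y)

# Interpretation: the coefficient is the predicted change in y per unit of x.
print("by hand:     ", w, b, mse_loss(w, b))
print("scikit-learn:", model.coef_[0], model.intercept_)
```

Both approaches minimize the same loss, so the slope and intercept should agree closely.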
Lecture 3: Logistic Regression
- What is classification?
- What is logistic regression?
- A visual introduction to non-convex optimization
- Fit a logistic regression model yourself!
- Fit a logistic regression model using the scikit-learn library
- Fake News!! What is accuracy, true/false positives/negatives,
- Generalize logistic to multiple classes
- Visualize 2D decision boundaries
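
The Lecture 3 topics can be sketched roughly as follows: fit a logistic regression classifier with scikit-learn, count true/false positives and negatives, compute accuracy, and evaluate the model on a grid of points, which is the input a 2D decision-boundary plot needs. The synthetic two-feature dataset, the 25% test split, and the grid resolution are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic binary classification data with two features (assumed for illustration).
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the classifier and predict on held-out samples.
clf = LogisticRegression().fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Confusion matrix entries: true negatives, false positives,
# false negatives, true positives.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

# Accuracy is the fraction of correct predictions: (tp + tn) / total.
accuracy = accuracy_score(y_test, y_pred)

# Predict on a grid covering the feature space; plotting these predictions
# as a filled contour would show the 2D decision boundary.
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min(), X[:, 0].max(), 50),
    np.linspace(X[:, 1].min(), X[:, 1].max(), 50),
)
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("accuracy:", accuracy)
```

The same LogisticRegression class also accepts targets with more than two classes, which is one way to explore the multi-class generalization.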
15 hours of hacking time to work on a data science project to solve a