Skip to content

quickskilling/IntroDataScience

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

33 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Introduction to Data Science 📊

A beginner-friendly, hands-on introduction to modern data science tools and techniques

For: First-time coders learning data science
Location: Johannesburg SA 2026 Trainings
Duration: ~3 hours


🎯 What You'll Learn

By the end of this course, you'll be able to:

  • Write Python code for data analysis
  • Load and manipulate data with Polars
  • Create interactive visualizations with Plotly
  • Use AI tools (GitHub Copilot) to help you code

🚀 Quick Start (5 Minutes)

Prerequisites

  • Required: None! This course is designed for complete beginners
  • Equipment: Computer with internet (Mac, Windows, or Linux)

Setup Steps

  1. Install VS Code (if you haven't already)

  2. Install UV and add to PATH (Python package manager)

    # macOS/Linux (Ubuntu or Debian)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    source $HOME/.local/bin/env
    
    # Windows (PowerShell)
    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
    
    # Should be added to PATH already, but in case
    [Environment]::SetEnvironmentVariable("Path", $env:Path + ";C:\Users\$env:USERNAME\.cargo\bin", "User")
    # Verify Installation
    uv --version
  3. Install Python with UV

    uv python install
    # Verify Installation
    python --version
  4. Install Git and verify installation

    # Windows (Powershell)
    winget install --id Git.Git -e --source winget
    
    # macOS
    brew install git
    
    # Linux (Ubuntu or Debian)
    apt-get install git
    # Verify Installation
    git --version
  5. Clone this repository

    git clone https://github.com/quickskilling/IntroDataScience.git
    cd IntroDataScience
  6. Sync dependencies

    uv sync
  7. Launch your first notebook

    uv run marimo edit example_notebooks/01_python_basics.py

📖 Need more help? Check out the detailed setup guide


📚 Learning Path

Overview and Setup (40 minutes)

✓ Overview of Data Science and Tools ✓ Install VS Code and UV
✓ Clone repository and sync dependencies
✓ Launch first notebook

Phase 1: Learn the Basics (1 hour)

📓 Notebook 1: Python Basics (20 min)

  • Variables, data types, and operations
  • Lists and dictionaries
  • Control flow (if/else, loops)
  • Functions

📓 Notebook 2: Data Wrangling (20 min)

  • Loading data (CSV, JSON, Parquet)
  • Exploring DataFrames with Polars
  • Filtering, selecting, and transforming data
  • Joining datasets

📓 Notebook 3: Plotting (20 min)

  • Line charts and bar charts
  • Scatter plots and histograms
  • Customizing visualizations
  • Subplots

Phase 2: Practice (1 hour)

✏️ Exercise 1: Fundamentals (20 min)

  • Practice Python basics
  • 8-10 hands-on exercises

✏️ Exercise 2: Basic Data Wrangling with Polars (20 min)

  • Load and manipulate real datasets
  • Answer questions with data

✏️ Exercise 3: Basic Plotting with Plotly (20 min)

  • Create visualizations from data
  • Hands-on exercises

Total Time: ~3 hours


🛠️ Technology Stack

Our modern, industry-standard tools:

  • Python - The most popular data science language
  • UV ⚡ - Lightning-fast package manager
  • Marimo 📓 - Reactive Python notebooks
  • Polars 🐻‍❄️ - Blazing-fast data manipulation
  • Plotly 📊 - Interactive visualizations
  • GitHub Copilot 🤖 - AI coding assistant

📁 Repository Structure

📦 IntroDataScience/
├── 📓 example_notebooks/  → Learning notebooks (start here!)
│   ├── 01_python_basics.py
│   ├── 02_data_wrangling.py
│   └── 03_plotting.py
│
├── ✏️ exercises/           → Practice problems
│   ├── ex01_fundamentals.py
│   ├── ex02_wrangle_and_plot.py
│   └── ex03_mini_project.py
│
│
├── 📊 data/                → Sample datasets
│   └── raw/
│       ├── students.csv
│       ├── sales.json
│       └── weather.parquet
│
└── 📚 docs/                → Tool guides and documentation

✅ Progress Tracker

Track your journey through the course:

Setup

  • Install VS Code and extensions
  • Install Python
  • Install UV and dependencies
  • Install Git and clone repository
  • Run first notebook successfully

Learn (Example Notebooks)

  • 01: Python Basics (20 min)
  • 02: Data Wrangling (20 min)
  • 03: Plotting (20 min)

Practice (Exercises)

  • Exercise 1: Fundamentals (20 min)
  • Exercise 2: Basic Data Wrangling with Polars (20 min)
  • Exercise 3: Basic Plotting with Plotly (20 min)

You're Done! 🎉

  • Share your work
  • Keep learning with your own data

💡 Getting Help

  • Read error messages carefully - They often tell you exactly what's wrong!
  • Check the docs - Each tool we use has a guide in the /docs folder
  • Ask your neighbor - Pair programming can improve efficancy and help you learn
  • Ask GitHub Copilot - If you have it installed, use Copilot Chat for help
  • Raise your hand - A DataThink representative will be happy to help if you get stuck

Ready to start? Jump to example_notebooks/01_python_basics.py 🚀

Contributors

Languages