P1-M01 - Python Programming Fundamentals

Part 1 — Universal Foundation · Module 01 of 04

Python Programming Fundamentals

Master Python from scratch — the language of AI engineering

⏱ 3 Weeks 🟢 Beginner 🐍 Python 3 📋 Prerequisites: None 🛠 VS Code / Jupyter / Colab

🎯

What This Module Covers

Foundation

Python is the language of AI engineering. Full stop. Almost every library, API, and tutorial you will encounter over the next six months is in Python. This module takes you from zero to functional Python developer — able to write clean programs, handle files and APIs, and manage a codebase.

Core syntax — variables, data types, operators, strings, f-strings
Data structures — lists, tuples, dictionaries, sets and their use-cases
Control flow — if/elif/else, for loops, while loops, break/continue
Functions — parameters, return values, *args/**kwargs, lambda, list comprehensions
File I/O — reading and writing text and CSV files
Error handling — try/except/finally for robust production code
OOP basics — classes, objects, __init__, methods, encapsulation
Environment management — venv, pip, requirements.txt

⚡ SKIP IF: You already program in C/C++/Java — Python syntax for variables, loops, conditionals, functions, and basic OOP will feel very familiar. Spend 2–3 days scanning the syntax differences (no semicolons, indentation-based blocks, dynamic typing) and jump straight to the data structures section. The venv and pip section is worth reading regardless.

🗺️

Why Python for AI Engineering

Context

Python dominates AI/ML for concrete reasons — not just popularity:

Library ecosystem — NumPy, Pandas, Scikit-learn, PyTorch, LangChain, FastAPI are all Python-first
API SDKs — OpenAI, Anthropic, HuggingFace all ship Python SDKs as their primary interface
Rapid prototyping — interactive Jupyter notebooks let you experiment and iterate faster than compiled languages
Glue language — Python is the orchestration layer that connects your LLM, vector DB, REST API, and deployment pipeline
Job market — the large majority of AI/ML job postings list Python as the primary language

The goal of this module is not to become a Python expert; it is to stop Googling basic syntax and to build simple programs confidently.

🔗

Module Connections

Dependencies

This module feeds directly into:

P1-M02 (NumPy & Pandas) — requires list comprehensions, classes, and file I/O
P1-M03 (Dev Essentials) — requires understanding of pip, venv, and JSON handling
P1-M04 (FastAPI) — requires OOP, type hints, and async/await understanding
P4 (LLM APIs) — every API call is Python. Structured outputs use Pydantic (Python classes)

C/C++/Java background? Here is what maps directly:

Java classes → Python classes (simpler syntax, no access modifiers)
C arrays → Python lists (dynamic, mixed types)
C++ STL map → Python dict
Java try/catch → Python try/except

📦

Variables, Types and Operators

Week 1

Python is dynamically typed: you do not declare variable types. Each value carries its own type, and the interpreter checks types at runtime rather than at compile time.

# Basic types
name    = "Ajay"          # str
age     = 28              # int
salary  = 85000.50        # float
active  = True            # bool
nothing = None            # NoneType

# Type checking and conversion
print(type(name))          # <class 'str'>
print(int("42"))           # 42  — explicit cast
print(str(100))            # "100"

# f-strings — the professional way to format
print(f"Hello {name}, age {age}")   # Hello Ajay, age 28
print(f"{salary:.2f}")              # 85000.50

💡 Unlike C/C++, Python variables are references, not memory slots. When you write x = 5, Python creates an integer object with value 5 and binds the name x to it. This matters for understanding mutability later.

🗂️

Core Data Structures

Week 1–2

List — ordered, mutable, allows duplicates

items = ["apple", "banana", "cherry"]
items.append("date")          # add to end
items.insert(1, "avocado")     # insert at index
items.pop()                    # remove last
print(items[0])               # "apple" — 0-indexed
print(items[-1])              # last element
print(items[1:3])             # slice [1,3) = ["avocado","banana"]

# List comprehension — Pythonic and fast
squares = [x**2 for x in range(10) if x % 2 == 0]
# [0, 4, 16, 36, 64]

Dictionary — key-value store, O(1) average-case lookup

user = {"name": "Ajay", "age": 28, "city": "Mumbai"}
user["email"] = "ajay@example.com"   # add key
user.get("phone", "N/A")            # safe get with default

# Dict comprehension
word_len = {w: len(w) for w in ["python", "java", "c++"]}
# {"python": 6, "java": 4, "c++": 3}

# Iterating
for key, val in user.items():
    print(f"{key}: {val}")

Tuple — ordered, immutable

coords = (19.07, 72.87)          # lat, lon of Mumbai
lat, lon = coords                  # tuple unpacking

# Use tuples for fixed data that should not change
# e.g. HTTP status codes, RGB colours, database records
HTTP_OK = (200, "OK")

Set — unordered, unique elements

tags = {"python", "ml", "llm", "python"}  # duplicates removed
print(tags)  # {"python", "ml", "llm"} in some order; sets are unordered

# Set operations — fast membership testing O(1)
a = {1,2,3,4}
b = {3,4,5,6}
print(a & b)   # {3, 4}  — intersection
print(a | b)   # {1,2,3,4,5,6} — union
print(a - b)   # {1, 2}  — difference

🔄

Functions and Error Handling

Week 2–3

# Basic function with type hints (good practice)
def greet(name: str, greeting: str = "Hello") -> str:
    return f"{greeting}, {name}!"

# *args — variable positional arguments
def total(*numbers):
    return sum(numbers)

print(total(1, 2, 3, 4))   # 10

# **kwargs — variable keyword arguments
def create_profile(**fields):
    return {k: v for k, v in fields.items()}

profile = create_profile(name="Ajay", role="engineer")

# Lambda — one-line anonymous function
square = lambda x: x ** 2
print(sorted([3,1,4], key=lambda x: -x))  # [4, 3, 1]

# Error handling — always handle specific exceptions
import json   # module-level import, so json is always defined when the except clauses run

def read_config(path: str) -> dict:
    try:
        with open(path, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"Config not found: {path}")
        return {}
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e}")
        return {}
    finally:
        print("Config read attempted")   # always runs
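The Week 3 plan also lists custom exceptions. A minimal sketch of the idea follows, assuming a hypothetical ConfigError class and require_key helper (neither name comes from this module):

```python
class ConfigError(Exception):
    """Hypothetical custom exception for configuration problems."""


def require_key(config: dict, key: str):
    # Raise a domain-specific error instead of a bare KeyError,
    # so callers can catch configuration problems in one place.
    if key not in config:
        raise ConfigError(f"Missing required config key: {key}")
    return config[key]


try:
    require_key({"model": "gpt-4"}, "temperature")
except ConfigError as e:
    message = str(e)
    print(f"Config problem: {message}")
```

Subclassing Exception is usually enough; the class body can stay empty apart from a docstring.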

🏗️

Object-Oriented Programming

Week 3

# Class definition — blueprint for objects
class BankAccount:
    # Class variable (shared by all instances)
    bank_name = "PyBank"

    def __init__(self, owner: str, balance: float = 0.0):
        # Instance variables (unique per object)
        self.owner = owner
        self._balance = balance    # _prefix = convention for private

    def deposit(self, amount: float) -> None:
        if amount <= 0:
            raise ValueError("Amount must be positive")
        self._balance += amount

    def withdraw(self, amount: float) -> float:
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        self._balance -= amount
        return amount

    @property
    def balance(self) -> float:        # getter — access like attribute
        return self._balance

    def __repr__(self) -> str:
        return f"BankAccount({self.owner!r}, {self._balance})"

# Usage
acc = BankAccount("Ajay", 1000)
acc.deposit(500)
print(acc.balance)   # 1500

💡 Python OOP is simpler than Java/C++ — no access modifiers, no header files. Convention: single underscore _name means "please don't touch this" (not enforced). Double underscore __name triggers name mangling (the attribute is stored as _ClassName__name), which avoids accidental clashes in subclasses but is not true privacy.
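A small sketch of the two underscore conventions in the note above, using a hypothetical Account class. It suggests that name mangling renames the attribute rather than hiding it:

```python
class Account:
    def __init__(self):
        self._balance = 100      # single underscore: convention only
        self.__pin = 1234        # double underscore: stored as _Account__pin


acc = Account()
print(acc._balance)              # works, but you are asked not to do this

try:
    acc.__pin                    # the un-mangled name does not exist outside the class
except AttributeError:
    pin_hidden = True

print(acc._Account__pin)         # the mangled name is still reachable
```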

📁

File I/O and JSON

Week 3

import json, csv
from pathlib import Path

# Writing and reading JSON (critical for LLM API work)
data = {"model": "claude-3", "temperature": 0.7, "tokens": [100, 200]}
Path("config.json").write_text(json.dumps(data, indent=2))
loaded = json.loads(Path("config.json").read_text())

# CSV reading — used constantly in data work
with open("students.csv", "r", newline="") as f:   # newline="" is what the csv module expects
    reader = csv.DictReader(f)
    students = list(reader)   # list of dicts, one per row

# CSV writing
with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows([{"name": "Ajay", "score": 95}])
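One detail that often surprises beginners: csv.DictReader returns every value as a string, so numeric columns need an explicit conversion. A short sketch, using an in-memory io.StringIO with made-up rows in place of a real students.csv:

```python
import csv
import io

# In-memory stand-in for a file such as students.csv (contents are made up).
raw = "name,score\nAjay,95\nMeera,88\n"

reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)
print(rows[0]["score"])          # the string '95', not the number 95

# Convert explicitly before doing arithmetic.
scores = [int(row["score"]) for row in rows]
average = sum(scores) / len(scores)
print(average)
```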

🌍

Virtual Environments and Package Management

Essential

Every project must have its own virtual environment. This is non-negotiable — it prevents dependency conflicts between projects.

# Create and activate virtual environment
python -m venv .venv                 # create
source .venv/bin/activate            # Linux/Mac
.venv\Scripts\activate              # Windows

# Install packages
pip install requests pandas numpy    # install
pip install openai anthropic         # AI SDKs

# Freeze and restore dependencies
pip freeze > requirements.txt        # save exact versions
pip install -r requirements.txt      # restore on new machine

# Deactivate
deactivate

⚠️ Never install packages globally — always activate your venv first. Global installs create conflicts that are painful to debug. Add .venv/ to your .gitignore — never commit the venv folder.
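If you are unsure whether a venv is active, Python itself can tell you. A minimal sketch, assuming a hypothetical helper named in_virtualenv:

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Virtual environment active: {in_virtualenv()}")
```

Run it before and after activating .venv; the interpreter path should change to one inside the .venv folder.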

🔍

Mutable vs Immutable — The Most Common Bug Source

Critical

Understanding mutability prevents an entire class of bugs that trip up engineers coming from C/C++/Java.

# Immutable — int, str, tuple, float, bool
x = 5
y = x
y = 10
print(x)   # Still 5 — y got a new object

# Mutable — list, dict, set
a = [1, 2, 3]
b = a              # b points to SAME list as a
b.append(4)
print(a)           # [1, 2, 3, 4]  ← a changed!

# Fix: explicit copy
b = a.copy()       # shallow copy
b = a[:]           # slice copy — same result
import copy
b = copy.deepcopy(a)   # deep copy for nested structures

# Dangerous default argument anti-pattern
def add_item(item, lst=[]):    # BAD — lst shared across calls!
    lst.append(item)
    return lst

# Correct pattern
def add_item(item, lst=None):
    if lst is None:
        lst = []
    lst.append(item)
    return lst
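The difference between a shallow and a deep copy only shows up with nested structures. A small sketch with a made-up nested list:

```python
import copy

matrix = [[1, 2], [3, 4]]

shallow = matrix.copy()          # new outer list, same inner lists
deep = copy.deepcopy(matrix)     # new outer list and new inner lists

shallow[0].append(99)            # mutates the inner list shared with matrix
deep[1].append(-1)               # only affects the deep copy

print(matrix)                    # [[1, 2, 99], [3, 4]]
print(shallow[0] is matrix[0])   # True
print(deep[0] is matrix[0])      # False
```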

⚡

Comprehensions and Functional Patterns

Pythonic Code

# List comprehension — replaces most for loops
even_squares = [x**2 for x in range(20) if x % 2 == 0]

# Dict comprehension — used constantly with API responses
response_data = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
id_map = {item["id"]: item["name"] for item in response_data}
# {1: "alice", 2: "bob"}

# Generator — lazy evaluation, memory efficient for large data
def token_chunks(text: str, size: int):
    words = text.split()
    for i in range(0, len(words), size):
        yield " ".join(words[i:i+size])

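# Feed a long document to an LLM in pieces of 500 words each
# (long_document and process are placeholders, not defined here)
for chunk in token_chunks(long_document, 500):
    process(chunk)   # chunks are produced one at a time; the source text itself is already in memory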

# zip and enumerate — essential for pairing data
names  = ["alice", "bob", "charlie"]
scores = [85, 92, 78]
for i, (name, score) in enumerate(zip(names, scores)):
    print(f"{i}: {name} = {score}")
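When the text lives in a file, the same generator idea can avoid holding the whole document at once, because iterating a file object reads one line at a time. A sketch, assuming a hypothetical line_chunks helper and an io.StringIO object standing in for a real file:

```python
import io
from typing import Iterator, TextIO


def line_chunks(stream: TextIO, lines_per_chunk: int) -> Iterator[str]:
    # Hypothetical helper: group consecutive lines into chunks
    # while reading the stream one line at a time.
    buffer = []
    for line in stream:
        buffer.append(line.rstrip("\n"))
        if len(buffer) == lines_per_chunk:
            yield "\n".join(buffer)
            buffer = []
    if buffer:                       # emit the final, possibly shorter chunk
        yield "\n".join(buffer)


fake_file = io.StringIO("line 1\nline 2\nline 3\nline 4\nline 5\n")
chunks = list(line_chunks(fake_file, 2))
print(chunks)
```

With a real file you would pass the object returned by open(path) instead of fake_file.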

🔌

Modules, Imports and Project Structure

Production Habit

# Standard imports
import os, sys, json, csv
from pathlib import Path
from typing import Optional, List, Dict, Any

# Third-party imports (installed via pip)
import requests
from openai import OpenAI

# Relative imports in your own package
from .utils import format_response
from ..config import API_KEY

# Typical project structure
# my-ai-app/
# ├── main.py          ← entry point
# ├── config.py        ← constants, env vars
# ├── models/          ← Pydantic schemas
# │   └── __init__.py
# ├── services/        ← business logic
# │   ├── __init__.py
# │   └── llm.py
# ├── requirements.txt
# └── .env             ← secrets (never commit!)

# Reading environment variables (secrets pattern)
import os
from dotenv import load_dotenv
load_dotenv()                                 # loads .env file
api_key = os.environ.get("OPENAI_API_KEY")   # never hardcode keys
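os.environ.get returns None when the variable is missing, which tends to surface later as a confusing authentication error. One possible pattern is to fail early. The helper name get_required_env and the variable DEMO_API_KEY below are illustrative assumptions:

```python
import os


def get_required_env(name: str) -> str:
    # Hypothetical helper: fail early with a clear message
    # instead of passing None to an API client later.
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


os.environ["DEMO_API_KEY"] = "sk-demo"      # set here only so the example runs
print(get_required_env("DEMO_API_KEY"))
```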

⏳

Async/Await — Critical for LLM APIs

Month 2 Preview

LLM API calls are I/O-bound — they wait for network responses. Async Python lets your program do other work while waiting, instead of blocking.

import asyncio

# Sync version: each call blocks for 1 second, so 3 calls in a row take ~3 seconds
import time
def fetch_sync():
    time.sleep(1)   # simulates API call
    return "result"

# Async version — runs concurrently, total ~1 second
async def fetch_async():
    await asyncio.sleep(1)   # yields control while waiting
    return "result"

async def main():
    # Run 3 API calls concurrently
    results = await asyncio.gather(
        fetch_async(),
        fetch_async(),
        fetch_async()
    )
    return results

asyncio.run(main())   # entry point for async code

# Anthropic async client pattern (Month 2)
# async with anthropic.AsyncAnthropic() as client:
#     response = await client.messages.create(...)

💡 You do not need to master async now. The key insight is: async def defines a coroutine (a function that can pause), and await is where it pauses to let other work run. You will use this constantly when calling LLM APIs and building FastAPI endpoints.
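To see the difference for yourself, you can time sequential awaits against asyncio.gather. This is a rough sketch with a simulated call (fake_call is a stand-in, not a real API client), and exact timings will vary by machine:

```python
import asyncio
import time


async def fake_call(delay: float) -> str:
    await asyncio.sleep(delay)       # stands in for a network request
    return "result"


async def run_sequential(n: int, delay: float) -> float:
    start = time.perf_counter()
    for _ in range(n):
        await fake_call(delay)       # each call finishes before the next starts
    return time.perf_counter() - start


async def run_concurrent(n: int, delay: float) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(fake_call(delay) for _ in range(n)))
    return time.perf_counter() - start


sequential = asyncio.run(run_sequential(3, 0.2))
concurrent = asyncio.run(run_concurrent(3, 0.2))
print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

The sequential run should take roughly three times the delay, the concurrent run roughly one.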

3-WEEK STRUCTURED PLAN

Week	Topics	Daily Task / Mini-Project
Week 1	Install Python 3.10+ and VS Code. Variables, data types, type casting, string methods and f-strings. Lists — indexing, slicing, list methods. Tuples vs lists. Control flow: if/elif/else, for loops, while loops, break/continue/pass.	Day 1–2: Unit converter (km↔miles, °C↔°F). Day 3: String palindrome checker. Day 4–5: Shopping list CLI using lists (add, remove, display). Day 6–7: Number guessing game with while loop + score tracker.
Week 2	Dictionaries — CRUD operations, nested dicts, dict comprehensions. Sets — union, intersection, difference. Functions — defining, default args, *args/**kwargs, lambda, list comprehensions. Modules and import system.	Day 1–2: Phone book CLI using dictionaries (add, search, delete, update). Day 3–4: Grade classifier using if/elif (A/B/C/D/F with GPA). Day 5–7: Word frequency counter — takes a text file, returns top-10 words using dicts + sorted + lambda.
Week 3	File I/O — open(), read(), write() with text and CSV. JSON — json.loads(), json.dumps(), working with nested structures. Error handling — try/except/finally, custom exceptions. OOP — classes, __init__, methods, @property. venv + pip + requirements.txt.	Day 1–2: CSV reader/writer for student grade data. Day 3–4: Bank Account class with deposit, withdraw, balance property. Day 5–7: Full milestone project — CLI Student Grade Management System (see Milestone Project below).

⚙️

Environment Setup — Do This First

Day 1

# 1. Install Python 3.10+ from python.org
# 2. Install VS Code + Python extension (Microsoft)
# 3. Or use Google Colab — zero setup, free GPU
#    https://colab.research.google.com/

# Verify installation
python --version    # Python 3.10.x or higher
pip --version       # pip 23.x or newer

# Install core packages you will use throughout Part 1
pip install jupyter numpy pandas matplotlib requests python-dotenv

💡

The Most Important Learning Habit

Meta-Skill

The most common beginner mistake is consuming content passively — reading along, nodding, and never opening a code editor. Every concept in this module must be typed out and run. Not copy-pasted. Typed. Your fingers need to know the syntax before your brain does.

Open a Python REPL (python in terminal) and experiment immediately after each concept
Every error message is a learning opportunity — read it fully before searching
Push every mini-project to GitHub, even if it is 20 lines
If something works but you don't know why — break it intentionally and observe

FREE LEARNING RESOURCES

Type	Resource	Best For
Course	CS50P — Introduction to Programming with Python (Harvard, Free)	Best free Python course. Structured problem sets. Certificate on completion.
Video	Python for Beginners — freeCodeCamp (YouTube, 4.5 hrs)	Single video covering all fundamentals. Watch at 1.5x after Week 1.
Course	Python for Everybody — Coursera (Free to audit)	Best for absolute beginners. Dr. Chuck is exceptionally clear.
Docs	Official Python Tutorial — python.org	Authoritative reference. Dry but precise. Use as lookup, not primary resource.
Course	Kaggle Python Course (Free, Interactive)	Hands-on exercises with instant feedback. Great for Week 1–2.
Book	Automate the Boring Stuff with Python (Free online)	Project-oriented. Best book for building real scripts in Week 3.
Video	Corey Schafer — Python OOP Tutorials (YouTube Playlist)	Best OOP explanation for engineers coming from Java/C++.
Tool	Google Colab — Free Cloud Jupyter Notebooks	Zero setup. Free GPU. Use if local setup is painful.

PRACTICE DATASET

Type	Resource	Used In
Dataset	UCI Student Performance Dataset	Milestone project — CLI Grade Management System

MILESTONE PROJECT

🛠 CLI Student Grade Management System [Beginner] 3–4 days · Week 3

Build a command-line application that manages student grade data using Python fundamentals. This project tests every concept from the module in a cohesive real-world context.

Requirements

Reads student data from a CSV file (name, subject scores)
Calculates grade (A/B/C/D/F), GPA, and class rank for each student
Supports filtering by subject or grade range
Sorts students by any column (name, GPA, specific subject)
Handles invalid input gracefully with try/except (missing file, bad data)
Writes a cleaned summary CSV as output
CLI menu: view all / search by name / filter / sort / export / quit

Stretch Goals

Add a Student class with methods (calculate_gpa, get_grade, __repr__)
Store data in JSON format as an alternative to CSV
Add a simple stats report: class average, highest/lowest scorer, grade distribution

Skills demonstrated: File I/O, CSV handling, dictionaries, lists, functions, error handling, OOP basics, sorting with lambda, string formatting

Dataset: UCI Student Performance Dataset or create your own CSV

Push to GitHub with a README describing what the tool does and how to run it.

MINI-PROJECTS (WEEKLY)

🛠 Week 1 — Unit Converter CLI · 1–2 days

Build a CLI tool that converts between: km↔miles, °C↔°F, kg↔lbs. Menu-driven loop. Handles invalid input. Demonstrates: variables, type casting, f-strings, conditionals, while loop.
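One way to structure the converter is to keep the conversion and input-checking logic in plain functions and wrap them in a while loop with input() afterwards. The function names and menu numbering below are assumptions, and only two of the conversions are shown:

```python
def km_to_miles(km: float) -> float:
    return km * 0.621371


def celsius_to_fahrenheit(c: float) -> float:
    return c * 9 / 5 + 32


def parse_number(text: str):
    # Return a float, or None when the text is not a valid number.
    try:
        return float(text)
    except ValueError:
        return None


def convert(choice: str, raw_value: str) -> str:
    value = parse_number(raw_value)
    if value is None:
        return f"'{raw_value}' is not a number"
    if choice == "1":
        return f"{value} km = {km_to_miles(value):.2f} miles"
    elif choice == "2":
        return f"{value} °C = {celsius_to_fahrenheit(value):.1f} °F"
    else:
        return "Unknown option"


print(convert("1", "10"))
print(convert("2", "100"))
print(convert("1", "ten"))
```

Keeping input() out of these functions makes them easy to try in the REPL before you build the menu loop around them.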

🛠 Week 2 — Word Frequency Analyser · 2–3 days

Takes a .txt file as input. Returns the top-10 most frequent words (excluding common stop words). Uses: file I/O, dicts, sorted() with lambda, set for stop words. Try it on a book chapter from Project Gutenberg.
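The counting and ranking step might look like the sketch below. It works on a string rather than a file so you can try it in the REPL; the function name top_words, the sample sentence, and the tiny stop-word set are all made up for illustration:

```python
def top_words(text: str, stop_words: set, n: int = 10) -> list:
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")     # trim surrounding punctuation
        if word and word not in stop_words:
            counts[word] = counts.get(word, 0) + 1
    # Sort by count (descending), then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


sample = "The cat sat. The cat ran, and the dog sat!"
print(top_words(sample, stop_words={"the", "and"}, n=3))
```

For the project itself, read the text from the .txt file first and pass it in.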

🛠 Week 3 — Public API Script · 1–2 days

Call the Open-Meteo weather API (no API key needed) using the requests library. Format and print a 7-day forecast. Save the raw JSON response to a file. Push to GitHub with README. This is a preview of Month 2's API work.

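import json
import requests

url = "https://api.open-meteo.com/v1/forecast?latitude=19.07&longitude=72.87&daily=temperature_2m_max&timezone=Asia/Kolkata"
r = requests.get(url, timeout=10)   # fail instead of hanging on a slow network
r.raise_for_status()                # raise on HTTP error status codes
data = r.json()
print(json.dumps(data, indent=2))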

LAB 1

Python REPL Exploration — Types and Mutability

Objective: Build intuition for Python's type system and mutability through hands-on exploration in the REPL.

Open a terminal and run python3 (or python on Windows). You are now in the interactive REPL (Read-Eval-Print Loop). Type expressions and see results immediately.

Run these lines one by one and note the output: x = [1, 2, 3], then y = x, then y.append(4), then print(x). Did x change? Why?

Now try: a = "hello", then b = a, then b = b + " world", then print(a). Did a change? Why not? What is the key difference between strings and lists?

Run import sys then sys.getsizeof([]) vs sys.getsizeof([1,2,3,4,5]). See memory usage grow. Try the same with a string of different lengths.

Test the mutable default argument bug: define def add(x, lst=[]): lst.append(x); return lst. Call it three times: add(1), add(2), add(3). What happens? Fix the function.

Bonus: Use id() to see object identity: a = [1,2,3]; b = a; print(id(a) == id(b)). Then do b = a.copy(); print(id(a) == id(b)). What changes?

LAB 2

Build a JSON Config Reader with Error Handling

Objective: Write production-quality Python that reads configuration files robustly — a pattern you will use in every AI project.

Create a file called config.json with this content: {"model": "gpt-4", "temperature": 0.7, "max_tokens": 1000, "api_key": "sk-test"}

Write a function load_config(path: str) -> dict that reads this file. Use try/except to handle: FileNotFoundError (return empty dict), json.JSONDecodeError (print error, return empty dict), PermissionError (print error, return empty dict).

Add a validate_config(config: dict) -> bool function that checks: "model" key exists, "temperature" is between 0 and 2, "max_tokens" is a positive integer. Return True only if all pass.

Test with: valid config, missing file, invalid JSON (break the JSON manually), missing required key, temperature = 3.0. Confirm each error is handled cleanly.

Add a save_config(config: dict, path: str) -> None function that writes back to JSON with 2-space indentation. Add a timestamp field: config["last_updated"] = datetime.now().isoformat()

Extension: Use os.environ.get() to override the api_key from an environment variable instead of reading it from the file. This is the secure pattern used in all production AI projects.
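If you get stuck on the validation step, here is one possible shape for validate_config. Treating the 0 and 2 temperature bounds as inclusive and rejecting booleans are assumptions, not requirements stated in the lab:

```python
def validate_config(config: dict) -> bool:
    if "model" not in config:
        return False
    temperature = config.get("temperature")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        return False
    if not 0 <= temperature <= 2:
        return False
    max_tokens = config.get("max_tokens")
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int):
        return False
    return max_tokens > 0


print(validate_config({"model": "gpt-4", "temperature": 0.7, "max_tokens": 1000}))
print(validate_config({"model": "gpt-4", "temperature": 3.0, "max_tokens": 1000}))
```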

LAB 3

OOP — Build a Student Registry Class

Objective: Apply OOP concepts to build a reusable data class — a preview of the Pydantic models you will use in Part 4.

Create a Student class with: __init__(self, name, scores: dict) where scores is a dict of subject→score pairs. Store both as instance variables.

Add a gpa property that calculates the average of all scores. Add a grade property that returns "A" if gpa >= 90, "B" if >= 80, etc.

Add __repr__ and __str__ methods. __repr__ should be unambiguous (useful for debugging). __str__ should be human-readable.

Create a StudentRegistry class that holds a list of Student objects. Add methods: add(student), find(name), top_n(n) (returns top n by GPA), class_average().

Add a to_csv(path) instance method and a from_csv(path) class method (using @classmethod) to the registry for persistence. Test the full round-trip: create → save → load → query.
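As a starting point for the Student class only, a sketch under stated assumptions: the lab fixes the cut-offs for A and B, so the 70 and 60 cut-offs for C and D here are guesses, and returning 0.0 for an empty scores dict is one choice among several.

```python
class Student:
    def __init__(self, name: str, scores: dict):
        self.name = name
        self.scores = scores

    @property
    def gpa(self) -> float:
        # Average of all subject scores; 0.0 when there are no scores.
        if not self.scores:
            return 0.0
        return sum(self.scores.values()) / len(self.scores)

    @property
    def grade(self) -> str:
        # The 70 and 60 cut-offs are assumptions; the lab only fixes A and B.
        for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
            if self.gpa >= cutoff:
                return letter
        return "F"

    def __repr__(self) -> str:
        return f"Student({self.name!r}, {self.scores!r})"

    def __str__(self) -> str:
        return f"{self.name}: GPA {self.gpa:.1f} ({self.grade})"


s = Student("Ajay", {"math": 92, "physics": 84})
print(repr(s))
print(s)
```

The StudentRegistry class and the CSV round-trip are left for you to build on top of this.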

P1-M01 MASTERY CHECKLIST

Can explain the difference between mutable and immutable types and give one real bug this causes
Can write a list comprehension that filters and transforms a list in one line
Can define a function with default arguments, *args, and **kwargs and explain when to use each
Know the difference between a list, tuple, set, and dict — and when to use each
Can read and write JSON and CSV files using the standard library
Can handle FileNotFoundError, json.JSONDecodeError, and ValueError cleanly with try/except
Can create a class with __init__, instance variables, properties, and __repr__
Know what a virtual environment is, can create one, activate it, and install packages
Can read an API key from environment variables (not hardcoded in source)
Know what async def and await mean conceptually and why LLM APIs use them
Completed Lab 1: REPL exploration of types and mutability
Completed Lab 2: JSON config reader with full error handling
Completed Lab 3: Student class with OOP patterns
Milestone project pushed to GitHub with README

✅ When complete: Move to P1-M02 — NumPy & Pandas Data Toolkit. The list/dict/CSV skills you built here directly underpin everything in NumPy array indexing and Pandas DataFrame operations.
