Week 2: Data Structures & File Handling

👋 Welcome to Week 2: Data Structures & File Handling!

This week, we level up from basic Python to working with collections of data. Think of Week 1 as learning individual notes on a piano—this week, you're learning chords. You'll discover how to organize, store, retrieve, and persist data using Python's powerful built-in data structures.

🎯 Learning Objectives

Lists — Work with ordered, mutable collections of items
Dictionaries — Store and retrieve data using key-value pairs
Tuples & Sets — Understand immutable sequences and unique collections
File Operations — Read from and write to text and CSV files
Modules & Packages — Import external code and use pip for package management
Best Practices — Write clean, Pythonic code that handles errors gracefully

📚 Key Vocabulary

Mutable Can be modified after creation. Lists and dictionaries are mutable—you can add, remove, or change their contents.

Immutable Cannot be changed after creation. Strings, tuples, and numbers are immutable—any "change" creates a new object.

Iterable Any object you can loop over with for item in collection. Lists, tuples, strings, and dictionaries are all iterables.

Key-Value Pair A data structure where each key maps to a value, like a real dictionary maps words to definitions.

Module A Python file containing reusable code (functions, classes, variables). Import with import module_name.

Package A collection of modules organized in directories. Install external packages with pip install package_name.

CSV Comma-Separated Values—a simple file format for storing tabular data. Each line is a row, commas separate columns.

Context Manager Python's with statement that automatically handles resource cleanup (like closing files). Use with open(file) as f:.

💡 How to Use This Lesson

Progressive Examples: Each topic shows multiple approaches—from basic working code to professional best practices. Pay attention to WHY each version improves on the previous one.

Run the Code: Click the "Run" buttons to see output. Modify examples and break things on purpose—that's how you learn!

Read Comments Carefully: Code comments explain the "why" behind each decision. They're teaching tools, not just documentation.

Practice with Real Data: After each section, try the concepts with your own data. The best way to learn is by doing.

📋 Lists: Dynamic Ordered Collections

Lists are Python's most versatile data structure. Think of them as dynamic arrays that can grow, shrink, and hold any type of data. They're ordered (items have positions), mutable (you can change them), and allow duplicates.

🎧 Analogy: Lists are Playlists

A list is like your music playlist: the order matters, you can add and remove songs, and you can have the same song twice if you want. Index 0 is the first track.

Creating and Accessing Lists

Python ✓ BEST PRACTICE

# ═══ CREATING LISTS ═══
# Empty list
tasks = []

# List with initial values
fruits = ['apple', 'banana', 'cherry']

# Lists can hold mixed types (but usually don't in practice)
mixed = [42, 'hello', 3.14, True]

# ═══ ACCESSING ITEMS ═══
print(fruits[0])        # First item: 'apple'
print(fruits[-1])       # Last item: 'cherry'
print(fruits[1:3])      # Slice: ['banana', 'cherry']

# ═══ MODIFYING LISTS ═══
fruits[0] = 'orange'    # Change first item
fruits.append('date')     # Add to end
fruits.insert(1, 'kiwi')  # Insert at index 1
fruits.remove('banana')   # Remove by value
removed = fruits.pop()     # Remove and return last item

print(fruits)  # ['orange', 'kiwi', 'cherry']

🚀 Try It Yourself: To‑Do List Manager

Create a list of tasks. Add tasks with .append(), insert urgent tasks, remove completed tasks, and print the remaining list.

📓 Practice in Notebook

Open notebook-sessions/week2/session1_data_structures.ipynb and implement the to‑do list manager with your own tasks.

🏋️ Exercise: Shopping Cart

Build a shopping cart system:

Create an empty cart list
Add 5 items using .append()
Insert a "priority" item at index 0
Remove 2 items using .remove() and .pop()
Print the final cart and its length

Bonus: Store items as (name, price) tuples and calculate total!

🌟 Challenge: Grade Book Statistics

Create a grade tracking system:

Store grades: [85, 92, 78, 95, 88, 76, 91]
Calculate: average, highest, lowest grades
Count passing grades (>= 70)
Create letter grades list (A/B/C/D/F)

# Hint: Use list comprehensions!
passing = [g for g in grades if g >= 70]

💡 List Indexing: Zero-Based and Negative

Zero-based indexing: The first item is at index 0, the second at 1, etc. This is standard in most programming languages.

Negative indexing: Count backwards from the end. -1 is the last item, -2 is second-to-last, etc.

Slicing: list[start:end] gets items from start up to (but not including) end. Omit start for beginning, omit end for end of list.
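A quick sketch of these indexing and slicing rules in action. The `letters` list and the variable names are made up for illustration:

```python
# Illustrative list; `letters` is an assumption for this sketch
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]        # Zero-based: index 0 is the first item
last = letters[-1]        # Negative index: -1 is the last item
middle = letters[1:4]     # Start is included, end is excluded
head = letters[:2]        # Omit start: begin at index 0
tail = letters[3:]        # Omit end: run to the end of the list
whole = letters[:]        # Omit both: a shallow copy of the list

print(first, last)        # a e
print(middle)             # ['b', 'c', 'd']
print(head, tail)         # ['a', 'b'] ['d', 'e']
```

Note that a slice always returns a new list, so changing `whole` leaves `letters` untouched.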

Essential List Methods

Python ✓ BEST PRACTICE

numbers = [3, 1, 4, 1, 5, 9]

# ═══ COMMON OPERATIONS ═══
print(len(numbers))           # Length: 6
print(numbers.count(1))       # Count occurrences: 2
print(numbers.index(4))       # Find index of value: 2

# ═══ SORTING & REVERSING ═══
numbers.sort()                  # Sort in place (modifies original)
numbers.reverse()               # Reverse in place

sorted_copy = sorted(numbers)  # Returns new sorted list (doesn't modify original)

# ═══ LIST COMPREHENSIONS (Advanced but Pythonic) ═══
squares = [x**2 for x in range(1, 6)]  # [1, 4, 9, 16, 25]
evens = [x for x in numbers if x % 2 == 0]    # Filter evens

# ═══ ITERATION ═══
fruits = ['apple', 'banana', 'cherry']
for fruit in fruits:
    print(fruit)

# With index using enumerate()
for i, fruit in enumerate(fruits):
    print(f"{i}: {fruit}")

🎯 When to Use Lists

✓ Use lists when:

You need an ordered sequence of items
You'll add/remove items frequently
Order matters (first place, second place, etc.)
You need to access items by position (index)

✗ Don't use lists when:

You need fast lookups by key (use dictionaries)
You need to ensure uniqueness (use sets)
Data shouldn't change (use tuples)

🗃️ Dictionaries: Key-Value Mapping

Dictionaries are Python's implementation of hash maps or associative arrays. They store data as key-value pairs, like a real dictionary maps words to definitions. They keep insertion order (guaranteed since Python 3.7) but are accessed by key rather than by position, they're mutable (you can change them), and keys must be unique.

📖 Analogy: Phonebook / JSON Object

A dictionary is like a phonebook: you look up a person (key) and get their number (value). In web APIs, JSON objects map property names to values the same way.

Creating and Accessing Dictionaries

Python ✓ BEST PRACTICE

# ═══ CREATING DICTIONARIES ═══
# Empty dictionary
user = {}

# Dictionary with initial values
person = {
    'name': 'Alice',
    'age': 30,
    'city': 'New York',
    'is_student': False
}

# ═══ ACCESSING VALUES ═══
print(person['name'])           # 'Alice' (raises KeyError if key missing)
print(person.get('age'))       # 30 (returns None if key missing)
print(person.get('job', 'N/A'))  # 'N/A' (default value if key missing)

# ═══ ADDING/MODIFYING VALUES ═══
person['email'] = 'alice@example.com'  # Add new key-value pair
person['age'] = 31                        # Update existing value

# ═══ REMOVING VALUES ═══
del person['is_student']  # Remove key-value pair
removed = person.pop('city')     # Remove and return value

🚀 Try It Yourself: Contact Book

Build a contact book with names as keys and emails as values. Add new contacts, update emails, and safely read with .get().

📓 Practice in Notebook

Open notebook-sessions/week2/session1_data_structures.ipynb and implement the contact book with at least 5 entries.

🏋️ Exercise: Student Database

Build a student records system:

Create a dictionary with student IDs as keys
Each value should be another dictionary with name, grade, and major
Add 3 students, update one grade, and safely lookup a non-existent ID

students = {
    101: {'name': 'Alice', 'grade': 92, 'major': 'CS'},
    # Add more students...
}

🌟 Challenge: Word Frequency Counter

Count word occurrences in a sentence:

Take a sentence and split it into words
Count how many times each word appears
Find the most common word

text = "the quick brown fox jumps over the lazy dog the fox"
word_count = {}
for word in text.split():
    word_count[word] = word_count.get(word, 0) + 1

💡 Dictionary Access: Bracket vs .get()

Bracket notation dict[key]: Fast and direct, but raises KeyError if key doesn't exist. Use when you're certain the key exists.

The .get() method: Safer option that returns None (or a default value) if key is missing. Use when key might not exist.

Pro tip: Use .get() with a default value for optional configuration settings or user data that might be incomplete.
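Here is a small side-by-side sketch of the two access styles. The `settings` dictionary and its keys are hypothetical:

```python
# Hypothetical settings dict, for illustration only
settings = {'theme': 'dark'}

# Bracket access: fine when the key is guaranteed to exist
theme = settings['theme']

# .get() with a default: safe for optional keys
font_size = settings.get('font_size', 14)

# .get() without a default returns None for a missing key
language = settings.get('language')

# Bracket access on a missing key raises KeyError
try:
    port = settings['port']
except KeyError as err:
    port = None
    print(f"Missing key: {err}")   # Missing key: 'port'

print(theme, font_size, language, port)  # dark 14 None None
```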

Dictionary Methods and Iteration

Python ✓ BEST PRACTICE

scores = {'Alice': 95, 'Bob': 87, 'Charlie': 92}

# ═══ USEFUL METHODS ═══
print(scores.keys())      # dict_keys(['Alice', 'Bob', 'Charlie'])
print(scores.values())    # dict_values([95, 87, 92])
print(scores.items())     # dict_items([('Alice', 95), ...])

# Check if key exists
if 'Alice' in scores:
    print("Alice's score found!")

# ═══ ITERATION PATTERNS ═══
# Loop through keys
for name in scores:
    print(name)

# Loop through values
for score in scores.values():
    print(score)

# Loop through key-value pairs (MOST COMMON)
for name, score in scores.items():
    print(f"{name} scored {score}")

# ═══ DICTIONARY COMPREHENSION ═══
# Create dict from lists
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
people = {name: age for name, age in zip(names, ages)}

# Filter and transform
top_scorers = {name: score for name, score in scores.items() if score >= 90}

Real-World Example: User Profile

Python ✓ BEST PRACTICE

# ★ PRACTICAL EXAMPLE: Building a user profile system

# Create user profile with nested dictionaries
user_profile = {
    'username': 'alice123',
    'email': 'alice@example.com',
    'preferences': {
        'theme': 'dark',
        'notifications': True,
        'language': 'en'
    },
    'tags': ['python', 'data science', 'ml']
}

# Safe access to nested values
theme = user_profile.get('preferences', {}).get('theme', 'light')
print(f"User theme: {theme}")  # dark

# Update nested dictionary
user_profile['preferences']['notifications'] = False

# Merge dictionaries (Python 3.9+)
defaults = {'theme': 'light', 'font_size': 14}
user_prefs = user_profile['preferences']
combined = defaults | user_prefs  # user_prefs overrides defaults

# Alternative: .update() method (modifies in place)
defaults.update(user_prefs)

🎯 When to Use Dictionaries

✓ Use dictionaries when:

You need fast lookups by name/identifier (O(1) average time)
You're mapping relationships (user ID → user data)
You're working with JSON-like data
Keys are meaningful labels, not just positions

Common use cases:

Configuration settings: config = {'debug': True, 'port': 8000}
Counting occurrences: word_count = {'hello': 3, 'world': 2}
Caching results: cache = {input_val: computed_result}
Database records: user = {'id': 1, 'name': 'Alice'}
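To make the caching use case concrete, here is a minimal sketch. The function names `slow_square` and `cached_square` are invented for this example, and the "expensive" work is just a multiplication standing in for something slower:

```python
# Hypothetical cache and helper functions, for illustration only
cache = {}
computed = []   # Records which inputs were actually computed

def slow_square(n):
    """Stand-in for an expensive computation."""
    computed.append(n)
    return n * n

def cached_square(n):
    # Compute only on a cache miss; otherwise reuse the stored result
    if n not in cache:
        cache[n] = slow_square(n)
    return cache[n]

results = [cached_square(n) for n in [4, 4, 5, 4]]
print(results)    # [16, 16, 25, 16]
print(computed)   # [4, 5]  (each input was computed only once)
```

The dictionary lookup replaces repeated work, which is why fast key access matters here.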

⚠️ Common Dictionary Pitfalls

Modifying while iterating: Don't add or remove keys while looping through a dict; Python raises RuntimeError. Loop over a copy of the keys instead: for key in list(my_dict):

KeyError exceptions: Always use .get() for optional keys or wrap bracket access in try/except.

Unhashable keys: Keys must be hashable, which in practice means immutable. You can't use lists or dicts as keys; use a tuple instead (as long as everything inside it is hashable too).
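The first and third pitfalls are easy to reproduce. This sketch uses a made-up `stock` dictionary and triggers each error on purpose:

```python
# Hypothetical inventory, for illustration only
stock = {'apples': 0, 'pears': 4, 'plums': 0}

# Pitfall: deleting keys while looping over the dict itself
try:
    for item in stock:
        if stock[item] == 0:
            del stock[item]
except RuntimeError as err:
    print(f"RuntimeError: {err}")  # dictionary changed size during iteration

# Fix: loop over a snapshot of the keys instead
stock = {'apples': 0, 'pears': 4, 'plums': 0}
for item in list(stock):
    if stock[item] == 0:
        del stock[item]
print(stock)  # {'pears': 4}

# Pitfall: a list is unhashable, so it can't be a key
grid = {}
try:
    grid[[0, 0]] = 'origin'
except TypeError as err:
    print(f"TypeError: {err}")

# Fix: use a tuple of hashable items
grid[(0, 0)] = 'origin'
print(grid)   # {(0, 0): 'origin'}
```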

📦 Tuples & Sets: Immutable and Unique Collections

Tuples and sets round out Python's core data structures. Tuples are like immutable lists—perfect for fixed data that shouldn't change. Sets are unordered collections of unique items—great for membership testing and removing duplicates.

🔒 / 🎟️ Analogies: Locked Bag & Sticker Collection

Tuple: a locked bag—you can look inside, but you can’t change its contents. Set: a sticker collection—duplicates are automatically removed and order doesn’t matter.

Tuples: Immutable Sequences

Python ✓ BEST PRACTICE

# ═══ CREATING TUPLES ═══
# Tuples use parentheses (but they're optional for assignment)
coordinates = (10, 20)
rgb_color = (255, 128, 0)

# Single-element tuple needs trailing comma
single = (42,)  # Without comma, it's just a number in parentheses

# Empty tuple
empty = ()

# Tuple unpacking (very Pythonic!)
x, y = coordinates            # x=10, y=20
red, green, blue = rgb_color  # red=255, green=128, blue=0

# ═══ ACCESSING TUPLE ITEMS ═══
print(coordinates[0])  # 10 (indexing works like lists)
print(coordinates[-1]) # 20 (negative indexing works too)

# ═══ TUPLES ARE IMMUTABLE ═══
# coordinates[0] = 15  # This raises TypeError!

# But you can create a new tuple
new_coords = (15, coordinates[1])  # (15, 20)

💡 Why Use Tuples? Immutability Benefits

Memory efficient: Tuples typically use less memory than equivalent lists because their size is fixed at creation.

Dictionary keys: Unlike lists, tuples can be dictionary keys (they're hashable): positions = {(0, 0): 'origin'}

Data integrity: Use tuples for data that shouldn't be modified, like configuration constants or database records.

Function returns: Return multiple values from functions: return (status, result, error)
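A short sketch of these benefits. The byte counts come from `sys.getsizeof` and depend on your Python version and platform (the comparison assumes CPython); the `divide` function is a hypothetical example of the (status, result, error) pattern:

```python
import sys

as_list = [1, 2, 3]
as_tuple = (1, 2, 3)

# Exact sizes vary by version and platform, so none are shown here
print(sys.getsizeof(as_list), sys.getsizeof(as_tuple))
tuple_is_smaller = sys.getsizeof(as_tuple) < sys.getsizeof(as_list)

# Tuples are hashable, so they work as dictionary keys
positions = {(0, 0): 'origin', (1, 0): 'east'}
label = positions[(0, 0)]

# Returning several values at once (hypothetical helper)
def divide(a, b):
    """Return a (status, result, error) tuple."""
    if b == 0:
        return False, None, 'division by zero'
    return True, a / b, None

status, result, error = divide(10, 4)
print(label, status, result, error)  # origin True 2.5 None
```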

Common Tuple Use Cases

Python ✓ BEST PRACTICE

# ★ MULTIPLE RETURN VALUES
def get_user_info():
    name = "Alice"
    age = 30
    city = "NYC"
    return name, age, city  # Returns a tuple (parentheses optional)

user_name, user_age, user_city = get_user_info()  # Unpack immediately

# ★ SWAP VALUES WITHOUT TEMP VARIABLE
a = 5
b = 10
a, b = b, a  # Now a=10, b=5 (tuple packing and unpacking)

# ★ NAMED TUPLES (from collections module)
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x, p.y)  # More readable than p[0], p[1]

# ★ ENUMERATE WITH TUPLES
fruits = ['apple', 'banana', 'cherry']
for index, fruit in enumerate(fruits):
    print(f"{index}: {fruit}")  # enumerate returns (index, value) tuples

Sets: Collections of Unique Items

Python ✓ BEST PRACTICE

# ═══ CREATING SETS ═══
# Sets use curly braces (like dicts, but no key-value pairs)
unique_numbers = {1, 2, 3, 3, 4}  # Duplicates automatically removed
print(unique_numbers)  # {1, 2, 3, 4}

# Create set from list (removes duplicates)
list_with_dupes = [1, 2, 2, 3, 3, 3]
unique_set = set(list_with_dupes)  # {1, 2, 3}

# Empty set (can't use {} because that's an empty dict!)
empty_set = set()

# ═══ ADDING AND REMOVING ═══
tags = {'python', 'data'}
tags.add('ml')           # Add single item
tags.update(['ai', 'cloud'])  # Add multiple items
tags.remove('data')      # Remove (raises KeyError if not found)
tags.discard('data')     # Remove (silent if not found)

# ═══ MEMBERSHIP TESTING (FAST!) ═══
if 'python' in tags:
    print("Python tag found!")  # O(1) average time vs O(n) for lists

Set Operations: Mathematical Power

Python ✓ BEST PRACTICE

# ★ SET OPERATIONS
python_devs = {'Alice', 'Bob', 'Charlie'}
javascript_devs = {'Bob', 'Diana', 'Eve'}

# Union (all items from both sets)
all_devs = python_devs | javascript_devs
# Or: all_devs = python_devs.union(javascript_devs)
print(all_devs)  # {'Alice', 'Bob', 'Charlie', 'Diana', 'Eve'} (display order may vary)

# Intersection (items in both sets)
fullstack = python_devs & javascript_devs
# Or: fullstack = python_devs.intersection(javascript_devs)
print(fullstack)  # {'Bob'}

# Difference (items in first set but not second)
python_only = python_devs - javascript_devs
# Or: python_only = python_devs.difference(javascript_devs)
print(python_only)  # {'Alice', 'Charlie'}

# Symmetric difference (items in either set, but not both)
exclusive = python_devs ^ javascript_devs
print(exclusive)  # {'Alice', 'Charlie', 'Diana', 'Eve'}

# ★ SET COMPREHENSIONS
squares = {x**2 for x in range(10)}  # Contains 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 (display order may vary)

🚀 Try It Yourself: Dedupe Emails

Given a list of emails with duplicates, use a set to remove duplicates and print the unique addresses. Bonus: write them to a CSV.

📓 Practice in Notebook

Open notebook-sessions/week2/session2_data_structures_group.ipynb and collaborate on the dedupe exercise and tuple use cases.

🎯 When to Use Tuples vs Sets

✓ Use tuples when:

Data shouldn't change (coordinates, RGB values, database records)
You need hashable values for dict keys or set members
Returning multiple values from functions
Memory efficiency matters

✓ Use sets when:

You need to remove duplicates from a collection
Fast membership testing is important (if item in collection)
You're doing mathematical set operations (union, intersection)
Order doesn't matter and items must be unique

📊 Quick Comparison Table

Feature	Tuple	Set
Syntax	`(1, 2, 3)`	`{1, 2, 3}`
Ordered	✓ Yes	✗ No
Mutable	✗ No	✓ Yes
Duplicates	✓ Allowed	✗ Auto-removed
Indexing	✓ `t[0]`	✗ No indexing
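Each row of the table can be checked with a few lines of code. This is only a sketch; the variable names `t` and `s` are arbitrary:

```python
t = (3, 1, 3)   # Tuple: keeps order and duplicates
s = {3, 1, 3}   # Set: duplicates are dropped

print(len(t), len(s))   # 3 2
print(t[0])             # 3 (tuples support indexing)

# Sets have no positions, so indexing fails
try:
    _ = s[0]
    set_indexable = True
except TypeError:
    set_indexable = False

# Sets are mutable, tuples are not
s.add(7)
try:
    t[0] = 9
    tuple_mutable = True
except TypeError:
    tuple_mutable = False

print(set_indexable, tuple_mutable)  # False False
print(sorted(s))                     # [1, 3, 7]
```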

📁 File Handling: Reading and Writing Data

Files are how programs persist data beyond program execution. Python makes file operations simple with the open() function and context managers (with statements). You'll work with text files, CSV files, and learn best practices for error handling.

Reading Text Files

Python ✓ BEST PRACTICE

# ★ ALWAYS USE 'with' STATEMENT (Context Manager)
# It automatically closes the file, even if errors occur

# Read entire file as string
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)

# Read file line by line (memory efficient for large files)
with open('data.txt', 'r') as file:
    for line in file:
        print(line.strip())  # .strip() removes surrounding whitespace, including the newline

# Read all lines into a list
with open('data.txt', 'r') as file:
    lines = file.readlines()  # ['line1\n', 'line2\n', ...]
    clean_lines = [line.strip() for line in lines]

# File is automatically closed here, even if an exception occurred

💼 Career Connection: ETL Foundations

Reading and writing files underpins ETL pipelines across e‑commerce, finance, and healthcare. CSV is a common interchange format for analytics teams.

💡 File Modes Explained

'r' — Read (default). File must exist or raises FileNotFoundError.

'w' — Write. Creates new file or overwrites existing file completely.

'a' — Append. Creates new file or adds to end of existing file.

'r+' — Read and Write. File must exist.

'x' — Exclusive creation. Fails if file already exists (safety measure).
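To see how the modes differ, here is a sketch that works inside a temporary directory so it doesn't touch your real files. The file names are placeholders:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'notes.txt')   # Hypothetical file name

    with open(path, 'w') as f:    # 'w' creates the file
        f.write("first\n")
    with open(path, 'w') as f:    # 'w' again: the old content is gone
        f.write("second\n")
    with open(path, 'a') as f:    # 'a' keeps content and adds to the end
        f.write("third\n")

    with open(path, 'r') as f:
        content = f.read()

    # 'x' refuses to overwrite an existing file
    try:
        open(path, 'x')
        x_failed = False
    except FileExistsError:
        x_failed = True

    # 'r' requires the file to exist
    try:
        open(os.path.join(tmp, 'missing.txt'), 'r')
        r_failed = False
    except FileNotFoundError:
        r_failed = True

print(repr(content))        # 'second\nthird\n'
print(x_failed, r_failed)   # True True
```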

Writing Text Files

Python ✓ BEST PRACTICE

# ═══ WRITING TO FILES ═══

# Write mode (overwrites existing file)
with open('output.txt', 'w') as file:
    file.write("Hello, World!\n")
    file.write("This is line 2\n")

# Write multiple lines at once
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w') as file:
    file.writelines(lines)

# Append mode (adds to existing file)
with open('log.txt', 'a') as file:
    file.write("New log entry\n")

# ★ REAL-WORLD PATTERN: Logging user actions
from datetime import datetime

def log_action(action):
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open('user_actions.log', 'a') as file:
        file.write(f"[{timestamp}] {action}\n")

log_action("User logged in")
log_action("User viewed dashboard")

Working with CSV Files

Python ✓ BEST PRACTICE

import csv

# ═══ READING CSV FILES ═══

# Reading as list of rows
with open('data.csv', 'r', newline='') as file:
    csv_reader = csv.reader(file)
    header = next(csv_reader)  # First row is usually headers
    print(f"Columns: {header}")
    
    for row in csv_reader:
        print(row)  # Each row is a list: ['value1', 'value2', ...]

# Reading as dictionaries (RECOMMENDED)
with open('employees.csv', 'r', newline='') as file:
    csv_reader = csv.DictReader(file)
    
    for row in csv_reader:
        print(row['name'], row['dept'])  # Access by column name (matches the 'dept' column written below)

# ═══ WRITING CSV FILES ═══

# Writing rows
data = [
    ['Name', 'Age', 'City'],
    ['Alice', 30, 'NYC'],
    ['Bob', 25, 'SF']
]

with open('output.csv', 'w', newline='') as file:
    csv_writer = csv.writer(file)
    csv_writer.writerows(data)  # Write all rows at once

# Writing dictionaries (RECOMMENDED)
employees = [
    {'name': 'Alice', 'dept': 'Engineering', 'salary': 90000},
    {'name': 'Bob', 'dept': 'Marketing', 'salary': 75000}
]

with open('employees.csv', 'w', newline='') as file:
    fieldnames = ['name', 'dept', 'salary']
    csv_writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    csv_writer.writeheader()  # Write column headers
    csv_writer.writerows(employees)  # Write data rows

Error Handling with Files

Python ✓ BEST PRACTICE

# ★ ROBUST FILE HANDLING

from pathlib import Path

# Check if file exists before reading
file_path = Path('data.txt')

if file_path.exists():
    with open(file_path, 'r') as file:
        content = file.read()
else:
    print("File not found!")

# Handle exceptions
try:
    with open('data.txt', 'r') as file:
        content = file.read()
        data = int(content)  # May raise ValueError
except FileNotFoundError:
    print("Error: File doesn't exist")
except ValueError:
    print("Error: File content is not a valid number")
except Exception as e:
    print(f"Unexpected error: {e}")

# ★ REAL-WORLD PATTERN: Safe config file loader
import json

def load_config(filename, default_config):
    """Load JSON config file with fallback to defaults"""
    try:
        with open(filename, 'r') as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"Config file '{filename}' not found. Using defaults.")
        return default_config
    except json.JSONDecodeError:
        print(f"Invalid JSON in '{filename}'. Using defaults.")
        return default_config

config = load_config('config.json', {'debug': False, 'port': 8000})

🎯 File Handling Best Practices

✓ Always use with statements: They ensure files are closed properly, even if errors occur.

✓ Use csv.DictReader/DictWriter: Working with dictionaries is more readable than index-based row access.

✓ Handle exceptions: Files might not exist, might have wrong permissions, or contain invalid data.

✓ Use Path objects: The pathlib module provides cross-platform path handling.

✓ Mind the newline parameter: Pass newline='' whenever you open a CSV file, as the csv module documentation recommends. In write mode, omitting it can produce blank lines between rows on Windows.
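The pathlib tip deserves a closer look. The sketch below builds a path, creates a folder, and reads and writes a file, all inside a temporary directory. The folder and file names are invented for the example:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # The / operator joins path parts on any operating system
    report = Path(tmp) / 'reports' / 'summary.txt'

    report.parent.mkdir(parents=True, exist_ok=True)  # Create the folder
    report.write_text("total,42\n", encoding='utf-8')

    exists = report.exists()
    name, suffix = report.name, report.suffix

    # Path objects also work with open() and the with statement
    with open(report, 'a', encoding='utf-8') as f:
        f.write("count,7\n")

    lines = report.read_text(encoding='utf-8').splitlines()

print(exists, name, suffix)   # True summary.txt .txt
print(lines)                  # ['total,42', 'count,7']
```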

🏋️ Exercise: Journal Logger

Create a simple journal application:

Create a function add_entry(message) that appends a timestamped entry to journal.txt
Create a function read_journal() that displays all entries
Format: [2024-03-15 14:30:00] Your message here
Handle the case where the journal file doesn't exist yet

🌟 Challenge: CSV Data Processor

Build a sales report processor:

Create a CSV file with columns: date, product, quantity, price
Read the CSV and calculate total revenue (quantity × price)
Find the best-selling product by quantity
Write a summary report to a new CSV file

# Sample data structure
sales = [
    {'date': '2024-01-15', 'product': 'Widget', 'quantity': 10, 'price': 9.99},
    # Add more rows...
]

⚠️ Common File Handling Mistakes

Forgetting to close files: Without with, you must manually call file.close(). Easy to forget!

Using 'w' instead of 'a': Write mode overwrites the entire file. Use append mode if you want to add to existing content.

Not handling FileNotFoundError: Always check if files exist or wrap in try/except.

Reading huge files with .read(): This loads the entire file into memory. For large files, iterate line by line.
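As a sketch of the last point, this example writes a sample log and then scans it one line at a time, so only a single line is held in memory at once. The file name and log format are made up:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'big.log')   # Hypothetical log file

    # Build a sample file: every 100th line is an error
    with open(path, 'w') as f:
        for i in range(1000):
            level = 'ERROR' if i % 100 == 0 else 'INFO'
            f.write(f"{level} event {i}\n")

    # Iterate line by line instead of calling .read()
    error_count = 0
    with open(path, 'r') as f:
        for line in f:
            if line.startswith('ERROR'):
                error_count += 1

print(error_count)  # 10
```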

📦 Modules: Reusing and Organizing Code

Modules are Python files containing functions, classes, and variables that you can import and reuse. Python's standard library has hundreds of built-in modules. You can also install third-party packages from PyPI (Python Package Index) using pip.

Importing Built-in Modules

Python ✓ BEST PRACTICE

# ═══ IMPORT PATTERNS ═══

# Import entire module
import math
print(math.sqrt(16))  # 4.0
print(math.pi)       # 3.141592653589793

# Import with alias (shorter name)
import datetime as dt
now = dt.datetime.now()
print(now)

# Import specific functions/classes
from random import randint, choice
print(randint(1, 10))         # Random integer from 1 to 10 (both inclusive)
print(choice(['a', 'b', 'c']))  # Random choice from list

# Import everything (NOT RECOMMENDED - pollutes namespace)
# from math import *  # Avoid this!

💡 Import Styles: When to Use Which

import module — Use for modules you'll call many times. Clear namespace: math.sqrt()

import module as alias — Use for long module names. Common aliases: import pandas as pd, import numpy as np

from module import name — Use when you only need specific functions. More concise: sqrt(16) instead of math.sqrt(16)

Avoid from module import *: Makes it unclear where functions come from. Can cause naming conflicts.
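Naming conflicts are easiest to understand with a concrete case. Both `math` and `cmath` in the standard library define `sqrt`, so importing the name twice silently replaces the first version. The imports sit mid-file here only to show the order of events:

```python
import math
import cmath

from math import sqrt
real_root = sqrt(16)
print(real_root)         # 4.0

from cmath import sqrt   # Silently replaces the earlier name
complex_root = sqrt(16)
print(complex_root)      # (4+0j)

# Qualified names stay unambiguous
print(math.sqrt(16), cmath.sqrt(16))  # 4.0 (4+0j)
```

With `from module import *` the same overwrite can happen for every name in the module at once, which is why the qualified `math.sqrt()` style is the safer default.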

Useful Standard Library Modules

Python ✓ BEST PRACTICE

# ★ DATETIME - Working with dates and times
from datetime import datetime, timedelta

now = datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S"))  # Format: 2024-03-15 14:30:00
tomorrow = now + timedelta(days=1)

# ★ OS - Operating system operations
import os

print(os.getcwd())           # Current working directory
print(os.listdir('.'))      # List files in directory
api_key = os.getenv('API_KEY')  # Get environment variable

# ★ JSON - Working with JSON data
import json

data = {'name': 'Alice', 'age': 30}
json_string = json.dumps(data)  # Python dict -> JSON string
back_to_dict = json.loads(json_string)  # JSON string -> Python dict

# Save/load JSON files
with open('data.json', 'w') as file:
    json.dump(data, file, indent=2)  # Pretty-print with indentation

with open('data.json', 'r') as file:
    loaded_data = json.load(file)

# ★ COLLECTIONS - Specialized data structures
from collections import Counter, defaultdict

# Counter - Count occurrences
words = ['apple', 'banana', 'apple', 'cherry', 'apple']
word_counts = Counter(words)
print(word_counts)  # Counter({'apple': 3, 'banana': 1, 'cherry': 1})

# defaultdict - Dict with default values
scores = defaultdict(list)  # Missing keys default to empty list
scores['Alice'].append(95)  # No KeyError!
scores['Alice'].append(87)

Installing and Using Third-Party Packages

Bash/Terminal ✓ BEST PRACTICE

# ═══ PIP: Python Package Installer ═══

# Install a package
pip install requests

# Install specific version
pip install pandas==2.0.0

# Upgrade a package
pip install --upgrade numpy

# Uninstall a package
pip uninstall matplotlib

# List installed packages
pip list

# Show package info
pip show pandas

# ═══ REQUIREMENTS.TXT - Project Dependencies ═══

# Save current environment to file
pip freeze > requirements.txt

# Install all packages from file
pip install -r requirements.txt

Python - Using Third-Party Packages ✓ BEST PRACTICE

# ★ REQUESTS - HTTP library (install with: pip install requests)
import requests

response = requests.get('https://api.github.com/users/python', timeout=10)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
data = response.json()  # Parse JSON response
print(data['name'])  # Python

# ★ PANDAS - Data analysis (install with: pip install pandas)
import pandas as pd

df = pd.read_csv('data.csv')
print(df.head())  # First 5 rows

Virtual Environments: Isolated Project Dependencies

Bash/Terminal ✓ BEST PRACTICE

# ═══ WHY VIRTUAL ENVIRONMENTS? ═══
# - Isolate project dependencies
# - Avoid version conflicts between projects
# - Easy to recreate environments on other machines

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Mac/Linux)
source venv/bin/activate

# Now install packages - they go into venv, not system Python
pip install requests pandas

# Deactivate when done
deactivate

# ═══ BEST PRACTICE WORKFLOW ═══
# 1. Create venv for new project
# 2. Activate venv
# 3. Install packages
# 4. Save dependencies: pip freeze > requirements.txt
# 5. Commit requirements.txt to Git (NOT the venv folder!)

🎯 Module and Package Best Practices

✓ Use virtual environments: Every project should have its own venv to avoid dependency conflicts.

✓ Keep requirements.txt updated: Run pip freeze > requirements.txt when you add packages.

✓ Import at the top: All imports should be at the beginning of your file (PEP 8 style guide).

✓ Group imports: Standard library → Third-party → Your modules (separated by blank lines).

✓ Use .gitignore: Never commit venv folder or __pycache__ to Git.

📚 Creating Your Own Modules

Any Python file can be a module! Create my_utils.py with functions, then import with import my_utils.

Example file structure:

project/
├── main.py
├── my_utils.py
└── config.py

In my_utils.py:

def greet(name):
    return f"Hello, {name}!"

In main.py:

import my_utils

print(my_utils.greet('Alice'))  # Hello, Alice!

📋 Week 2 Cheat Sheet

Quick reference guide for all data structures and file operations covered this week.

Python Quick Reference

# ═══ LISTS ═══
fruits = ['apple', 'banana', 'cherry']
fruits.append('date')           # Add to end
fruits.insert(1, 'kiwi')        # Insert at index
fruits.remove('banana')         # Remove by value
popped = fruits.pop()            # Remove and return last
fruits.sort()                    # Sort in place
sorted_copy = sorted(fruits)    # Return new sorted list
print(fruits[0], fruits[-1])   # First and last item
print(fruits[1:3])            # Slice: items 1 to 2

# ═══ DICTIONARIES ═══
person = {'name': 'Alice', 'age': 30}
print(person['name'])           # Access (KeyError if missing)
print(person.get('job', 'N/A'))  # Safe access with default
person['city'] = 'NYC'          # Add/update
del person['age']             # Delete key

for key, val in person.items():
    print(key, val)              # Iterate key-value pairs

# ═══ TUPLES ═══
coords = (10, 20)                # Immutable
x, y = coords                     # Tuple unpacking
a, b = 5, 10
a, b = b, a                       # Swap values: a=10, b=5

# ═══ SETS ═══
tags = {'python', 'data', 'ml'}
tags.add('ai')                   # Add item
tags.discard('data')            # Remove (no error if missing)
unique = set([1, 2, 2, 3])       # Remove duplicates: {1, 2, 3}

set1, set2 = {1, 2, 3}, {3, 4}
set1 | set2                       # Union: {1, 2, 3, 4}
set1 & set2                       # Intersection: {3}
set1 - set2                       # Difference: {1, 2}

# ═══ FILE HANDLING ═══
# Read file
with open('data.txt', 'r') as file:
    content = file.read()         # Read all
    # lines = file.readlines()  # Read as list

# Write file
with open('output.txt', 'w') as file:
    file.write("Hello\n")        # Write string

# Append to file
with open('log.txt', 'a') as file:
    file.write("New entry\n")

# ═══ CSV FILES ═══
import csv

# Read CSV
with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['name'], row['age'])

# Write CSV
with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['name', 'age'])
    writer.writeheader()
    writer.writerow({'name': 'Alice', 'age': 30})

# ═══ MODULES ═══
import math                      # Import module
import datetime as dt            # Import with alias
from random import randint      # Import specific function

# Useful standard library modules:
# - datetime: Dates and times
# - os: Operating system operations
# - json: JSON encoding/decoding
# - csv: CSV file reading/writing
# - collections: Counter, defaultdict, etc.
# - pathlib: Cross-platform path operations

# ═══ PIP & VIRTUAL ENVIRONMENTS ═══
# Create virtual environment:
#   python -m venv venv
# Activate (Windows):
#   venv\Scripts\activate
# Activate (Mac/Linux):
#   source venv/bin/activate
# Install package:
#   pip install requests
# Save dependencies:
#   pip freeze > requirements.txt
# Install from requirements:
#   pip install -r requirements.txt

🎯 Key Takeaways: Week 2

Choose the Right Data Structure: Lists for ordered sequences, dicts for key-value lookups, tuples for immutable data, sets for uniqueness.
Dictionaries Are Powerful: Use .get() for safe access, .items() for iteration, comprehensions for transformations.
Always Use 'with' for Files: Context managers ensure files are closed properly, even if errors occur.
CSV Dictionaries > Lists: DictReader and DictWriter make CSV code more readable and maintainable.
Virtual Environments Are Essential: Isolate project dependencies to avoid version conflicts.
Import Wisely: Use import module for clarity, from module import name for convenience, avoid import *.
Error Handling Matters: Always handle FileNotFoundError and other exceptions when working with files.
Practice Makes Perfect: The best way to learn data structures is to use them in real projects.

🚀 Next Steps: Week 3 Preview

Next week, you'll level up your Python skills even further with:

• Object-Oriented Programming (OOP) — Create your own custom data types with classes

• Decorators — Add functionality to functions without modifying their code

• List Comprehensions — Transform data in elegant one-liners

• Closures — Create functions that "remember" values

Keep practicing this week's concepts—they're the foundation for everything that comes next!