Place all of your submission documents in the ./lab1_submission folder. You will submit a zip file of that directory to Canvas.
This lab explores the practical performance differences between adjacency matrix and adjacency list graph representations. Rather than just learning the theoretical complexities, you will:
- Generate road networks at various scales (100 to 100,000+ nodes)
- Implement algorithms using both representations
- Measure actual memory usage and runtime performance
- Experience where each representation succeeds and fails
- Understand why representation choice matters in real systems
If you are curious about the use of OO concepts (classes, polymorphism, and method overloading) in the Python example, and their corollaries in C, you can read up on that in This File.
By completing this lab, you will:
- Understand O(V²) vs O(V+E) memory tradeoffs through direct experience
- See why sparse graphs (like road networks) favor adjacency lists
- Observe O(1) vs O(degree) edge lookup tradeoffs
- Understand why O(V) vs O(degree) neighbor iteration affects algorithm choice
- Develop intuition for choosing representations in real applications
```shell
# 1. Generate test networks
cd scripts
python3 generate_network.py --size tiny    # 100 nodes - baseline
python3 generate_network.py --size small   # 1,000 nodes
python3 generate_network.py --size medium  # 10,000 nodes - matrix starts struggling
python3 generate_network.py --size large   # 50,000 nodes - matrix needs ~20 GB RAM!

# 2. Run experiments
cd ../python
python3 run_experiments.py

# 3. Or run individual benchmarks
python3 graph_representations.py data/medium_nodes.csv data/medium_edges.csv --benchmark
```

| Size | Nodes | Matrix Memory | Expected Behavior |
|---|---|---|---|
| tiny | 100 | ~80 KB | No visible difference between representations |
| small | 1,000 | ~8 MB | Minimal difference, good for debugging |
| medium | 10,000 | ~800 MB | Matrix noticeably slower to load |
| large | 50,000 | ~20 GB | Matrix takes minutes to allocate, algorithms slow |
| huge | 100,000 | ~80 GB | Matrix will crash your 32GB machine! |
The "huge" size is intentionally designed to demonstrate the failure mode of O(V²) memory scaling.
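The quadratic growth is easy to verify with a quick back-of-the-envelope calculation (this sketch assumes 8-byte float weights per matrix cell, the same assumption behind the table above):

```python
# Estimate adjacency-matrix memory for V nodes, assuming 8 bytes per cell.
def matrix_memory_gb(v, bytes_per_cell=8):
    return v * v * bytes_per_cell / 1e9

for v in [100, 1_000, 10_000, 50_000, 100_000]:
    print(f"{v:>7} nodes -> {matrix_memory_gb(v):10.4f} GB")
```

Doubling the node count quadruples the memory, which is why 100,000 nodes (~80 GB) is hopeless on a 32GB machine.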
```
cs3050-Lab-6/
├── data/                        # Generated network files
│   ├── tiny_nodes.csv
│   ├── tiny_edges.csv
│   ├── small_nodes.csv
│   └── ...
├── lab1_submission/             # Your submission documents go here
│   ├── .keep                    # Ensures an empty directory is cloned locally
│   ├── <<Your Submission in Markdown Format (you can copy that section of the README.md to start)>>
│   └── ...
├── scripts/
│   └── generate_network.py      # Network data generator
├── python/
│   ├── graph_representations.py # Core implementations
│   └── run_experiments.py       # Guided experiment runner
├── c/
│   └── (C implementations)
└── README.md
```
Adjacency List (`AdjacencyListGraph`):

```python
# Memory: O(V + E)
# Stores: dict of node_id -> list of (neighbor_id, weight)
self.adj_list = {
    0: [(1, 2.5), (3, 1.8)],  # Node 0 connects to 1 and 3
    1: [(0, 2.5), (2, 3.0)],  # Node 1 connects to 0 and 2
    ...
}
```

Adjacency Matrix (`AdjacencyMatrixGraph`):

```python
# Memory: O(V²) regardless of edge count!
# Stores: 2D array where matrix[i][j] = weight if edge exists, 0 otherwise
self.matrix = [
    [0.0, 2.5, 0.0, 1.8],  # Node 0's edges
    [2.5, 0.0, 3.0, 0.0],  # Node 1's edges
    ...
]
```

| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Memory | O(V + E) | O(V²) |
| Add edge | O(1) | O(1) |
| Check if edge exists | O(degree) | O(1) ← Matrix wins |
| Get all neighbors | O(degree) ← List wins | O(V) |
| Iterate all edges | O(E) | O(V²) |
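The two lookup costs in the table come straight from the data layouts. A minimal sketch using the toy graph from the snippets above (the `*_has_edge` function names are illustrative, not necessarily the lab's API):

```python
# Toy data matching the earlier snippets.
adj_list = {0: [(1, 2.5), (3, 1.8)], 1: [(0, 2.5), (2, 3.0)]}
matrix = [
    [0.0, 2.5, 0.0, 1.8],
    [2.5, 0.0, 3.0, 0.0],
]

def list_has_edge(adj, u, v):
    # O(degree): must scan u's neighbor list until v is found (or not).
    return any(nbr == v for nbr, _ in adj.get(u, []))

def matrix_has_edge(m, u, v):
    # O(1): a single indexed cell read.
    return m[u][v] != 0.0

print(list_has_edge(adj_list, 0, 3))   # True
print(matrix_has_edge(matrix, 0, 2))   # False
```

The matrix wins on single-edge queries; the list wins whenever you need all of a node's neighbors, because the matrix must scan an entire O(V) row even when only a handful of cells are nonzero.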
Dijkstra's algorithm repeatedly calls get_neighbors():
- With adjacency list: O(degree) per call → O((V+E) log V) total
- With adjacency matrix: O(V) per call → O(V² log V) total
For a road network with V=50,000 nodes and average degree 6:
- List-based Dijkstra: ~300,000 neighbor lookups
- Matrix-based Dijkstra: ~2.5 billion cell scans!
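A compact adjacency-list Dijkstra (a sketch, not the lab's reference implementation) shows exactly where neighbor iteration sits in the hot loop:

```python
import heapq

def dijkstra(adj_list, source):
    """Shortest distances from source; adj_list maps node -> [(neighbor, weight)]."""
    dist = {source: 0.0}
    pq = [(0.0, source)]                      # (distance, node) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                          # stale heap entry; skip it
        # Hot loop: O(degree) per pop with a list, but O(V) with a matrix row scan.
        for v, w in adj_list.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

adj = {0: [(1, 2.5), (3, 1.8)], 1: [(0, 2.5), (2, 3.0)], 2: [(1, 3.0)], 3: [(0, 1.8)]}
print(dijkstra(adj, 0))
```

Swapping in a matrix doesn't change the algorithm at all; it only replaces that inner loop with a full-row scan, which is where the O(V² log V) blowup comes from.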
Run experiments on progressively larger networks and record:
| Network | Nodes | List Memory | Matrix Memory | Ratio |
|---|---|---|---|---|
| tiny | | | | |
| small | | | | |
| medium | | | | |
| large | | | | |
Questions:
- What is the relationship between node count and matrix memory?
- At what size does the matrix become impractical?
- Predict the memory for 200,000 nodes. Would it fit in 32GB?
The matrix has O(1) edge lookup. Measure the speedup:
Test code is in `benchmark_edge_queries()` in the C program; it runs 10,000 random edge existence checks.

Questions:
- How much faster is matrix edge lookup?
- Why is list lookup slower? Trace through the code.
- In what applications would fast edge lookup matter?
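If you want to reproduce the measurement in Python as well, here is a hedged sketch (the graph size, degree, and query count are arbitrary choices for illustration, not the lab's official parameters):

```python
import random
import timeit

# Build a random sparse directed graph: ~6 edges per node.
V = 2_000
random.seed(42)
edges = {(random.randrange(V), random.randrange(V)) for _ in range(6 * V)}

adj = {u: [] for u in range(V)}
matrix = [[0.0] * V for _ in range(V)]
for u, v in edges:
    adj[u].append((v, 1.0))
    matrix[u][v] = 1.0

# 10,000 random edge-existence queries, same for both representations.
queries = [(random.randrange(V), random.randrange(V)) for _ in range(10_000)]

def list_lookup():
    # O(degree) scan per query.
    return sum(any(n == v for n, _ in adj[u]) for u, v in queries)

def matrix_lookup():
    # O(1) cell read per query.
    return sum(matrix[u][v] != 0.0 for u, v in queries)

print("list  :", timeit.timeit(list_lookup, number=10))
print("matrix:", timeit.timeit(matrix_lookup, number=10))
```

Both functions return the same hit count, so any timing difference is purely the cost of the lookup pattern.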
Run Dijkstra's algorithm on both representations:
| Network | List Dijkstra | Matrix Dijkstra | Slowdown |
|---|---|---|---|
| tiny | | | |
| small | | | |
| medium | | | |
Dijkstra's algorithm is in the c folder, and instructions for running it are in the README.md file in that directory.
Questions:
- Why is matrix-based Dijkstra slower despite O(1) edge lookup?
- Where in the algorithm does the slowdown occur?
- For what graph density would matrix be faster?
WARNING: Make sure all other work on your computer is saved before initiating this!
Try loading the "huge" network with matrix representation:
```shell
python3 scripts/generate_network.py --size huge
python3 -c "import sys; sys.path.insert(0, 'python'); from graph_representations import load_graph; load_graph('data/huge_nodes.csv', 'data/huge_edges.csv', use_matrix=True)"
```

Watch your system monitor. Document:
- Memory usage before loading
- Memory usage during matrix allocation
- What happens when memory is exhausted
Given these scenarios, choose the appropriate representation and justify:
- Google Maps routing: ~50 million road intersections, avg degree 4
- Social network analysis: 1 billion users, avg 200 friends
- Circuit analysis: 10,000 components, each connected to 3 others
- Dense communication matrix: 500 servers, all-to-all connections
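To sanity-check your choices, a rough memory estimator helps (the 8-byte cell and 16-byte list-entry constants are assumptions for illustration; real overheads, especially in Python, are higher):

```python
def memory_estimate_gb(v, avg_degree, bytes_per_cell=8, bytes_per_list_entry=16):
    """Rough memory for both representations: matrix is V^2 cells,
    list is V * avg_degree directed entries."""
    matrix_gb = v * v * bytes_per_cell / 1e9
    list_gb = v * avg_degree * bytes_per_list_entry / 1e9
    return matrix_gb, list_gb

scenarios = [
    ("roads",   50_000_000,     4),
    ("social",  1_000_000_000,  200),
    ("circuit", 10_000,         3),
    ("servers", 500,            499),   # all-to-all: degree ~ V
]
for name, v, deg in scenarios:
    m, l = memory_estimate_gb(v, deg)
    print(f"{name:8}  matrix {m:>14.3f} GB   list {l:>10.6f} GB")
```

Notice that for the all-to-all server scenario the matrix actually comes out smaller, because once average degree approaches V there is no sparsity left for the list to exploit.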
As an additional challenge, implement a hybrid representation that:
- Uses adjacency list for sparse regions
- Uses matrix blocks for dense subgraphs
- Automatically chooses based on local density
```python
class HybridGraph(Graph):
    def __init__(self, density_threshold=0.1):
        # Your implementation here
        pass
```

Place all of your submission documents in the ./lab1_submission folder. You will submit a zip file of that directory to Canvas.
- Experimental data from all exercises (CSV or screenshots)
- Written answers to all questions (~1-2 paragraphs each)
- Analysis of when each representation is appropriate
- Code for any modifications you made
- Reflection on what surprised you in the experiments
- Dijkstra's Algorithm
- A* Search
- Graph Data Structures
- SNAP Datasets - Real-world network data
"Matrix would require X GB": Working as intended! This demonstrates why O(V²) doesn't scale.
Out of memory: Kill the process with Ctrl+C and use `--size medium` or smaller.
Slow matrix allocation: For 50,000 nodes, allocation can take 30-60 seconds. Be patient.
Different results each run: Shortest-path benchmarks use random endpoints. Set `random.seed(42)` for reproducibility.