Project 2

Project 2 integrates file I/O, string parsing, validation, and traversal into one cohesive system. You will build a simplified HTML parser that extracts tags and links, verify structural correctness using a proper stack discipline (not simple counting), and implement a DFS or BFS crawler that counts unique reachable pages without double counting any page or counting missing files.
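As a rough illustration of the extraction step, the sketch below pulls tag names and `href` targets out of text that is already in memory. The names `Extracted` and `extractTagsAndLinks` are hypothetical, not part of the project's required interface. It assumes attribute values use double quotes and never contain `<` or `>`, and it uses `std::string::find` for brevity. Your own parser may need to read the file one character at a time instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: names are illustrative only.
struct Extracted {
    std::vector<std::string> tags;   // tag names in document order, e.g. "a", "/a"
    std::vector<std::string> links;  // href values taken from <a ...> tags
};

// Assumes double-quoted attributes and no '<' or '>' inside attribute values.
Extracted extractTagsAndLinks(const std::string& text) {
    Extracted out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] != '<') { ++i; continue; }       // skip ordinary text
        std::size_t close = text.find('>', i);
        if (close == std::string::npos) break;        // unterminated tag: stop scanning
        std::string inside = text.substr(i + 1, close - i - 1);

        // The tag name is everything up to the first whitespace character.
        std::size_t nameEnd = 0;
        while (nameEnd < inside.size() &&
               !std::isspace(static_cast<unsigned char>(inside[nameEnd]))) {
            ++nameEnd;
        }
        std::string name = inside.substr(0, nameEnd);
        if (!name.empty()) out.tags.push_back(name);

        // For anchor tags, pull out the quoted href value if one is present.
        if (name == "a") {
            const std::string key = "href=\"";
            std::size_t h = inside.find(key);
            if (h != std::string::npos) {
                std::size_t start = h + key.size();
                std::size_t end = inside.find('"', start);
                if (end != std::string::npos) {
                    out.links.push_back(inside.substr(start, end - start));
                }
            }
        }
        i = close + 1;
    }
    return out;
}
```

Keeping closing tags as `/name` in the same list preserves document order, which is what a later balance check needs.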

The most important advice:

Design your data structures before coding.
Store parsed results by filename so you never reparse unnecessarily.
Separate parsing from balance checking.
Track visited pages during crawling to prevent infinite recursion.
Thoroughly test edge cases (especially malformed nesting and broken links) with your own HTML files rather than relying only on the provided examples.
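To see why a stack is needed rather than a counter, consider the tag sequence `<b><i></b></i>`: the number of opening and closing tags matches, yet the nesting is wrong. The sketch below checks balance on a list of tags that has already been parsed, which keeps the check separate from parsing. The function name `isBalancedTags` and the convention that closing tags are stored with a leading `/` are assumptions for illustration. Void tags such as `<br>` are not handled here and may need special treatment depending on the project specification.

```cpp
#include <stack>
#include <string>
#include <vector>

// Assumes each entry is a tag name, with closing tags written as "/name".
// Void or self-closing tags are NOT special-cased in this sketch.
bool isBalancedTags(const std::vector<std::string>& tags) {
    std::stack<std::string> open;  // tags opened but not yet closed
    for (const std::string& tag : tags) {
        if (tag.empty()) continue;
        if (tag[0] != '/') {
            open.push(tag);        // opening tag: remember it
        } else {
            // A closing tag must match the most recently opened tag.
            if (open.empty() || open.top() != tag.substr(1)) return false;
            open.pop();
        }
    }
    return open.empty();           // leftover open tags mean imbalance
}
```

With this sketch, the sequence `b, i, /b, /i` is rejected even though a simple count of opens and closes would accept it.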

Overview

Data Structure Design

Your parser must store data so that:

isBalanced() does not reparse the file
visitPageAmount() can access links efficiently
Files are not reparsed unnecessarily
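One possible way to meet these requirements is a map from filename to parsed results, filled the first time a file is requested. The sketch below is a minimal illustration under that assumption. `PageData`, `PageCache`, and the `parse` callback are hypothetical names, and the parse counter exists only to make the "parse once" behavior observable in tests.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-file record; field names are illustrative.
struct PageData {
    std::vector<std::string> tags;   // used by the balance check
    std::vector<std::string> links;  // used by the crawler
    bool exists = false;             // false if the file could not be opened
};

class PageCache {
public:
    // Returns the stored data for filename, calling parse only on first access.
    template <typename ParseFn>
    const PageData& get(const std::string& filename, ParseFn parse) {
        auto it = pages_.find(filename);
        if (it == pages_.end()) {
            ++parseCount_;
            it = pages_.emplace(filename, parse(filename)).first;
        }
        return it->second;
    }

    int parseCount() const { return parseCount_; }

private:
    std::unordered_map<std::string, PageData> pages_;
    int parseCount_ = 0;  // diagnostic only: how many times parse actually ran
};
```

Because both the balance check and the crawler read from the same stored record, neither needs to reopen the file.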

Final Implementation Checklist

Read file character-by-character

Extract tags correctly

Handle <a href="...">...</a> carefully

Store parsed data by filename

Implement stack-based balance check

Implement DFS or BFS for crawling

Avoid double parsing

Avoid double counting

Handle missing files correctly

Create additional test HTML files
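The crawling items in the checklist can be sketched as a breadth-first search over links that have already been parsed. In the example below, `LinkMap` and `countReachable` are hypothetical names, a file counts as missing when it has no entry in the map, and the start page is included in the count. Check the project specification for the exact counting rule, since it may differ from this assumption.

```cpp
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Stand-in for parsed data: filename -> links found in that file.
// A filename with no entry is treated as a missing file (an assumption).
using LinkMap = std::unordered_map<std::string, std::vector<std::string>>;

// Counts pages reachable from start, including start itself if it exists.
int countReachable(const LinkMap& linksByFile, const std::string& start) {
    if (linksByFile.find(start) == linksByFile.end()) return 0;

    std::unordered_set<std::string> visited{start};  // prevents double counting
    std::queue<std::string> frontier;
    frontier.push(start);

    while (!frontier.empty()) {
        std::string page = frontier.front();
        frontier.pop();
        for (const std::string& target : linksByFile.at(page)) {
            // Skip links whose target file does not exist.
            if (linksByFile.find(target) == linksByFile.end()) continue;
            // insert() reports whether the page is new; enqueue it only once.
            if (visited.insert(target).second) frontier.push(target);
        }
    }
    return static_cast<int>(visited.size());
}
```

Marking a page as visited when it is enqueued, rather than when it is dequeued, keeps cycles such as two pages linking to each other from adding the same page twice.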
