Google Pagerank

The Google search engine uses an algorithm called `Pagerank' to decide on the importance of web pages, that is, how high to rank them in search results. This algorithm, in its original form, can be interpreted in two different ways. Firstly, it emulates the behavior of a computer user clicking on links. Secondly, it solves a mathematical linear algebra problem.

In this project, the student first builds up a simulated internet, and models the behavior of a user randomly clicking on links. Secondly, an implementation is made that is closer to linear algebra concept. Both models are used to explore the Pagerank algorithm.

This project can be done by one or two undergraduate students, or AP high schoolers at the end of a first or semester programming course. The code is written completely from scratch, and requires no input data.

There is opportunity for learning some non-standard mathematics.

• Summary. Explore the Pagerank algorithm for ranking web search results. • Topics. Pointers, arrays, matrices, classes. There is a possibility of exploring sparse matrices • Audience. Undergraduate or AP high school • Difficulty. Moderate to high • Strengths. Explores a phenomenon that is in everyone's experience • Weaknesses. The two parts (pointer-based, matrix-based) are independent of each other. The first part does not model a computation that is used in practice. • Dependencies. No dependencies, though students have the opportunity for exploring some mathematical topics. • Variants. The matrix approach can be extended to sparse matrices.Assignment