logo sapienza

Algorithms and Data Structures
Master degree in Statistical Methods and Applications
academic year 2022-2023

Teacher: Paolo Franciosa
6 CFU


Contents Program Teaching material
Lessons timetable Lessons topics and resources

Lessons will be in presence and will start on Monday, February 20th 2023,
with the following calendar:
Monday 10:00-12:00 in room VI (building CU002, fourth floor),
Wednesday 10:00-12:00 in room III (building CU002, third floor).


Contents

The course covers algorithmic programming techniques. The primary goal is to study algorithms and data structures that efficiently solve problems that would require very high computational resources (time or space) using trivial approaches.

We will address problems frequently arising in practical applications and real-world situations, such as sorting, searching, and graph traversal, as well as general design techniques including divide and conquer, greedy algorithms, and dynamic programming.

The course will cover the entire algorithm engineering cycle, addressing the design, theoretical analysis, implementation, and experimental evaluation of algorithms and data structures. The students will practice solving computational problems, designing new algorithms, and implementing efficient solutions in Python.

top


Program

Definition of algorithm, examples of algorithms.

Introduction to complexity. Computation models: Random Access Machine, Pointer Machine, Turing Machine.

Analisys of algorithm's complexity, Landau notations. Intrinsic complexity, lower bound for sorting by comparisons.

Divide and conquer, master theorem. Merge sort, Quick sort, k-selection. K-selection in linear time.

The concept of data structure and its definition as Abstract Data Type. Elementary data structures (lists, stacks, queues, trees).

The dictionary problem (10 hours) Binary search trees, red-black trees, B-Trees, hash tables.

Priority queues, union-find structures. The heap ADT, heap sort algorithm.

Graph algorithms. Depth first and breadth first visits, minimum spanning tree, single source shortest path tree (SSSP), minimum distances between all vertex pairs (APSP). Algorithms for maximum flow problems.

Implementation in Python language of the data structures and algorithms studied.

top


Texts

Introduction to the Design and Analysis of Algorithms
Anany Levitin, Pearson

resources are available on the web, an alternative textbook is
Algorithms Design, by Goodrich and Tamassia

A more comprehensive text is:
Introduction to Algorithms
T. Cormen, C. Leiserson, R. Rivest, and C. Stein, McGraw-Hill.

top

An introductory textbook on Python is:
Allen Downey,
Thinking in Python - How to Think Like a Computer Scientist,
second edition
pdf available at github under the GNU Free Documentation License

A more advanced textbook on Python, including implementation of algorithms on lists, trees, graphs is:
Basant Agarwal and Benjamin Baka,
Hands-On Data Structures and Algorithms with Python,
second edition
Packt Editor

Hand-written slides on some parts of the course (not covering all topics) can be found here

A portable version of Thonny (a simple and free Python IDE) for Windows, Mac or Linux, including a Python 3 interpreter, can be downloaded here.


Exams

The exam consists in a written test, a programming exercise, and possibly a colloquium.

Samples of exam tests are available in this zipped folder

top


Lessons 2022-2023

February 20, 2023 February 22, 2023 - cancelled
February 27, 2023 March 1, 2023
March 6, 2023 - cancelled March 8, 2023
March 13, 2023 March 15, 2023
March 20, 2023 March 22, 2023
March 27, 2023 March 29, 2023
April 3, 2023 April 5, 2023
April 10, 2023 - holydays April 12, 2023
April 17, 2023 April 19, 2023
April 24, 2023 - cancelled April 26, 2023
May 1, 2023 - holydays May 3, 2023
May 8, 2023 May 10, 2023
May 15, 2023 May 17, 2023
May 22, 2023 May 24, 2023 10:00 - 12:00)
May 24, 2023 12:00 - 14:00)

For any need, contact me at
paoloA_DOT_HEREfranciosaAN_"at"_HEREuniroma1A_DOT_HEREit

top


20 February 2023
Topics
  • Introduction, material -- book, software -- exams
  • Definitions: problem, instance, algorithm
  • The concepts of "resource", "executor", "operation"
  • Models of computation: RAM vs Pointer Machine
  • The Turing machine
  • Examples: find max, check duplicates, subset sum

back to lessons list

27 February 2023
Topics
  • Deriving the complexity of an algorithm: exact evaluation, best case, worst case
  • Examples: find maximum, check duplicates
  • Landau notations (O, Omega, Theta) to express complexity.
  • Complexity of an algorithm: upper bound, lower bound, exact bounds.
  • Containment among simple classes of functions.
  • How to derive complexity: code inspection.

back to lessons list

1 March 2023
Topics
  • Complexity of algorithms vs. (intrinsic) complexity of problems.
  • Lower bound for minimum/maximum selection, lower bound for comparison-based sorting algorithms
  • Sorting without comparing: examples for sorting n elements in a universe U in Θ(n+|U|) time and Θ(|U|) space

back to lessons list

8 March 2023
Topics

A brief introduction to Python 3 language (I) (see here for a basic textbook on Python)

  • using the interpreter
  • variables, expressions, assignments
  • type conversions
  • input() and print() functions
  • the if: ... else: ... statement
  • boolean expressions
  • using Thonny
Resources

Thonny IDE for Windows, Mac or Linux, including a Python interpreter, can be downloaded here.

Recording

Lesson recording from 2021-2022 is available at this Zoom link Passcode: 1^6D#mUW
this video is for personal use only, and is restricted to students of Algorithms and Data Structures course of Statistics Dept., Sapienza.

back to lessons list

13 March 2023
Topics

A brief introduction (II) to Python language (see here for a basic textbook on Python)

  • defining functions with def
  • lists, slicing on sequences
  • while statement
  • for statement: scanning positions vs. scanning elements in a list
Resources A portable version of Thonny for Windows, including a Python interpreter, can be downloaded here.
Just unzip the thonny-3.zip compressed folder and execute file thonny.exe
Recording Lesson recording from 2021-2022 is available at this Zoom link Passcode: 6$ZQjz4A
this video is for personal use only, and is restricted to students of Algorithms and Data Structures course of Statistics Dept., Sapienza.

back to lessons list

15 March 2023
Topics

Divide and Conquer. Recursive implementation of D&C algorithms. Identify base cases, build subinstances, recursive calls, combine subsolutions.

Complexity of & algorithms: recurrence equations.

Master Theorem

Integer multiplication, Karatsuba's algorithm

Matrix multiplication, Strassen algorithm

back to lessons list

20 March 2023
Topics

Sorting via Divide & Conquer: Merge Sort.

Complexity analysis of Merge sort

Quick sort, analysis of quick sort: best case, worst case (find the code of Quick sort here).

Randomized Quick Sort: analysis of average case.

Quick sort vs. Merge sort: stability, in-place

Code
  • KaratsubaProduct.py
  • mergeSort.py
  • back to lessons list

    22 March 2023
    Topics

    k-th element selection algorithm, complexity analysis (find the code of k-selection here).

    Worst case linear time k-th element selection algorithm, (Blum et al., 1973)

    The concept of Abstract Data Type. Information hiding.

    Introduction to OO programming in Python.

    Implementation of a simple class Number_set (find it in numModule.py), with an example of use (find it in useNumberSet.py).

    back to lessons list

    27 March 2023
    Topics

    Pointer based structures: linked lists vs. Python lists (arrays)

    Primitives on pointer based lists: insertion, deletion, search

    Python implementation of linked lists linkedListClass.py. Use of linked lists in useLinkedList.py

    back to lessons list

    29 March 2023
    Topics

    Last In First Out (LIFO) and First In First Out (FIFO) policies: Stacks and Queues

    Implementation of a stack by a linked list StackModule.py.

    Implementation of a queue (FIFO) by a linked list QueueModule.py, uses class headTailLinkedListClass.py.

    Implementation of arrays, map functions for uni-dimensional and bi-dimensional arrays.

    Dictionaries: time vs. space tradeoffs, trivial mapping from key to integers.

    Dictionaries: implementation via lists, sorted lists, arrays, sorted arrays.

    back to lessons list

    3 April 2023
    Topics

    Hash tables: definition, INSERT/SEARCH/DELETE operations. Hash functions, collisions. External chaining

    Resizing the hash table, doubling and halving the table. Amortized analysis

    back to lessons list

    5 April 2023
    Topics

    Hash tables: external chaining vs open probing. Sequential/linear probing, quadratic probing, double hashing.

    Managing deletions in open probing.

    Trees: basic definitions. Binary trees.

    Height of a binary tree: worst case and best case. Complete binary trees.

    back to lessons list

    12 April 2023
    Topics

    Binary tree visits: pre-order, in-order. post-order visits

    Binary search trees: definition, searching in a binary search tree.

    Max, min, predecessor and successor search in binary search trees.

    Insertion and deletion in a binary search tree.

    Python implementation of Binary Search Tree.
    Code available in bstFirst.py (some methods must be written).

    back to lessons list

    17 April 2023
    Topics

    Python implementation of Binary Search Tree with dummy leaves.
    Code available in bstDummy.py (with rank and print_tree methods)

    back to lessons list

    19 April 2023
    Topics

    Rotations in binary search trees

    Balanced trees: red-black trees. Definitions, insertions. Worst case performances.

    The external memory model of computation: block transfer, I/O efficiency

    back to lessons list

    26 April 2023
    Topics

    B-trees and B*-trees: definition, height bound, searching in a B*-tree

    Insertion and deletion in a B*-tree

    Priority queues. Primitives.

    Complexity of priority queues primitives on linked lists or balanced search trees

    Heaps: definition. Get_min and Insert operation

    back to lessons list

    3 May 2023
    Topics

    Heaps: delete_min operation, complexity of delete_min.

    Implementation of heaps by lists

    Implementation of heapify operation.

    Building a heap in O(n) time: make_heap

    An implementation is available in priority.py

    An example of use is in UsePriority.py

    Algorithm heap sort. Complexity analysis of heap sort. Implementation available in heapSort.py

    Animation of several sorting algorithms can be seen at https://www.toptal.com/developers/sorting-algorithms

    Graphs: definitions and basic properties

    • directed, undirected
    • sparse vs dense graphs

    back to lessons list

    8 May 2023
    Topics

    Graphs: definitions and basic properties

    • multigraph
    • subgraph, induced subgraph
    • path, cycle, clique, bipartite, complete bipartite
    • connected graph, connected components
    • trees, forests
    • number of edges in a tree
    • spanning trees
    • planar graph, embedding, faces
    • characterization of planar graphs (Kuratowsky theorem)
    • relation among the number of vertices, edges and faces of a (connected) planar graph

    back to lessons list

    10 May 2023
    Topics

    Graphs: definitions and basic properties

    • transitive closure of a relation: reachability (directed, undirected) as transitive closure of edges
    • directed graphs: strongly connected components
    • acyclic graphs, DAGs, DAG of strongly connected components
    • Hamiltonian paths and Eulerian tours/paths: definitions.
    • Existence of Eulerian tours/paths by vertices degrees.
    • Graphs traversals: Depth First Search (DFS) and Breadth First Search (BFS)

    Dictionaries in Python

    Representing graphs: adjacency matrix vs. adjacency lists.

    A Python class implementing graphs by dictionaries: basic version in basicGraph.py

    Exercise: add methods for deleting vertices or edges

    back to lessons list

    15 May 2023
    Topics

    Pseudocode of BFS and DFS

    DFS trees and BFS tree, BFS trees as Single Source Shortest Path (SSSP) trees for unweighted graphs.

    Python implementations of BFS and DFS visit: Graph.py

    back to lessons list

    17 May 2023
    Topics

    Minimum spanning tree and minimum spanning forest.

    The cycle rule for minimum spanning trees.

    Outline of Kruskal's algorithm. Complexity of Kruskal's algorithm: O(m log n) variant.

    Python implementation of Kruskal's algorithm for minimum spanning forest: Graph.py

    back to lessons list

    22 May 2023
    Topics

    Definition of cut. The cut rule for minimum spanning trees.

    Prim's algorithm.

    Code is available in Graph.py

    back to lessons list

    24 May 2023
    Topics

    Single source shortest path trees for non-negative weighted graphs: Dijkstra algorithm.

    Python implementation of Dijkstra's algorithm: updated version of Graph.py using a min priority queue that implements primitive decrease_priority PriorityQueue.py

    Complexity of Dijkstra's algorithm

    The All Pairs Shortest Paths problem: relation to matrix multiplication.

    back to lessons list

    -->