Semester 2: Data Structures and Algorithms
Abstract Data Types - introduction, bags, iterators, arrays, multi-dimensional arrays
Abstract Data Types
Introduction
Abstract Data Types (ADTs) are models for data structures that define the data type in terms of its behavior from the point of view of a user. They encapsulate data and provide operations for data manipulation while hiding implementation details.
Bags
A bag is an unordered collection of items allowing duplicate elements. It offers operations such as adding elements, removing elements, and checking for the presence of an element. Bags are useful in scenarios where the number of occurrences is significant.
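As a sketch, the bag operations above might look like this in Python (the class and method names are illustrative; this is not a standard library type):

```python
class Bag:
    """An unordered collection that permits duplicate elements."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def remove(self, item):
        self._items.remove(item)  # removes one occurrence; raises ValueError if absent

    def contains(self, item):
        return item in self._items

    def count(self, item):
        """Number of occurrences -- the defining feature of a bag."""
        return self._items.count(item)

    def __len__(self):
        return len(self._items)


bag = Bag()
bag.add("apple")
bag.add("apple")
bag.add("pear")
print(bag.count("apple"))  # 2
```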
Iterators
Iterators are objects that allow traversing through a collection of items in a uniform manner without exposing the underlying implementation. They provide a way to access elements sequentially, implementing methods such as next, hasNext, and remove.
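The next/hasNext pattern described above maps onto Python's iterator protocol, where `__next__` plays the role of next and raising `StopIteration` replaces the hasNext check. A minimal hand-rolled iterator (illustrative class name):

```python
class RangeIterator:
    """Iterates over 0..n-1 without exposing the underlying counter."""

    def __init__(self, n):
        self._i = 0
        self._n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= self._n:      # the "hasNext" check
            raise StopIteration     # signals that the collection is exhausted
        value = self._i             # the "next" element
        self._i += 1
        return value


print(list(RangeIterator(4)))  # [0, 1, 2, 3]
```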
Arrays
Arrays are a collection of elements identified by index or key, supporting random access. They have a fixed size and allow efficient access to their elements. Operations include insertion, deletion, and traversal. Arrays are fundamental building blocks in many data structures.
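Python lists are dynamic, but the standard library `array` module gives a closer model of a classic fixed-type array with O(1) random access:

```python
from array import array

# A fixed-type array of signed ints; indexing is O(1) random access.
values = array("i", [10, 20, 30])
values[1] = 25          # update by index
print(values[1])        # 25
print(len(values))      # 3
```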
Multi-Dimensional Arrays
Multi-dimensional arrays are arrays of arrays, allowing the representation of data in multiple dimensions. Commonly used in mathematical computations, image processing, and scientific modeling, they support operations similar to one-dimensional arrays but include additional complexity regarding indexing.
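A two-dimensional array can be sketched in Python as a list of lists, indexed row-first (row-major order):

```python
# A 3x2 grid built as a list of lists (row-major indexing).
rows, cols = 3, 2
grid = [[0] * cols for _ in range(rows)]  # fresh inner list per row
grid[1][0] = 7          # row 1, column 0
print(grid)             # [[0, 0], [7, 0], [0, 0]]
print(len(grid), len(grid[0]))  # 3 2
```

Building each row with a comprehension (rather than `[[0] * cols] * rows`) matters: the latter would alias one row object three times.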
Algorithm Analysis - experimental studies, asymptotic analysis, recursion examples
Algorithm Analysis
Experimental Studies
Experimental studies in algorithm analysis involve conducting empirical tests to evaluate the performance of algorithms. This is typically done by implementing algorithms in a programming language and measuring factors such as execution time and memory usage across different input sizes. These studies help in understanding algorithms' real-world performance and trade-offs.
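A simple experimental setup can be sketched with `time.perf_counter` (the helper name is illustrative); taking the best of several repeats reduces timing noise:

```python
import time

def time_function(fn, arg, repeats=3):
    """Return the best-of-n wall-clock time for fn(arg), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        best = min(best, time.perf_counter() - start)
    return best

# Measure sorted() across doubling input sizes to observe growth.
for n in (1_000, 2_000, 4_000):
    data = list(range(n, 0, -1))   # reverse-sorted input
    print(n, time_function(sorted, data))
```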
Asymptotic Analysis
Asymptotic analysis focuses on the behavior of algorithms as the input size grows towards infinity. It provides a high-level understanding of an algorithm's efficiency. Key notations used include Big O notation for upper bounds, Omega notation for lower bounds, and Theta notation for tight bounds. This analysis helps in classifying algorithms based on their performance in terms of time and space complexities.
Recursion
Recursion is a method where a function calls itself to solve smaller instances of the same problem. Recursive algorithms can be analyzed for their time complexity and space complexity. It is important to establish base cases to prevent infinite recursion. Examples include calculating factorial numbers and the Fibonacci sequence, highlighting the differences between recursive and iterative implementations.
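The factorial example mentioned above, in both styles; the recursive version uses O(n) call-stack space, while the iterative one uses O(1) extra space:

```python
def factorial_recursive(n):
    """n! via recursion; the base case n <= 1 stops the self-calls."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)

def factorial_iterative(n):
    """The same computation as a loop."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

print(factorial_recursive(5))  # 120
print(factorial_iterative(5))  # 120
```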
Stacks, Queues, Deques - linked list implementations, circular and doubly linked lists, tree traversals
Stacks, Queues, Deques
Stacks
A stack follows the Last In First Out (LIFO) principle. It supports two main operations: push (to add an element) and pop (to remove the top element). When implemented using a linked list, each element points to the next, with the top of the stack being the head of the list.
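A linked-list stack as described above, with the head of the list as the top (illustrative class names):

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class Stack:
    """LIFO stack; the head of the linked list is the top."""

    def __init__(self):
        self._top = None

    def push(self, value):
        self._top = Node(value, self._top)   # new node points at the old top

    def pop(self):
        if self._top is None:
            raise IndexError("pop from empty stack")
        value = self._top.value
        self._top = self._top.next           # unlink the old top
        return value


s = Stack()
s.push(1)
s.push(2)
print(s.pop())  # 2 -- last in, first out
```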
Queues
A queue operates on the First In First Out (FIFO) principle. It includes operations like enqueue (to add an element to the rear) and dequeue (to remove an element from the front). A linked list implementation that keeps references to both the front and rear nodes enables dynamic sizing with O(1) addition and removal of elements.
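A linked-list queue keeping both front and rear references, so enqueue and dequeue are each O(1) (illustrative class names):

```python
class _Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    """FIFO queue: enqueue at the rear, dequeue from the front."""

    def __init__(self):
        self._front = None
        self._rear = None

    def enqueue(self, value):
        node = _Node(value)
        if self._rear is None:       # empty queue: node is both ends
            self._front = node
        else:
            self._rear.next = node
        self._rear = node

    def dequeue(self):
        if self._front is None:
            raise IndexError("dequeue from empty queue")
        value = self._front.value
        self._front = self._front.next
        if self._front is None:      # queue became empty
            self._rear = None
        return value


q = Queue()
q.enqueue("a")
q.enqueue("b")
print(q.dequeue())  # 'a' -- first in, first out
```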
Deques
A double-ended queue (deque) allows insertion and removal of elements from both ends. Linked list implementations maintain pointers to both the front and rear nodes, enabling flexibility in operations.
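Python's `collections.deque` provides exactly these double-ended operations, each in O(1):

```python
from collections import deque

d = deque()
d.append("rear")        # insert at the rear
d.appendleft("front")   # insert at the front
print(d.pop())          # 'rear'  -- remove from the rear
print(d.popleft())      # 'front' -- remove from the front
```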
Circular Linked Lists
In circular linked lists, the last node points back to the first node, creating a loop. This structure can be useful for implementing circular queues and allows for efficient traversal from any node.
Doubly Linked Lists
Doubly linked lists have nodes that contain pointers to both the next and previous nodes. This allows for more flexible traversal (both forwards and backwards) and makes certain operations easier, such as deletion and insertion at both ends.
Tree Traversals
Tree traversals are methods for visiting all the nodes in a tree data structure. The depth-first orders — in-order, pre-order, and post-order — can be implemented with recursion or an explicit stack, while level-order (breadth-first) traversal uses a queue.
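Two of those traversals sketched on a tiny binary tree — recursive in-order (depth-first) and queue-based level-order (breadth-first); class and function names are illustrative:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node, out):
    """Left subtree, then node, then right subtree."""
    if node is not None:
        inorder(node.left, out)
        out.append(node.value)
        inorder(node.right, out)

def level_order(root):
    """Visit nodes level by level using a queue."""
    out, queue = [], deque([root] if root else [])
    while queue:
        node = queue.popleft()
        out.append(node.value)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out

#       2
#      / \
#     1   3
root = TreeNode(2, TreeNode(1), TreeNode(3))
result = []
inorder(root, result)
print(result)             # [1, 2, 3]
print(level_order(root))  # [2, 1, 3]
```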
Priority Queues - implementations, heaps, sorting
Priority Queues
Introduction to Priority Queues
Priority queues are abstract data types that allow elements to be stored with associated priorities. Elements can be removed based on their priority rather than their order of insertion.
Implementations of Priority Queues
Priority queues can be implemented using various data structures such as arrays, linked lists, and heaps. Each implementation has its own advantages and disadvantages in terms of performance and complexity.
Heaps as a Priority Queue Implementation
Heaps are a common way to implement priority queues. A binary heap is a complete binary tree that maintains the heap property—each parent node is less than or equal to its child nodes for a min-heap, or greater than or equal to its child nodes for a max-heap.
Heap Operations
Key operations for heaps include insertion, deletion, and heapify. Insertion adds an element and sifts it up to restore the heap property, whereas deletion typically removes the root element and sifts its replacement down to restructure the heap.
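These operations are available directly through Python's `heapq` module, which maintains a binary min-heap inside an ordinary list:

```python
import heapq

heap = []
for value in (5, 1, 4, 2, 3):
    heapq.heappush(heap, value)   # insertion: sift up to keep the heap property

print(heapq.heappop(heap))  # 1 -- removes the root (minimum), then sifts down
print(heap[0])              # 2 -- the new minimum sits at the root

nums = [9, 7, 8]
heapq.heapify(nums)           # heapify: build a valid min-heap in place, O(n)
print(nums[0])              # 7
```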
Sorting with Priority Queues
Priority queues can be utilized in sorting algorithms such as heapsort. In heapsort, the input is first organized into a heap (heapify) and elements are then removed in priority order, producing a sorted list.
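A heapsort sketch using the same `heapq` module: O(n) to build the heap, then n removals at O(log n) each, for O(n log n) overall:

```python
import heapq

def heapsort(items):
    """Sort by building a min-heap, then repeatedly removing the minimum."""
    heap = list(items)
    heapq.heapify(heap)       # O(n) heap construction
    # n pops, each O(log n); range(len(heap)) is evaluated before any pop
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([5, 2, 8, 1, 9]))  # [1, 2, 5, 8, 9]
```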
Applications of Priority Queues
Priority queues are widely used in various applications including scheduling tasks, managing resources in operating systems, and implementing Dijkstra's algorithm for shortest paths in graphs.
Maps, Hash Tables, Skip Lists - dictionaries, sorted maps, sets, multimaps
Maps, Hash Tables, Skip Lists
Maps
Maps are a collection of key-value pairs. Each key is unique, and it maps to a specific value. Common operations include adding, removing, and retrieving values based on their keys. Maps can be implemented using various data structures, including hash tables and trees.
Hash Tables
Hash tables use a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. This allows for average-case constant time complexity for search, insert, and delete operations. Collisions — two keys hashing to the same index — are unavoidable in general and are handled by strategies such as separate chaining or open addressing; a poor hash function makes them frequent and degrades performance.
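A minimal hash table with separate chaining, sketched in Python (class name and fixed bucket count are illustrative; a real table would also resize as it fills):

```python
class ChainedHashTable:
    """Hash table with separate chaining: colliding keys share a bucket list."""

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # hash() modulo the bucket count maps any key to one bucket
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: append to the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)


table = ChainedHashTable(4)
table.put("a", 1)
table.put("b", 2)
print(table.get("a"))  # 1
```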
Skip Lists
Skip lists are a probabilistic alternative to balanced trees. They consist of multiple layers of linked lists where each higher layer acts as an express lane for the lower layers, allowing for efficient search, insertion, and deletion operations. The average time complexity is O(log n), making them a suitable option for dynamic sets of ordered elements.
Dictionaries
Dictionaries are abstract data types that define a collection of key-value pairs. They offer efficient retrieval and updating of values based on their associated keys. Python dictionaries are implemented as hash tables, while other programming languages may have different implementations.
Sorted Maps
Sorted maps, also known as ordered dictionaries, maintain their keys in sorted order. This allows efficient range queries and ordered traversal. Common implementations are based on balanced search trees such as red-black trees and AVL trees.
Sets
Sets are collections that do not allow duplicate elements. They support operations like union, intersection, and difference. Implementation of sets varies from hash tables (for average-case O(1) operations) to binary search trees for ordered sets.
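Python's built-in `set` (hash-based, average O(1) membership) supports these operations directly:

```python
a = {1, 2, 3}
b = {3, 4}
print(a | b)   # union: {1, 2, 3, 4}
print(a & b)   # intersection: {3}
print(a - b)   # difference: {1, 2}
print(2 in a)  # True -- average O(1) membership test
```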
Multimaps
Multimaps are similar to maps, but they allow multiple values to be associated with a single key. This data structure is useful for cases where a single key can produce multiple related values. Implementations can vary from lists of values stored in a regular map to specialized tree structures.
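The first implementation mentioned — a regular map from each key to a list of its values — is easy to sketch with `collections.defaultdict`:

```python
from collections import defaultdict

# A multimap as a map from each key to a list of its values.
multimap = defaultdict(list)
multimap["fruit"].append("apple")
multimap["fruit"].append("pear")
multimap["veg"].append("leek")

print(multimap["fruit"])  # ['apple', 'pear'] -- one key, several values
```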
Search Trees - binary search trees, AVL trees, splay trees, balanced trees
Search Trees
Binary Search Trees
A binary search tree (BST) is a tree data structure that maintains sorted order. Each node has at most two children, referred to as the left and right child. The left subtree contains nodes with keys less than the parent's key, while the right subtree contains nodes with keys greater than the parent's key. This property facilitates efficient searching, insertion, and deletion operations with average-case time complexity of O(log n), although an unbalanced tree (for example, one built from already-sorted input) degrades to O(n).
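A BST insert and search sketch in Python (illustrative names; duplicates are silently ignored here):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key, returning the (possibly new) subtree root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                      # equal keys: no change

def search(root, key):
    """Walk left or right by comparing keys; O(height) time."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None


root = None
for k in (5, 3, 8, 1):
    root = insert(root, k)
print(search(root, 8), search(root, 7))  # True False
```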
AVL Trees
AVL trees are a type of self-balancing binary search tree. In an AVL tree, the difference in heights between the left and right subtrees of any node (known as the balance factor) is at most one. This ensures that the tree remains approximately balanced, which helps maintain O(log n) time complexity for search, insertion, and deletion operations. AVL trees perform rotations to maintain balance after insertions and deletions.
Splay Trees
Splay trees are a type of self-adjusting binary search tree. They use a mechanism called splaying to move accessed nodes closer to the root, which allows frequently accessed elements to be reached faster. Splaying is performed through a series of tree rotations. While the amortized cost per operation is O(log n), a single operation can take O(n) in the worst case. Splay trees can therefore be beneficial when certain elements are accessed much more frequently than others.
Balanced Trees
Balanced trees refer to a variety of trees that maintain a balanced structure to ensure efficient performance. Examples include AVL trees and Red-Black trees. Balanced trees provide logarithmic time complexity for search, insertion, and deletion operations by ensuring that the height of the tree remains logarithmic relative to the number of nodes. This ability to maintain balance is crucial in applications requiring frequent updates and queries.
Sorting and Selection - merge sort, quick sort, comparison
Sorting and Selection
Introduction to Sorting
Sorting is the process of arranging data in a specific order, usually ascending or descending. It is a fundamental operation in computer science and is used in various applications such as search algorithms, database management, and data analysis.
Merge Sort
Merge Sort is a divide-and-conquer algorithm. It divides the input array into two halves, recursively sorts them, and then merges the sorted halves. Merge sort has a time complexity of O(n log n) and is stable, meaning it maintains the relative order of equal elements.
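A merge sort sketch; the `<=` comparison in the merge step is what makes the sort stable:

```python
def merge_sort(items):
    """Divide, recursively sort each half, then merge the sorted halves."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # <= keeps equal elements in order
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])              # one of these is already empty
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 8, 1]))  # [1, 2, 5, 8]
```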
Quick Sort
Quick Sort is also a divide-and-conquer algorithm. It selects a 'pivot' element and partitions the array into elements less than and greater than the pivot. It recursively sorts the partitions. Quick sort has an average time complexity of O(n log n) but can degrade to O(n^2) in the worst case. It is not stable.
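A readable (not in-place) quick sort sketch showing the pivot-and-partition idea; production quicksorts partition the array in place instead of building new lists:

```python
def quick_sort(items):
    """Partition around a pivot, then recursively sort each side."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]          # middle element as pivot
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quick_sort(less) + equal + quick_sort(greater)

print(quick_sort([3, 6, 1, 6]))  # [1, 3, 6, 6]
```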
Comparison between Merge Sort and Quick Sort
Both Merge Sort and Quick Sort have O(n log n) average time complexity, but their performance can differ based on the nature of the data. Merge sort is preferred for linked lists and whenever stability or a guaranteed O(n log n) bound is required, at the cost of O(n) auxiliary space. Quick sort is typically faster in practice thanks to its in-place, cache-friendly partitioning, but it is not stable and can degrade to O(n^2).
Selection Algorithms
Selection algorithms are used to find the k-th smallest or largest element in a list. The most common example is the QuickSelect algorithm, which reuses Quick Sort's partitioning step but recurses into only one side, giving average O(n) time.
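A QuickSelect sketch with a random pivot (recursing into only the partition that can contain the answer is what brings the average cost down to O(n)):

```python
import random

def quickselect(items, k):
    """Return the k-th smallest element (k is 0-based), average O(n)."""
    pivot = random.choice(items)
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    if k < len(less):                    # answer lies among the smaller values
        return quickselect(less, k)
    if k < len(less) + len(equal):       # answer equals the pivot
        return pivot
    # answer lies among the larger values; shift k accordingly
    return quickselect(greater, k - len(less) - len(equal))

print(quickselect([7, 2, 9, 4, 1], 2))  # 4 -- the third smallest
```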
Applications of Sorting and Selection
Sorting and selection algorithms are widely used in data analysis, searching applications, and in algorithms that require data ordering, such as binary search.
Graph Algorithms - graph data structures, traversals, shortest paths, minimum spanning trees
Graph Algorithms
Graph Data Structures
Graphs are fundamental data structures composed of vertices and edges. Common representations include adjacency lists and adjacency matrices. An adjacency list uses a list to store all adjacent vertices for each vertex, offering efficient space usage. An adjacency matrix uses a 2D array, providing O(1) time complexity for edge lookups at the cost of O(V^2) space.
Graph Traversals
Graph traversal algorithms visit all vertices in a graph. The primary traversal techniques are Depth-First Search (DFS) and Breadth-First Search (BFS). DFS explores as far down a branch as possible before backtracking, utilizing a stack or recursion. BFS explores all neighbors before moving to the next level, using a queue.
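Both traversals on a small adjacency-list graph — DFS with an explicit stack, BFS with a queue (the graph and vertex names are illustrative):

```python
from collections import deque

graph = {                     # adjacency-list representation
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def dfs(graph, start):
    """Depth-first: follow one branch to its end before backtracking."""
    visited, stack, order = set(), [start], []
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            order.append(node)
            stack.extend(reversed(graph[node]))  # keep left-to-right order
    return order

def bfs(graph, start):
    """Breadth-first: visit all neighbours before going deeper."""
    visited, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order

print(dfs(graph, "A"))  # ['A', 'B', 'D', 'C']
print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']
```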
Shortest Paths
Finding the shortest path in a graph can be accomplished with various algorithms. Dijkstra's algorithm is suitable for graphs with non-negative weights and uses a priority queue for efficiency. The Bellman-Ford algorithm can handle graphs with negative weights but is slower. The Floyd-Warshall algorithm computes shortest paths between all pairs of vertices.
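A sketch of Dijkstra's algorithm with a `heapq`-based priority queue; stale queue entries are skipped rather than updated in place, a common simplification:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph maps each vertex to
    (neighbour, weight) pairs and all weights must be non-negative."""
    dist = {source: 0}
    pq = [(0, source)]                     # priority queue of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                       # stale entry: already improved
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w            # relax the edge u -> v
                heapq.heappush(pq, (dist[v], v))
    return dist


graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2)],
    "C": [],
}
print(dijkstra(graph, "A"))  # {'A': 0, 'B': 1, 'C': 3}
```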
Minimum Spanning Trees
A minimum spanning tree (MST) connects all vertices in a graph with the least total edge weight. Prim's algorithm grows the MST by adding the smallest edge from the tree to an outside vertex, using a priority queue. Kruskal's algorithm sorts all edges and adds them one by one to the MST, ensuring no cycles are formed by using union-find data structures.
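A Kruskal's algorithm sketch using a simple union-find with path compression to detect cycles (vertex numbering and edge format are illustrative):

```python
def kruskal(num_vertices, edges):
    """Kruskal's MST: take edges in weight order, skip those forming a cycle.

    edges is a list of (weight, u, v) with vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):                         # union-find root, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for weight, u, v in sorted(edges):   # edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                     # different components: edge is safe
            parent[ru] = rv              # union the two components
            mst.append((u, v, weight))
            total += weight
    return mst, total


edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3)]
mst, total = kruskal(4, edges)
print(total)  # 6 -- edges of weight 1, 2, and 3
```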
