days since this article was written, please be aware of its timeliness
Preface
Although I know I’ve worked hard and even joined big companies, I’ve always felt weak in the fundamentals of computer science. Since I don’t have a formal academic background in this field, I’ve always hoped to bridge the gap with my peers who do. Among the three big dreams of a programmer (languages/algorithms/graphics), data structures and algorithms have been a lingering insecurity for me. During my free time, I often browse tech communities and get drawn in by clickbait titles about data structures and algorithms, only to find the content hollow and unsubstantial. Finally, I made up my mind to seriously study algorithms (the basics).
This article is the product of my learning journey—a summary and reference while reading Learning JavaScript Data Structures and Algorithms (2nd Edition). I started with data structures, then moved on to various algorithms related to them. Most implementations use ES5 methods, with a few using ES6. The focus is on achieving functionality without considering tangential issues like execution overhead (e.g., adding methods to each instance instead of using the prototype chain).
Stack
Basic Concepts and Methods of Stacks
A stack is an ordered collection that follows the Last-In-First-Out (LIFO) principle. Newly added or to-be-removed elements are stored at one end of the stack, called the top, while the other end is called the bottom. In a stack, new elements are close to the top, and old elements are near the bottom. A classic analogy is a stack of books. The stack data structure includes the following methods:
push(element): Adds one or more elements.pop(): Removes the top element and returns it.peek(): Returns the top element without modifying the stack.isEmpty(): Checks if the stack is empty.clear(): Clears the stack.size(): Returns the number of elements in the stack.
JavaScript Implementation of a Stack
Applications of Stacks
Queue
Basic Concepts and Methods of Queues
A queue is an ordered collection of items that follows the First-In-First-Out (FIFO) principle. New elements are added at the tail, and elements are removed from the head. The newest element must be placed at the end of the queue. The queue data structure includes the following methods:
enqueue(element): Adds one or more elements to the tail of the queue.dequeue(): Removes the first element (the oldest added) and returns it.front(): Returns the first element (the oldest added) without modifying the queue.isEmpty(): Checks if the queue is empty.size(): Returns the number of elements in the queue.
JavaScript Implementation of a Queue
Linked List
Basic Concepts and Methods of Linked Lists
Each item in a linked list not only stores its own value but also a pointer to the next item—this is called a singly linked list. If each item also stores a pointer to the previous item (except for the first/last items, where the previous/next pointers are null), it’s called a doubly linked list. If a doubly linked list connects its head and tail, it’s called a doubly circular linked list. Here’s an illustration (the arrows represent pointers, which are part of the item at the tail of the arrow):
- Singly Linked List
graph RL A(Tail) --> B(第三项) --> C(第二项) --> D(Head) ``` 2. Doubly Linked List ```mermaid graph LR A(Head) --> B(第二项) --> A B --> C(第三项) --> B C --> D(Tail) --> C ``` 3. Circular Linked List ```mermaid graph LR A(Head) --> B(第二项) --> A B --> C(第三项) --> B C --> D(Tail) --> C A --> D --> A ``` Linked lists have the following methods: 1. `append(element)` Adds a new item to the end of the linked list 2. `insert(position, element)` Inserts a new item at a specific position in the linked list 3. `remove(element)` Removes an item from the linked list 4. `indexOf(element)` Returns the index of the element in the linked list, or -1 if not found 5. `removeAt(position)` Removes the item at the specified position in the linked list 6. `isEmpty()` Checks if the linked list is empty 7. `size()` Returns the number of items in the linked list ### JavaScript Implementation of Linked Lists 1. [Singly Linked List ](https://static.xheldon.cn/example-code/2019/LinkedList.js){:data-open=""} 2. [Doubly Linked List ](https://static.xheldon.cn/example-code/2019/DoublyLinkedList.js){:data-open=""} 3. [Circular Linked List ](https://static.xheldon.cn/example-code/2019/CircularLinkedList.js){:data-open=""} ## Sets ### Basic Concepts and Methods of Sets A set is a collection of unordered and unique (non-repeating) items. Sets have the following methods: 1. `add(value)` Adds a new item to the set 2. `remove(value)` Removes an item from the set 3. `has(value)` Checks if an item exists in the set 4. `clear()` Empties the set 5. `size()` Returns the size of the set 6. `values()` Returns an array containing all values in the set 7. `union(set)` Returns the union with another set 8. `intersection(set)` Returns the intersection with another set 9. `difference(set)` Returns the difference (current set minus the specified set) 10. `subset(set)` Checks if one set is a subset of another ### JavaScript Implementation of Sets Typically implemented using objects or ES6's Set. Details omitted. ## Dictionaries ### Basic Concepts of Dictionaries A set represents a group of distinct elements, stored as `{value: value}` in objects. In dictionaries, the format is `{key: value}`, also known as `maps`. Dictionaries usually have the following methods: 1. `set(key, value)` Adds an item to the dictionary 2. `delete(key)` Removes the item with the specified key 3. `has(key)` Checks if an item with the key exists in the dictionary 4. `get(key)` Finds and returns the item with the specified key 5. `clear()` Empties the dictionary 6. `size()` Returns the size of the dictionary 7. `keys()` Returns an array of all keys in the dictionary 8. `values()` Returns an array of all values in the dictionary ### JavaScript Implementation of Dictionaries Omitted. ## Hash Tables A hash table, or HashTable (array-based) / HashMap (object-based), is a hashed implementation of the dictionary data structure. If a dictionary needs to find an item, the worst-case scenario requires traversing the entire structure. With a hash table and its accompanying hashing algorithm, the exact location of the value is known, enabling fast retrieval. The purpose of a hash function is to take a key and return the address of the value in the hash table. ### Uses of Hash Tables You might ask, why do we need hash tables? In JavaScript, directly using objects to achieve key-value mapping already offers O(1) speed, right? That's correct, but consider what to do if there are duplicate keys. For example, if you want to build a roster to record information about different classmates, how would you handle classmates with the same name? This is where an algorithm is needed to ensure that each classmate's name, when used as input, produces a unique output, enabling a `one-to-one correspondence` and thus achieving an O(1) algorithm. This is called perfect hashing, known in mathematics as a total injective function. However, storing millions of data entries this way would make the `Object` extremely large, leading to performance issues. Hence, a hashing algorithm is needed to balance time and space. Taking the roster example again, if there are classmates with the same name, the name is still used as the key, but the value is no longer just one classmate's information—it becomes an array (or linked list) of classmates' information. Given a classmate's name, if the corresponding value is found to be an array, the array is traversed until the exact classmate's information is found. This balances time (longer than the previous O(1) ❌) and space (smaller than the space occupied by the previous Object ✅). A good hashing algorithm should ensure that the arrays under each key are roughly similar in size. ### Basic Methods of Hash Tables 1. `put(key, value)` – Add an item to the hash table. 2. `remove(key)` – Remove an item from the hash table based on the key. 3. `get(key)` – Retrieve an item based on the key. ### The Simplest Hash Function > Taking strings as input (ignoring identical strings for now) and using an array as the data structure, the key can serve as the index. ```js let loseloseHashCode = function (key) { let hash = 0; for (let i = 0; i < key.length; i++) { hash += key.charCodeAt(i); } return hash % 37; };
Resolving Collisions in Hash Tables
From the above hashing algorithm, it’s clear that the computed keys can collide, meaning the same key in the array might correspond to different values, leading to overwrites. Thus, collisions need to be resolved. There are three approaches:
Separate Chaining– If the hashing algorithm produces the same key, the key no longer points directly to the value but to an array/linked list, transforming the problem into array/linked list insertion/search.Linear Probing– If the hashing algorithm produces the same key, the key is automatically incremented by 1 to attempt storage again. If the key still collides, it continues incrementing until an empty slot is found (suitable for hash tables with numeric keys, like arrays). Some excellent implementations for numeric keys can be found here.Double Hashing– If the hashing algorithm produces the same key, another (or the same) hash function is applied to the collision to generate a new key.
JavaScript Implementations of Hash Tables
Trees
Tree Terminology
First, look at the structure diagram:
graph TD;
11---7;
11---15;
7---5;
7---9;
5---3;
5---6;
9---8;
9---10;
15---13;
15---20;
13---12;
13---14;
20---18;
20---25;
Where:
- 11 is at level 0, 7/15 at level 1, 5/9/13/20 at level 2, and the rest at level 3.
- 11 is called the root node, and the others are child nodes. Nodes with at least one child are called internal nodes, while nodes without children are called external nodes.
- Nodes have a depth property. For example, node 3 has three ancestor elements, so its depth is 3.
- The height of the tree depends on the maximum depth among all nodes.
- This is a complete binary tree because, except for the bottom layer, all nodes have two children.
Note: The same data can form several tree structures, not necessarily a unique representation. This is why the
AVL treeexists—it aims to construct the data into as complete a tree as possible.
Binary Trees and Binary Search Trees
A binary tree node can have at most two child nodes: one left child node and one right child node. A Binary Search Tree (BST) is a type of binary tree, but it only allows storing values smaller than the node on the left side of any node and values greater than or equal to the node on the right side. The diagram shows a binary search tree with the following methods:
insert(key)Insert a new nodesearch(key)Search for a node or return nullinOrderTraverseIn-order traversal of all nodespreOrderTraversePre-order traversal of all nodespostOrderTraversePost-order traversal of all nodesminReturn the smallest value in the treemaxReturn the largest value in the treeremove(key)Remove an item from the tree
Tree Traversal Diagram:
-
In-order traversal—Commonly used for sorting operations on the tree
graph TD; 11---7; 11---15; 7---5; 7---9; 5---3; 5---6; 9---8; 9---10; 15---13; 15---20; 13---12; 13---14; 20---18; 20---25; 3-.->5 5-.->6; 6-.->7; 7-.->8; 8-.->9; 9-.->10; 10-.->11; 11-.->12; 12-.->13; 13-.->14; 14-.->15; 15-.->18; 18-.->20; 20-.->25; ``` Traversal order: 3 5 6 7 8 9 10 11 12 13 14 15 18 20 25 2. **Pre-order traversal**—Often used to print a structured document. The traversal order in the example diagram is: 11 7 5 3 6 9 8 10 15 13 12 14 20 18 25 3. **Post-order traversal**—Applied to calculate the total file size of a directory and its subdirectories. The traversal order in the example diagram is: 3 6 5 8 10 9 7 12 14 13 18 25 20 15 11 ### Adelson-Velskii-Landi Tree (AVL Tree) An AVL tree is a self-balancing tree. A binary tree may sometimes result in one subtree being too deep while others are too shallow, as shown below: ```mermaid graph TD; 11---7; 11---15; 7---5; 7---9; 5---3; 5---6; 9---8; 9---10; 15---13; 15---20; 13---12; 13---14; 20---18; 20---25; 25---27 27---30 ``` This tree has an overly deep right side, which may cause performance issues when searching, adding, or deleting nodes. Hence, the need for an AVL self-balancing tree. An AVL tree is a self-balancing binary search tree where the height difference between the left and right subtrees of any node is at most 1. This means the tree attempts to remain as close to a complete tree as possible when inserting or removing nodes. #### Balance Factor and Calculation in AVL Trees Inserting and deleting nodes in an AVL tree is identical to a BST. The difference lies in checking the `balance factor` during insertion/deletion and applying self-balancing if necessary. In an AVL tree, the difference between the height of the right subtree (hr) and the left subtree (hl) must be calculated for each node. If this value (hr - hl) is not one of 0, 1, or -1, the AVL tree needs balancing. This is the concept of the balance factor. The following three trees are all balanced: ```mermaid graph TD; subgraph 第三种 A((+1))---B(( -1)) A---C(( -1)) B---E((0)) C---D((+1)) C---F((0)) D---H((0)) end subgraph 第二种 I((+1))---G((0)) I---K(( -1)) K---L((0)) end subgraph 第一种 M(( -1))---N((0)) end ``` #### Rotations in AVL Trees When inserting a node into an AVL tree, two types of balancing operations can be performed: single rotation or double rotation, corresponding to four scenarios: 1. **Right-Right (RR)**: Single rotation to the left 2. **Left-Left (LL)**: Single rotation to the right 3. **Left-Right (LR)**: Double rotation to the right 4. **Right-Left (RL)**: Double rotation to the left **Right-Right (RR)**: Diagram of a single rotation to the left (Mermaid cannot draw this as a binary tree, so 80 is on the left and 50 on the right): ```mermaid graph TD; subgraph 之后 G((70))---H((50)) G---I((80)) I---J((90)) H---K((30)) H---L((60)) end subgraph 之前 A((50))---B((30)) A---C((70)) C---D((60)) C---E((80)) E---F((90)) end ``` **Left-Left (LL)**: Diagram of a single rotation to the right (Mermaid cannot draw this as a binary tree, so 50 is on the left and 10 on the right): ```mermaid graph TD; subgraph 之后 G((30))---H((10)) G---I((50)) I---J((40)) I---K((70)) H---L((5)) end subgraph 之前 A((50))---C((30)) A---B((70)) C---D((10)) C---E((40)) D---F((5)) end ``` **Left-Right (LR)**: Diagram of a double rotation to the right (essentially performing an RR rotation followed by an LL rotation): ```mermaid graph TD; subgraph 之后 G((40))---H((30)) G---I((50)) H---J((10)) H---K((35)) J---L((70)) end subgraph 之前 A((50))---C((30)) A---B((70)) C---D((10)) C---E((40)) E---F((35)) end ``` **Right-Left (RL)**: Diagram of a double rotation to the left (essentially performing an LL rotation followed by an RR rotation; Mermaid cannot draw this as a binary tree, so 80 is on the left and 70 on the right): ```mermaid graph TD; subgraph 之后 G((72))---I((70)) G---H((80)) H---J((75)) H---K((90)) I---L((50)) end subgraph 之前 A((70))---B((50)) A---C((80)) C---D((72)) C---E((90)) D---F((75)) end
Although AVL trees are self-balancing, their performance for node insertion or removal is not always optimal. A better alternative is the red-black tree, ](http://goo.gl/OxED8K) which can efficiently traverse its nodes in order.
JavaScript Implementations of Basic Tree Methods
- ](https://static.xheldon.cn/example-code/2019/BinarySearchTree.js) Binary Tree
- ](https://static.xheldon.cn/example-code/2019/AVLTree.js) AVL Tree
- ](https://static.xheldon.cn/example-code/2019/RedBlackTree.js) Red-Black Tree
- ](http://goo.gl/SFlhW6) Heap Tree
Graphs
Basic Concepts of Graphs
A graph is an abstract model of a network structure. It represents a set of nodes (or vertices) connected by edges. Any binary relationship can be expressed using graphs. They can be employed to solve problems like finding the shortest path between two points.
Mathematical Definition of Graphs
A graph G = (V, E) consists of the following elements:
- V: A set of vertices
- E: A set of edges connecting the vertices in V
Below is an illustration of a graph:
graph TD;
A---B
B---E
B---F
E---I
A---C
A---D
C---D
C---G
D---G
D---H
Graph Terminology
In the above graph:
- Vertices connected by an edge are called
adjacent vertices. - The
degreeof a vertex is the number of its adjacent vertices. In the graph above, A is adjacent to three other vertices, so its degree is 3; E has a degree of 2. - A
pathis a continuous sequence of vertices v1, v2, …, vk, where vi and v(i+1) are adjacent. The graph above includes paths like ABEI and ACDG. - A
simple pathdoes not contain repeated vertices, such as ADG. Excluding the last vertex (since it’s the same as the first), acycleis also a simple path, like ADCA (where the last vertex returns to A). - If a graph contains no cycles, it is called
acyclic. If there is a path between every pair of vertices, the graph isconnected. - A graph can be
directed(edges have direction) orundirected(edges have no direction). The graph above is anundirected graph. - If there is a bidirectional path between every pair of vertices, the graph is
strongly connected. - Graphs can also be
unweightedorweighted, where edges in a weighted graph are assigned values.
Graph Representation
Adjacency Matrix
The most common implementation of a graph is the adjacency matrix. Each node is associated with an integer, which serves as an index for an array. A two-dimensional array represents connections between vertices. If the node with index i is adjacent to the node with index j, then array[i][j] === 1; otherwise, array[i][j] === 0. The graph above can be represented by the following adjacency matrix:
| A | B | C | D | E | F | G | H | I | |
|---|---|---|---|---|---|---|---|---|---|
| A | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| B | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
| C | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| D | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 |
| E | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| F | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| G | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| H | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| I | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
A graph that is not strongly connected is called a sparse graph. If represented by an adjacency matrix, the matrix will contain many zeros, meaning we waste computer storage space representing edges that don’t actually exist.
Adjacency List
An adjacency list consists of a list of adjacent vertices for each vertex in the graph. There are several ways to represent this data structure. Lists (arrays), linked lists, hash tables, or dictionaries can be used to represent the adjacent vertex lists. The above graph can be represented by the following adjacency list:
| A | B | C | D | |
|---|---|---|---|---|
| B | A | E | F | |
| C | A | D | G | |
| D | A | C | G | H |
| E | B | I | ||
| F | B | |||
| G | C | D | ||
| H | D | |||
| I | E |
Incidence Matrix (Omitted)
Graph-Related Methods
addVertex(v)Add a new vertexaddEdge(v, w)Add a new edge
Graph Traversal
Graph traversal can be used to find specific vertices or paths between two vertices, check if the graph is connected, or determine if the graph contains cycles. The basic idea of graph traversal algorithms is to track each node when it is first visited and keep track of which nodes have not been fully explored (i.e., they still have unexplored child nodes). For both search (traversal) modes described below, the first vertex to be visited must be explicitly specified. Fully exploring a vertex requires examining every edge of that vertex. For each edge connected to an unvisited vertex, mark it as discovered and add it to the list of vertices to be visited. To ensure algorithm efficiency, each vertex must be visited at most twice. In a connected graph, every edge and vertex will be visited. The visitation status of vertices can be represented by three states: unvisited / visited but not fully explored / fully explored.
Breadth-First Search (BFS) for Graphs
The data structure for the list of vertices to be visited is a queue. Vertices are stored in the queue, and the first vertex enqueued is the first to be explored.
Breadth-First Search is used to find the shortest path (implementation provided below). The Dijkstra's algorithm solves the single-source shortest path problem (greedy algorithm). The Bellman-Ford algorithm solves the single-source shortest path problem with negative edge weights. The A* search algorithm solves the shortest path problem between a single pair of vertices, using heuristics to speed up the search process. The Floyd-Warshall algorithm solves the all-pairs shortest path problem (dynamic programming algorithm).
Depth-First Search (DFS) for Graphs
The data structure for the list of vertices to be visited is a stack. Vertices are stored in the stack and explored along a path, visiting new adjacent vertices as they are encountered. When we need to determine the execution order of certain tasks or steps, this is called topological sorting (topsort/toposort). We can use Depth-First Search to perform topological sorting. Topological sorting can only be applied to Directed Acyclic Graphs (DAGs), with a specific implementation provided below.
Minimum Spanning Tree (MST)
The Minimum Spanning Tree is a common problem in network design. For example, if your company has several offices and you want to connect them with telephone lines at the lowest cost to save money, what is the best approach? This can also be applied to the island-bridge problem. Imagine you want to build bridges between n islands and connect all islands at the lowest cost. Both problems can be solved using MST algorithms:
-
Prim’s Algorithm
A greedy algorithm used to solve the MST problem for weighted undirected connected graphs. It finds a subset of edges that forms a tree including all vertices, with the minimum possible total edge weight. -
Kruskal’s Algorithm
Another greedy algorithm for solving the MST problem in weighted undirected connected graphs.
JavaScript Implementation of Graphs
Sorting and Searching Algorithms
Sorting Algorithms
Bubble SortComplexity O(n2): Two nested loops; swaps values in the inner loop if the condition is met.Selection SortComplexity O(n2): Two nested loops; the inner loop finds the minimum value and places it in the first position, then finds the second smallest value and places it in the second position, and so on.Insertion SortComplexity O(n2): Starts from the beginning, assuming the preceding elements are already sorted, then determines the position of the current value in the sorted array and inserts it, continuing to the next value.Merge SortComplexity O(nlogn): Recursive approach; divides the array into smaller arrays, then sorts them pairwise (like merging two sorted decks of cards) until fully sorted.Quick SortComplexity O(nlogn): Selects a pivot and partitions the data into two halves—values greater than the pivot go to the right, and smaller values go to the left—until the left pointer exceeds the right pointer.Heap SortTreats the array as a binary tree for sorting, with the following rules:- Index 0 is the root of the tree.
- For any node N (except the root), the parent node is N/2.
- The left child of node L is 2 * L.
- The right child of node R is 2 * R + 1.
- For example, the array [3, 5, 1, 6, 4, 7, 2] can be visualized as the following tree:
graph TD; A((3))---B((5)) A---C((1)) B---D((6)) B---E((4)) C---F((7)) C---G((2))
- Counting Sort
- Bucket Sort
- Radix Sort (Distribution Sort)
Searching Algorithms
- Sequential Search
- Binary Search
JavaScript Implementations of Sorting/Searching Algorithms
Algorithm Patterns
Dynamic Programming
Knapsack Problem: Given a set of items, each with a value and weight, the goal is to determine the subset of items with the maximum total value. The constraint is that the total weight must not exceed the knapsack’s capacity.Longest Common Subsequence: Find the longest subsequence common to a set of sequences (obtained by deleting elements from another sequence without changing the order of the remaining elements).Matrix Chain Multiplication: Given a sequence of matrices, the goal is to find the most efficient way to multiply them (minimizing the number of operations). The multiplication is not performed; the solution lies in determining the optimal order of multiplication.Coin Change: Given coins of denominations d1, d2…dn and a target amount, determine the number of ways to make change for that amount.All-Pairs Shortest Path in Graphs: For all vertex pairs (u, v), find the shortest path from vertex u to vertex v (previously solvable by the Floyd-Warshall algorithm).
JavaScript Implementations of Algorithm Patterns
- Recursion
- Minimum Coin Change DP Algorithm
- Minimum Coin Change Greedy Algorithm
- Knapsack Problem DP Algorithm
- Knapsack Problem Recursive Algorithm
- Knapsack Problem Greedy Algorithm
- Longest Common Subsequence DP Algorithm
- Longest Common Subsequence Recursive Algorithm
- Matrix Chain Multiplication DP Algorithm
- Matrix Chain Multiplication Recursive Algorithm
Overview of NP-Complete Theory
Generally, if an algorithm’s complexity is O(nk), where k is a constant, the algorithm is considered efficient—this is a polynomial-time algorithm. For a given problem, if a polynomial-time algorithm exists, it is classified as P (polynomial).
There is another class called NP (nondeterministic polynomial) algorithms. If a problem’s solution can be verified in polynomial time, it is classified as NP.
If a problem has a polynomial-time algorithm, its solution can naturally be verified in polynomial time, so all P problems are NP. However, whether P = NP remains unknown.
The most challenging subset of NP problems is NP-complete problems, which satisfy two conditions:
- They are NP problems, meaning their solutions can be verified in polynomial time, but no polynomial-time algorithm has been found yet.
- All NP problems can be reduced to them in polynomial time.
Another class, NP-hard problems, only needs to satisfy the second condition of NP-complete problems. Thus, NP-complete problems are a subset of NP-hard problems.
Examples of NP-hard problems that are not NP-complete include the Halting Problem and the Boolean Satisfiability Problem (SAT). Examples of NP-complete problems include the Subset Sum Problem, Traveling Salesman Problem, and Vertex Cover Problem.
Finally, some problems are unsolvable, yet approximate solutions can still be found within acceptable time frames. Heuristic algorithms are one such approach. While they may not yield optimal solutions, they are often sufficient.
Examples of heuristic algorithms include Local Search, Genetic Algorithms, Heuristic Navigation, and Machine Learning. For more, see here.
-
Previous
Switching from Evernote to Apple's built-in Notes app -
Next
Linux File Permission Cheat Sheet
I often wish that when facing some key decisions in life, someone could tell me the best course of action so that I would not waste my precious time. Putting myself in others' shoes, I therefore write blogs often, hoping to record in this tiny corner of the vast Internet the once-in-a-lifetime experiences that matter to me, and to help those who seek help.