Migrated Data-Structures-3 bundle

muratbiberoglu · muratbiberoglu · commit a617e2f797e8 · 2025-03-07T15:36:29.000+03:00
diff --git a/docs/data-structures/img/mo.png b/docs/data-structures/img/mo.png
diff --git a/docs/data-structures/img/trie.png b/docs/data-structures/img/trie.png
diff --git a/docs/data-structures/index.md b/docs/data-structures/index.md
@@ -22,12 +22,14 @@ Bilgisayar biliminde veri yapıları, belirli bir eleman kümesi üzerinde verim
 ### [Deque](deque.md)
 ### [Fenwick Tree](fenwick-tree.md)
 ### [Segment Tree](segment-tree.md)
+### [Trie](trie.md)
 
 ## Statik Veri Yapıları
 
 ### [Prefix Sum](prefix-sum.md)
 ### [Sparse Table](sparse-table.md)
 ### [SQRT Decomposition](sqrt-decomposition.md)
+### [Mo's Algorithm](mo-algorithm.md)
 
 ## Common Problems
 
diff --git a/docs/data-structures/mo-algorithm.md b/docs/data-structures/mo-algorithm.md
@@ -0,0 +1,114 @@
+---
+title: Mo's Algorithm
+tags:
+    - Data Structures
+    - Mo's Algorithm
+---
+
+Mo's algorithm is a technique for answering offline range queries on an array. Offline means that all queries are known in advance, so we can answer them in any order we want, and that there are no updates between queries. Let’s introduce a problem and construct an efficient solution for it.
+
+You have an array $a$ with $N$ elements whose values range from $1$ to $M$, and you have to answer $Q$ queries, all of the same type. Each query gives a range $[l, r]$, and you have to print how many distinct values there are in the subarray $[a_l, a_{l+1}, \dots, a_{r-1}, a_r]$.
+
+First, let’s find a naive solution and then improve it. Remember the frequency array we mentioned before. We keep a frequency array that counts only the values of the current subarray; the number of entries in it that are greater than $0$ is the answer to the current query. To move on to the next query, we update the frequency array by adding and removing elements at both ends of the range. This can take $\mathcal{O}(N)$ time per query, so the total complexity is $\mathcal{O}(Q \times N)$. Look at the code below for the implementation.
+
+```cpp
+#include <algorithm>
+#include <vector>
+using namespace std;
+
+class Query {
+   public:
+    int l, r, ind;
+    Query(int l, int r, int ind) {
+        this->l = l, this->r = r, this->ind = ind;
+    }
+};
+
+void del(int ind, vector<int> &a, vector<int> &F, int &num) {
+    if (F[a[ind]] == 1) num--;
+    F[a[ind]]--;
+}
+
+void add(int ind, vector<int> &a, vector<int> &F, int &num) {
+    if (F[a[ind]] == 0) num++;
+    F[a[ind]]++;
+}
+
+vector<int> solve(vector<int> &a, vector<Query> &q) {
+    int Q = q.size();
+    int M = *max_element(a.begin(), a.end());
+    vector<int> F(M + 1, 0);  // This is frequency array we mentioned before
+    vector<int> ans(Q, 0);
+    int l = 0, r = -1, num = 0;
+    for (int i = 0; i < Q; i++) {
+        int nl = q[i].l, nr = q[i].r;
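+        // Extend the window first, then shrink it, so that the counts in F
+        // never drop below zero while the pointers are moving.
+        while (l > nl) add(--l, a, F, num);
+        while (r < nr) add(++r, a, F, num);
+        while (l < nl) del(l++, a, F, num);
+        while (r > nr) del(r--, a, F, num);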
+        ans[q[i].ind] = num;
+    }
+    return ans;
+}
+```
+
+Each query can take $\mathcal{O}(N)$ time here, so the total complexity is $\mathcal{O}(Q \times N)$. Just by changing the order in which the queries are answered, we can reduce this to $\mathcal{O}((Q + N) \times \sqrt N)$.
+
+## Mo's Algorithm
+
+We will change the order in which we answer the queries so that the overall complexity drops drastically. We sort the queries with the following comparison operator and answer them in this sorted order. The block size here is $\mathcal{O}(\sqrt N)$.
+
+```cpp
+bool operator<(const Query &other) const {
+    return make_pair(l / block_size, r) <
+           make_pair(other.l / block_size, other.r);
+}
+```
+
+Why does that work? Let’s first examine what we do here and then find the complexity. We divide the $l$'s of the queries into blocks: the block number of a given $l$ is $\lfloor l / \text{block\_size} \rfloor$ (integer division). We sort the queries first by block number and, within the same block, by their $r$'s. Sorting all queries takes $\mathcal{O}(Q \times \log Q)$ time. Now let’s count how many times add and del are called to move the current $r$. Within one block the $r$'s never decrease, so $r$ moves $\mathcal{O}(N)$ times per block. Since there are $N / \text{block\_size}$ blocks in total, this is $\mathcal{O}(N \times N / \text{block\_size})$ operations overall. Within one block, add and del are called at most $\mathcal{O}(\text{block\_size})$ times per query to move $l$, because two $l$'s with the same block number differ by at most $\mathcal{O}(\text{block\_size})$. Over all queries this is $\mathcal{O}(Q \times \text{block\_size})$. When two consecutive queries have different block numbers, we may perform up to $\mathcal{O}(N)$ operations, but this happens at most $\mathcal{O}(N / \text{block\_size})$ times, so it doesn't change the overall time complexity. If we pick $\text{block\_size} = \sqrt N$, the overall complexity is $\mathcal{O}((Q + N) \times \sqrt N)$. Full code is given below.
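+
+As a sketch of where the final bound comes from, the counts above can be collected into one expression. Here $B$ is shorthand for $\text{block\_size}$ (a symbol introduced only for this derivation), and the figures for $N = Q = 10^5$ are an illustrative assumption rather than a measured result:
+
+$$
+T(B) = \underbrace{\mathcal{O}(Q \times \log Q)}_{\text{sorting}} + \underbrace{\mathcal{O}(Q \times B)}_{\text{moving } l} + \underbrace{\mathcal{O}\left(N \times \frac{N}{B}\right)}_{\text{moving } r}
+$$
+
+$$
+B = \sqrt N \implies Q \times \sqrt N + N \times \frac{N}{\sqrt N} = (Q + N) \times \sqrt N
+$$
+
+The sorting term is no larger than the others whenever $\log Q \le \sqrt N$. For $N = Q = 10^5$ the bound is roughly $2 \times 10^5 \times 316 \approx 6.3 \times 10^7$ pointer moves, compared with about $Q \times N = 10^{10}$ steps for the naive order.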
+
+<figure markdown="span" style="width: 64%">
+![Example for the Algorithm](img/mo.png)
+<figcaption>Example for the Algorithm</figcaption>
+</figure>
+
+```cpp
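+#include <algorithm>
+#include <cmath>
+#include <utility>
+#include <vector>
+using namespace std;
+
+int block_size;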
+
+class Query {
+   public:
+    int l, r, ind;
+    Query(int l, int r, int ind) {
+        this->l = l, this->r = r, this->ind = ind;
+    }
+    bool operator<(const Query &other) const {
+        return make_pair(l / block_size, r) <
+               make_pair(other.l / block_size, other.r);
+    }
+};
+
+void del(int ind, vector<int> &a, vector<int> &F, int &num) {
+    if (F[a[ind]] == 1) num--;
+    F[a[ind]]--;
+}
+
+void add(int ind, vector<int> &a, vector<int> &F, int &num) {
+    if (F[a[ind]] == 0) num++;
+    F[a[ind]]++;
+}
+
+vector<int> solve(vector<int> &a, vector<Query> &q) {
+    int Q = q.size(), N = a.size();
+    int M = *max_element(a.begin(), a.end());
+    block_size = max(1, (int)sqrt(N));  // at least 1, so l / block_size never divides by zero
+    sort(q.begin(), q.end());
+    vector<int> F(M + 1, 0);  // This is frequency array we mentioned before
+    vector<int> ans(Q, 0);
+    int l = 0, r = -1, num = 0;
+    for (int i = 0; i < Q; i++) {
+        int nl = q[i].l, nr = q[i].r;
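+        // Extend the window first, then shrink it, so that the counts in F
+        // never drop below zero while the pointers are moving.
+        while (l > nl) add(--l, a, F, num);
+        while (r < nr) add(++r, a, F, num);
+        while (l < nl) del(l++, a, F, num);
+        while (r > nr) del(r--, a, F, num);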
+        ans[q[i].ind] = num;
+    }
+    return ans;
+}
+```
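+
+To sanity-check the answers produced by `solve` on small inputs, a brute-force reference can help. The following is a minimal sketch, not part of the solution above: the helper name `distinctBruteForce` and the use of plain `(l, r)` pairs are assumptions made for illustration, and the ranges are taken to be 0-indexed and inclusive, matching how the code above indexes `a`.
+
+```cpp
+#include <set>
+#include <utility>
+#include <vector>
+
+// Hypothetical helper (not part of the original code): answers every
+// inclusive, 0-indexed range [l, r] independently of the others.
+std::vector<int> distinctBruteForce(
+    const std::vector<int> &a,
+    const std::vector<std::pair<int, int>> &ranges) {
+    std::vector<int> ans;
+    ans.reserve(ranges.size());
+    for (const auto &range : ranges) {
+        int l = range.first, r = range.second;
+        // A set keeps one copy of each value, so its size is the
+        // number of distinct values in a[l..r].
+        std::set<int> seen(a.begin() + l, a.begin() + r + 1);
+        ans.push_back(static_cast<int>(seen.size()));
+    }
+    return ans;
+}
+```
+
+Each range costs $\mathcal{O}(N \times \log N)$ here, so this is only suitable for comparing against `solve` on small, randomly generated tests.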
diff --git a/docs/data-structures/trie.md b/docs/data-structures/trie.md
@@ -0,0 +1,66 @@
+---
+title: Trie
+tags:
+    - Data Structures
+    - Trie
+---
+
+Trie is an efficient information retrieval data structure (the name comes from re**trie**val). Using a Trie, search complexity can be brought down to the optimal limit, the key length. If we store keys in a binary search tree, a well-balanced BST needs time proportional to $M \times \log N$, where $M$ is the maximum string length and $N$ is the number of keys in the tree. Using a Trie, we can search for a key in $\mathcal{O}(M)$ time. The penalty, however, is in the Trie's storage requirements (see [Applications of Trie](https://www.geeksforgeeks.org/advantages-trie-data-structure/) for more details).
+
+
+<figure markdown="span" style="width: 36%">
+![Trie Structure https://www.geeksforgeeks.org/wp-content/uploads/Trie.png](img/trie.png)
+<figcaption>Trie Structure. https://www.geeksforgeeks.org/wp-content/uploads/Trie.png</figcaption>
+</figure>
+
+Every node of a Trie has multiple branches, and each branch represents a possible character of the keys. We need to mark the last node of every key as an end-of-word node; the node field `isEndOfWord` is used for this. A simple structure to represent nodes for the English alphabet can be as follows:
+
+```cpp
+// Trie node
+const int ALPHABET_SIZE = 26;  // lowercase English letters 'a'..'z'
+
+class TrieNode {
+   public:
+    TrieNode *children[ALPHABET_SIZE];
+    bool isEndOfWord;
+    TrieNode() {
+        isEndOfWord = false;
+        for (int i = 0; i < ALPHABET_SIZE; i++)
+            children[i] = NULL;
+    }
+};
+```
+
+## Insertion
+
+Inserting a key into a Trie is simple. Every character of the input key is inserted as an individual Trie node. Note that `children` is an array of pointers (or references) to next-level Trie nodes, and the key character acts as an index into this array. If the input key is new or an extension of an existing key, we need to construct the missing nodes of the key and mark the last node as end of word. If the input key is a prefix of an existing key in the Trie, we simply mark the last node of the key as end of word. The key length determines the Trie depth.
+
+```cpp
+void insert(TrieNode *root, const string &key) {
+    TrieNode *pCrawl = root;
+    for (size_t i = 0; i < key.length(); i++) {
+        int index = key[i] - 'a';
+        if (!pCrawl->children[index])
+            pCrawl->children[index] = new TrieNode;
+        pCrawl = pCrawl->children[index];
+    }
+    pCrawl->isEndOfWord = true;
+}
+```
+
+## Search
+
+Searching for a key is similar to the insert operation, except that we only compare characters and move down. The search can terminate either because the key string ends or because the Trie lacks a matching node. In the former case, the key exists in the Trie if the `isEndOfWord` field of the last node is true. In the latter case, the search terminates without examining all the characters of the key, since the key is not present in the Trie.
+
+```cpp
+bool search(TrieNode *root, const string &key) {
+    TrieNode *pCrawl = root;
+    for (size_t i = 0; i < key.length(); i++) {
+        int index = key[i] - 'a';
+        if (!pCrawl->children[index])
+            return false;
+        pCrawl = pCrawl->children[index];
+    }
+    return (pCrawl != NULL && pCrawl->isEndOfWord);
+}
+```
+
+Insert and search cost $\mathcal{O}(\text{key\_length})$. However, the memory requirements of a Trie are high: $\mathcal{O}(\text{ALPHABET\_SIZE} \times \text{key\_length} \times N)$, where $N$ is the number of keys in the Trie. There are more compact representations of trie nodes (e.g. compressed trie, ternary search tree, etc.) that reduce the memory requirements.
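+
+The behaviour described above (a prefix of a stored key is found only if it was inserted itself, and keys with a common prefix share nodes) can be checked with a small self-contained sketch. It restates the node, `insert` and `search` from this page in compact form; `countNodes` is a hypothetical helper added only for illustration, and keys are assumed to consist of the lowercase letters 'a' to 'z'. Nodes are never freed here, to keep the sketch short.
+
+```cpp
+#include <string>
+
+const int ALPHABET_SIZE = 26;  // assumption: keys use only 'a'..'z'
+
+struct TrieNode {
+    TrieNode *children[ALPHABET_SIZE] = {};  // all pointers start as null
+    bool isEndOfWord = false;
+};
+
+void insert(TrieNode *root, const std::string &key) {
+    TrieNode *pCrawl = root;
+    for (char c : key) {
+        int index = c - 'a';
+        // Create the child only if this prefix has not been seen yet.
+        if (!pCrawl->children[index]) pCrawl->children[index] = new TrieNode;
+        pCrawl = pCrawl->children[index];
+    }
+    pCrawl->isEndOfWord = true;
+}
+
+bool search(const TrieNode *root, const std::string &key) {
+    const TrieNode *pCrawl = root;
+    for (char c : key) {
+        int index = c - 'a';
+        if (!pCrawl->children[index]) return false;  // path is missing
+        pCrawl = pCrawl->children[index];
+    }
+    return pCrawl->isEndOfWord;  // true only for fully inserted keys
+}
+
+// Hypothetical helper: counts the nodes reachable from `node`,
+// including `node` itself.
+int countNodes(const TrieNode *node) {
+    if (!node) return 0;
+    int total = 1;
+    for (const TrieNode *child : node->children) total += countNodes(child);
+    return total;
+}
+```
+
+After inserting "the", "there" and "their" into an empty root, `search` succeeds for "the" but not for "th", and `countNodes` reports 8 nodes (the root plus `t`, `h`, `e`, `r`, `e`, `i`, `r`), fewer than the 13 characters inserted.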
diff --git a/docs/graph/index.md b/docs/graph/index.md
@@ -15,6 +15,7 @@ title: Graph
 ### [Depth First Search](depth-first-search.md)
 ### [Breadth First Search](breadth-first-search.md)
 ### [Cycle Finding](cycle-finding.md)
+### [Bipartite Checking](bipartite-checking.md)
 ### [Union Find](union-find.md)
 ### [Shortest Path](shortest-path.md)
 ### [Minimum Spanning Tree](minimum-spanning-tree.md)