Commit a617e2f

Migrated Data-Structures-3 bundle
1 parent e0e2724 commit a617e2f

6 files changed: +183 -0 lines changed

docs/data-structures/img/mo.png

146 KB

docs/data-structures/img/trie.png

67.1 KB

docs/data-structures/index.md

Lines changed: 2 additions & 0 deletions
@@ -22,12 +22,14 @@ In computer science, data structures, on a given set of elements, effic
 ### [Deque](deque.md)
 ### [Fenwick Tree](fenwick-tree.md)
 ### [Segment Tree](segment-tree.md)
+### [Trie](trie.md)

 ## Static Data Structures

 ### [Prefix Sum](prefix-sum.md)
 ### [Sparse Table](sparse-table.md)
 ### [SQRT Decomposition](sqrt-decomposition.md)
+### [Mo's Algorithm](mo-algorithm.md)

 ## Common Problems

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
---
title: Mo's Algorithm
tags:
- Data Structures
- Mo's Algorithm
---

Mo's algorithm is a key technique for answering offline range queries on an array. By offline, we mean that all queries are known in advance, so we can answer them in any order we want, and that there are no updates between queries. Let's introduce a problem and construct an efficient solution for it.

You are given an array $a$ of $N$ elements whose values range from $1$ to $M$. You have to answer $Q$ queries, all of the same type: given a range $[l, r]$, print how many different values there are in the subarray $[a_l, a_{l+1}, \dots, a_{r-1}, a_r]$.

First let's find a naive solution and then improve it. Remember the frequency array we mentioned before. We keep a frequency array that contains only the values of the current subarray. The number of entries in this frequency array that are bigger than 0 is the answer for the current query. Then we have to update the frequency array for the next query by adding and removing elements one at a time. This takes $\mathcal{O}(N)$ time per query, so the total complexity is $\mathcal{O}(Q \times N)$. Look at the code below for the implementation.
```cpp
#include <algorithm>
#include <vector>
using namespace std;

class Query {
  public:
    int l, r, ind; // range [l, r] (0-indexed) and the query's original position
    Query(int l, int r, int ind) {
        this->l = l, this->r = r, this->ind = ind;
    }
};

// Remove a[ind] from the current range
void del(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 1) num--;
    F[a[ind]]--;
}

// Add a[ind] to the current range
void add(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 0) num++;
    F[a[ind]]++;
}

vector<int> solve(vector<int> &a, vector<Query> &q) {
    int Q = q.size();
    int M = *max_element(a.begin(), a.end());
    vector<int> F(M + 1, 0); // This is the frequency array we mentioned before
    vector<int> ans(Q, 0);
    int l = 0, r = -1, num = 0; // current range [l, r] starts empty
    for (int i = 0; i < Q; i++) {
        int nl = q[i].l, nr = q[i].r;
        // Move both ends of the current range to the queried range
        while (l < nl) del(l++, a, F, num);
        while (l > nl) add(--l, a, F, num);
        while (r > nr) del(r--, a, F, num);
        while (r < nr) add(++r, a, F, num);
        ans[q[i].ind] = num;
    }
    return ans;
}
```
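When experimenting with the routine above, a from-scratch reference can be handy for cross-checking its answers. The following is only a minimal sketch and not part of the original code: the name `countDistinctNaive` is hypothetical, and it assumes 0-indexed inclusive ranges and values between $1$ and $M$, as in the problem statement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the original code): counts the distinct
// values in a[l..r] (0-indexed, inclusive) by rebuilding a frequency array
// from scratch. Assumes every value lies in [1, M].
int countDistinctNaive(const std::vector<int> &a, int l, int r, int M) {
    assert(0 <= l && l <= r && r < (int)a.size());
    std::vector<int> freq(M + 1, 0);
    int distinct = 0;
    for (int i = l; i <= r; i++) {
        if (freq[a[i]] == 0) distinct++; // first occurrence inside the range
        freq[a[i]]++;
    }
    return distinct;
}
```

For `a = {1, 2, 1, 3, 2}` and `M = 3`, the call `countDistinctNaive(a, 1, 3, 3)` inspects `{2, 1, 3}` and returns `3`, while `countDistinctNaive(a, 0, 2, 3)` inspects `{1, 2, 1}` and returns `2`.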
Each query here may move the two ends of the range by up to $\mathcal{O}(N)$ positions, so the total complexity is $\mathcal{O}(Q \times N)$. Just by changing the order in which we answer the queries, we can reduce this complexity to $\mathcal{O}((Q + N) \times \sqrt N)$.

## Mo's Algorithm

We will change the order in which we answer the queries so that the overall complexity is reduced drastically. We use the following comparison function to sort the queries and answer them in this sorted order. The block size here is $\mathcal{O}(\sqrt N)$.
```cpp
bool operator<(const Query &other) const {
    return make_pair(l / block_size, r) <
           make_pair(other.l / block_size, other.r);
}
```
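To see what this ordering does to a concrete set of queries, here is a small stand-alone sketch. It is an illustration under assumptions rather than the original code: the function name `moOrder` is hypothetical, queries are plain `(l, r)` pairs, and the block size is passed as a parameter instead of being a global.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-alone version of the comparator: returns the (l, r)
// queries sorted by the block of l first and by r second.
std::vector<std::pair<int, int>> moOrder(std::vector<std::pair<int, int>> queries,
                                         int blockSize) {
    assert(blockSize > 0);
    std::sort(queries.begin(), queries.end(),
              [blockSize](const std::pair<int, int> &x,
                          const std::pair<int, int> &y) {
                  return std::make_pair(x.first / blockSize, x.second) <
                         std::make_pair(y.first / blockSize, y.second);
              });
    return queries;
}
```

With a block size of 3, the queries `(4,8), (0,5), (5,6), (1,2), (7,7)` have block numbers 1, 0, 1, 0, 2, so they are answered in the order `(1,2), (0,5), (5,6), (4,8), (7,7)`.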
Why does that work? Let's first examine what we do here and then find the complexity. We divide the $l$'s of the queries into blocks: the block number of a given $l$ is $\lfloor l / \text{block\_size} \rfloor$ (integer division). We sort the queries first by their block numbers and then, within the same block, by their $r$'s. Sorting all the queries takes $\mathcal{O}(Q \times \log Q)$ time.

Let's look at how many times we call the add and del operations to change the current $r$. Within the same block, $r$ never decreases, so it moves at most $\mathcal{O}(N)$ times per block. Since there are $N / \text{block\_size}$ blocks in total, this is $\mathcal{O}(N \times N / \text{block\_size})$ operations overall.

Within the same block, the add and del operations that change $l$ are called at most $\mathcal{O}(\text{block\_size})$ times per query, since two $l$'s with the same block number differ by less than $\text{block\_size}$. Over all queries this is $\mathcal{O}(Q \times \text{block\_size})$. When two consecutive queries have different block numbers we may perform up to $\mathcal{O}(N)$ operations, but there are at most $\mathcal{O}(N / \text{block\_size})$ such pairs of consecutive queries, so this does not change the overall time complexity.

If we pick $\text{block\_size} = \sqrt N$, the overall complexity is $\mathcal{O}((Q + N) \times \sqrt N)$. Full code is given below.
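Collecting the terms of the argument above, with $B$ as a shorthand for $\text{block\_size}$ (a symbol introduced only for this summary), the number of add and del calls can be sketched as:

$$
\underbrace{\mathcal{O}\left(\frac{N}{B} \times N\right)}_{\text{moves of } r}
+ \underbrace{\mathcal{O}(Q \times B)}_{\text{moves of } l \text{ within a block}}
+ \underbrace{\mathcal{O}\left(\frac{N}{B} \times N\right)}_{\text{block changes}}
= \mathcal{O}\left(\frac{N^2}{B} + Q \times B\right)
$$

Substituting $B = \sqrt N$ gives $N^2 / \sqrt N = N \times \sqrt N$ for the first term and $Q \times \sqrt N$ for the second, which adds up to $\mathcal{O}((Q + N) \times \sqrt N)$. The $\mathcal{O}(Q \times \log Q)$ sorting cost comes on top of this; leaving it out of the final bound is reasonable whenever $\log Q$ does not exceed $\sqrt N$.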
<figure markdown="span" style="width: 64%">
![Example for the Algorithm](img/mo.png)
<figcaption>Example for the Algorithm</figcaption>
</figure>
```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>
using namespace std;

int block_size;

class Query {
  public:
    int l, r, ind; // range [l, r] (0-indexed) and the query's original position
    Query(int l, int r, int ind) {
        this->l = l, this->r = r, this->ind = ind;
    }
    // Order by the block of l first, then by r
    bool operator<(const Query &other) const {
        return make_pair(l / block_size, r) <
               make_pair(other.l / block_size, other.r);
    }
};

// Remove a[ind] from the current range
void del(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 1) num--;
    F[a[ind]]--;
}

// Add a[ind] to the current range
void add(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 0) num++;
    F[a[ind]]++;
}

vector<int> solve(vector<int> &a, vector<Query> &q) {
    int Q = q.size(), N = a.size();
    int M = *max_element(a.begin(), a.end());
    block_size = max(1, (int)sqrt(N)); // at least 1 to avoid division by zero
    sort(q.begin(), q.end());
    vector<int> F(M + 1, 0); // This is the frequency array we mentioned before
    vector<int> ans(Q, 0);
    int l = 0, r = -1, num = 0; // current range [l, r] starts empty
    for (int i = 0; i < Q; i++) {
        int nl = q[i].l, nr = q[i].r;
        while (l < nl) del(l++, a, F, num);
        while (l > nl) add(--l, a, F, num);
        while (r > nr) del(r--, a, F, num);
        while (r < nr) add(++r, a, F, num);
        ans[q[i].ind] = num; // store the answer at the query's original position
    }
    return ans;
}
```
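The effect of the reordering can also be observed directly by counting how many single-step pointer moves (calls to add or del) a given query order needs. The sketch below is an illustration only: `pointerMoves` and `pointerMovesSorted` are hypothetical helpers, queries are plain `(l, r)` pairs, and the block size is passed explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical helper: number of add/del calls needed to visit the queries
// in the given order, starting from the empty range l = 0, r = -1.
long long pointerMoves(const std::vector<std::pair<int, int>> &queries) {
    long long moves = 0;
    int l = 0, r = -1;
    for (const auto &q : queries) {
        moves += std::abs(q.first - l) + std::abs(q.second - r);
        l = q.first;
        r = q.second;
    }
    return moves;
}

// Same count after sorting the queries by (block of l, r).
long long pointerMovesSorted(std::vector<std::pair<int, int>> queries,
                             int blockSize) {
    assert(blockSize > 0);
    std::sort(queries.begin(), queries.end(),
              [blockSize](const std::pair<int, int> &x,
                          const std::pair<int, int> &y) {
                  return std::make_pair(x.first / blockSize, x.second) <
                         std::make_pair(y.first / blockSize, y.second);
              });
    return pointerMoves(queries);
}
```

For the alternating queries `(0,15), (8,8), (0,15), (8,8)` on an array of 16 elements, the given order needs 61 moves, while the sorted order with a block size of 4 needs 31, because equal queries end up next to each other.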

docs/data-structures/trie.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
---
title: Trie
tags:
- Data Structures
- Trie
---

A Trie is an efficient information retrieval data structure (the name comes from re*trie*val). Using a Trie, search complexity can be brought down to the optimal limit, the key length. If we store the keys in a binary search tree, a well-balanced BST needs time proportional to $M \times \log N$, where $M$ is the maximum string length and $N$ is the number of keys in the tree. Using a Trie, we can search for a key in $\mathcal{O}(M)$ time. The penalty, however, is the Trie's storage requirements (please refer to [Applications of Trie](https://www.geeksforgeeks.org/advantages-trie-data-structure/) for more details).
<figure markdown="span" style="width: 36%">
![Trie Structure https://www.geeksforgeeks.org/wp-content/uploads/Trie.png](img/trie.png)
<figcaption>Trie Structure. Source: https://www.geeksforgeeks.org/wp-content/uploads/Trie.png</figcaption>
</figure>

Every node of a Trie has multiple branches, and each branch represents a possible character of the keys. We need to mark the last node of every key as an end-of-word node; the node field `isEndOfWord` is used for this purpose. A simple structure to represent the nodes for the English alphabet can be as follows:
```cpp
const int ALPHABET_SIZE = 26; // assumed: lowercase English letters 'a'..'z'

// Trie node
class TrieNode {
  public:
    TrieNode *children[ALPHABET_SIZE];
    bool isEndOfWord; // true if this node is the last node of a key
    TrieNode() {
        isEndOfWord = false;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            children[i] = nullptr;
    }
};
```
## Insertion

Inserting a key into a Trie is simple. Every character of the input key is inserted as an individual Trie node. Note that `children` is an array of pointers (or references) to the next-level Trie nodes, and the key character acts as an index into this array. If the input key is new or an extension of an existing key, we need to construct the missing nodes of the key and mark the last node as end of word. If the input key is a prefix of an existing key in the Trie, we simply mark the last node of the key as end of word. The key length determines the Trie depth.
```cpp
#include <string>
using namespace std;

void insert(TrieNode *root, const string &key) {
    TrieNode *pCrawl = root;
    for (size_t i = 0; i < key.length(); i++) {
        int index = key[i] - 'a'; // the character selects the child slot
        if (!pCrawl->children[index])
            pCrawl->children[index] = new TrieNode; // create the missing node
        pCrawl = pCrawl->children[index];
    }
    pCrawl->isEndOfWord = true; // mark the last node of the key
}
```
## Search

Searching for a key is similar to the insert operation; however, we only compare the characters and move down. The search can terminate either because we reach the end of the key or because a required child node is missing. In the former case, the key exists in the Trie only if the `isEndOfWord` field of the last node is true. In the latter case, the search stops without examining all the characters of the key, since the key is not present in the Trie.
```cpp
bool search(TrieNode *root, const string &key) {
    TrieNode *pCrawl = root;
    for (size_t i = 0; i < key.length(); i++) {
        int index = key[i] - 'a';
        if (!pCrawl->children[index])
            return false; // the path breaks, so the key is not present
        pCrawl = pCrawl->children[index];
    }
    return (pCrawl != nullptr && pCrawl->isEndOfWord);
}
```
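The node, insert, and search pieces above can be combined into one compact, self-contained sketch. This is an illustration under assumptions, not the original code: `SimpleTrie` is a hypothetical name, keys are assumed to contain only `'a'..'z'`, and `std::unique_ptr` is swapped in for raw pointers so that the nodes are freed automatically.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <string>

class SimpleTrie {
    struct Node {
        std::array<std::unique_ptr<Node>, 26> children; // all null initially
        bool isEndOfWord = false;
    };
    Node root;

  public:
    void insert(const std::string &key) {
        Node *cur = &root;
        for (char c : key) {
            assert('a' <= c && c <= 'z'); // assumption: lowercase keys only
            std::unique_ptr<Node> &child = cur->children[c - 'a'];
            if (!child) child.reset(new Node()); // create the missing node
            cur = child.get();
        }
        cur->isEndOfWord = true;
    }

    bool search(const std::string &key) const {
        const Node *cur = &root;
        for (char c : key) {
            if (c < 'a' || c > 'z') return false;
            const std::unique_ptr<Node> &child = cur->children[c - 'a'];
            if (!child) return false; // the path breaks: key is absent
            cur = child.get();
        }
        return cur->isEndOfWord; // a mere prefix does not count as a key
    }
};
```

After inserting `"the"` and `"there"`, searching for `"the"` and `"there"` succeeds, while `"th"` (only a prefix) and `"these"` (the path breaks) are reported as absent. Inserting `"th"` afterwards only flips the flag on an existing node.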
Insert and search cost $\mathcal{O}(\text{key\_length})$. However, the memory requirements of a Trie are high: $\mathcal{O}(\text{ALPHABET\_SIZE} \times \text{key\_length} \times N)$, where $N$ is the number of keys in the Trie. There are more efficient representations of trie nodes (e.g. compressed trie, ternary search tree, etc.) that reduce the memory requirements.
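To get a feel for the memory bound, one can count how many nodes and child slots a Trie actually allocates. The sketch below is an illustration only: `CountingTrie` is a hypothetical name, and it stores nodes in a vector with integer child indices (an assumption made here so that the node count is simply the vector size).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

const int ALPHABET_SIZE = 26; // assumed: lowercase English letters

struct CountingTrie {
    struct Node {
        int children[ALPHABET_SIZE]; // index of the child node, -1 if none
        bool isEndOfWord;
    };
    std::vector<Node> nodes;

    CountingTrie() { nodes.push_back(makeNode()); } // node 0 is the root

    static Node makeNode() {
        Node n;
        for (int i = 0; i < ALPHABET_SIZE; i++) n.children[i] = -1;
        n.isEndOfWord = false;
        return n;
    }

    void insert(const std::string &key) {
        int cur = 0;
        for (char c : key) {
            assert('a' <= c && c <= 'z');
            int idx = c - 'a';
            if (nodes[cur].children[idx] == -1) {
                // Record the new node's index before the vector grows.
                nodes[cur].children[idx] = (int)nodes.size();
                nodes.push_back(makeNode());
            }
            cur = nodes[cur].children[idx];
        }
        nodes[cur].isEndOfWord = true;
    }

    // Total number of child slots allocated across all nodes.
    std::size_t childSlots() const { return nodes.size() * ALPHABET_SIZE; }
};
```

Inserting `"the"`, `"there"`, and `"their"` creates 8 nodes (the root, `t-h-e`, then `r-e` and `i-r`), that is 208 child slots, most of them empty. The shared prefix keeps this below the worst-case bound, which is reached only when keys share no prefixes.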

docs/graph/index.md

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ title: Graph
 ### [Depth First Search](depth-first-search.md)
 ### [Breadth First Search](breadth-first-search.md)
 ### [Cycle Finding](cycle-finding.md)
+### [Bipartite Checking](bipartite-checking.md)
 ### [Union Find](union-find.md)
 ### [Shortest Path](shortest-path.md)
 ### [Minimum Spanning Tree](minimum-spanning-tree.md)
