diff --git a/docs/dynamic-programming/bitmask_dp.md b/docs/dynamic-programming/bitmask_dp.md
new file mode 100644
index 0000000..9a0bae1
--- /dev/null
+++ b/docs/dynamic-programming/bitmask_dp.md
@@ -0,0 +1,84 @@
# Bitmask DP

## What is a Bitmask?

Let’s say that we have a set of objects. How can we represent a subset of this set? One way is to use a map, mapping each object to a Boolean value that indicates whether the object is picked. Another way, if the objects can be indexed by integers, is to use a Boolean array. However, both can be slow due to the overhead of the map and array structures. If the size of the set is not too large (less than 64), a bitmask is much more useful and convenient.

An integer is a sequence of bits. Thus, we can use an integer to represent a small set of Boolean values, and we can perform all the set operations using bit operations. These bit operations are faster than map and array operations, and the time difference can be significant in some problems.

In a bitmask, the \( i \)-th bit from the right represents the \( i \)-th object. For example, for \( A = \{1, 2, 3, 4, 5\} \), we can represent \( B = \{1, 2, 4\} \) with the bitmask 11 (binary 01011).

---

## Bitmask Operations

- **Add the \( i \)-th object to the subset:**
  Set the \( i \)-th bit to 1:
  `mask = mask | (1 << i)`

- **Remove the \( i \)-th object from the subset:**
  Set the \( i \)-th bit to 0:
  `mask = mask & ~(1 << i)`

- **Check whether the \( i \)-th object is in the subset:**
  Check if the \( i \)-th bit is set:
  `mask & (1 << i)`.
  If the expression is nonzero (it evaluates to \( 2^i \), not 1), the \( i \)-th object is in the subset; if it is equal to 0, the \( i \)-th object is not in the subset.

- **Toggle the existence of the \( i \)-th object:**
  XOR the \( i \)-th bit with 1, turning 1 into 0 and 0 into 1:
  `mask = mask ^ (1 << i)`

- **Count the number of objects in the subset:**
  Use a built-in function to count the number of 1’s in an integer variable:
  `__builtin_popcount(mask)` for `int`s or `__builtin_popcountll(mask)` for `long long`s.

---

## Iterating Over Subsets

- **Iterate through all subsets of a set with size \( n \):**
  `for (int x = 0; x < (1 << n); ++x)`

- **Iterate through all non-empty subsets of the subset given by the mask \( y \):**
  `for (int x = y; x > 0; x = (y & (x - 1)))`

---

## Task Assignment Problem

There are \( N \) people and \( N \) tasks, and each task is going to be allocated to a single person. We are also given a matrix `cost` of size \( N \times N \), where `cost[i][j]` denotes how much person \( i \) charges for task \( j \). We need to assign each task to a person such that the total cost is minimized. Note that each task is allocated to exactly one person, and each person is allocated exactly one task.

### Naive Approach:

Try all \( N! \) possible assignments.
**Time complexity:** \( O(N!) \).

### DP Approach:

Let `dp[mask]` be the minimum cost of assigning the tasks in `mask` to the first `popcount(mask)` people. Starting from `dp[0] = 0`, for every subset we try giving each still-unassigned task to the next person and update the DP array. Here, we use bitmasks to represent the subsets of tasks and iterate over them in increasing order.
**Time complexity:** \( O(2^N \times N) \).

**Note:** The [Hungarian Algorithm](https://en.wikipedia.org/wiki/Hungarian_algorithm) solves this problem in \( O(N^3) \) time complexity.
Solution code for the DP approach:

```cpp
// dp[mask]: minimum cost of assigning the tasks in mask
// to the first __builtin_popcount(mask) people.
// Initialize dp[0] = 0 and every other entry to a large value.
for (int mask = 0; mask < (1 << n); ++mask)
{
    for (int j = 0; j < n; ++j)
    {
        if ((mask & (1 << j)) == 0) // j-th task not assigned yet
        {
            dp[mask | (1 << j)] = min(dp[mask | (1 << j)], dp[mask] + cost[__builtin_popcount(mask)][j]);
        }
    }
}
// after this loop, the answer is stored in dp[(1 << n) - 1]
```

---

## References

- [Bitmask Tutorial on HackerEarth](https://www.hackerearth.com/practice/algorithms/dynamic-programming/bit-masking/tutorial/)
\ No newline at end of file
diff --git a/docs/dynamic-programming/common_dp_problems.md b/docs/dynamic-programming/common_dp_problems.md
new file mode 100644
index 0000000..f0e46aa
--- /dev/null
+++ b/docs/dynamic-programming/common_dp_problems.md
@@ -0,0 +1,231 @@
# Common Dynamic Programming Problems

## Coin Problem

As discussed earlier, the greedy approach doesn’t always work for the coin problem. For example, if the coins are \{4, 3, 1\} and the target sum is \(6\), the greedy algorithm produces the solution \(4+1+1\), while the optimal solution is \(3+3\). This is where dynamic programming (DP) helps.

### Solution

#### Approach:

1. If the target value \( V = 0 \), then 0 coins are required.
2. If \( V > 0 \), compute \( \text{minCoins}(coins[0..m-1], V) = \min \{ 1 + \text{minCoins}(V - \text{coin}[i]) \} \) over all \( i \) with \( \text{coin}[i] \leq V \).

```python
import sys

def minCoins(coins, target):
    # base case
    if target == 0:
        return 0

    n = len(coins)
    # initialize the result
    res = sys.maxsize

    # try every coin that has a smaller value than the target
    for i in range(n):
        if coins[i] <= target:
            sub_res = minCoins(coins, target - coins[i])

            # check for sys.maxsize to avoid overflow and see if
            # the result can be minimized
            if sub_res != sys.maxsize and sub_res + 1 < res:
                res = sub_res + 1

    return res
```

## Knapsack Problem

We are given the weights and values of \( n \) items, and we are to put these items in a knapsack of capacity \( W \) to get the maximum total value. In other words, we are given two integer arrays `val[0..n-1]` and `wt[0..n-1]`, which represent the values and weights associated with \( n \) items. We are also given an integer \( W \), which represents the knapsack's capacity. Our goal is to find a maximum-value subset of `val[]` such that the sum of the weights of this subset is smaller than or equal to \( W \). We cannot break an item; we must either pick the complete item or leave it.

#### Approach:

There are two cases for every item:
1. The item is included in the optimal subset.
2. The item is not included in the optimal subset.

The maximum value that can be obtained from \( n \) items is the maximum of the following two values:
1. The maximum value obtained from \( n-1 \) items and weight \( W \) (excluding the \( n \)-th item).
2. The value of the \( n \)-th item plus the maximum value obtained from \( n-1 \) items and weight \( W - \text{weight of the } n \)-th item (including the \( n \)-th item).

If the weight of the \( n \)-th item is greater than \( W \), then the \( n \)-th item cannot be included, and case 1 is the only possibility.

For example:

- Knapsack max weight: \( W = 8 \) units
- Weights of items: \( \text{wt} = \{3, 1, 4, 5\} \)
- Values of items: \( \text{val} = \{10, 40, 30, 50\} \)
- Total items: \( n = 4 \)

The weight sum \( 8 \) is possible with two combinations: \{3, 5\} with a total value of 60, and \{1, 3, 4\} with a total value of 80.
However, a better solution is \{1, 5\}, which has a total weight of 6 and a total value of 90.

### Recursive Solution

```python
def knapSack(W, wt, val, n):

    # base case
    if n == 0 or W == 0:
        return 0

    # If the weight of the nth item is more than the knapsack
    # capacity W, this item cannot be included in the optimal solution
    if wt[n - 1] > W:
        return knapSack(W, wt, val, n - 1)

    # return the maximum of two cases:
    # (1) nth item included
    # (2) nth item not included
    else:
        return max(val[n - 1] + knapSack(W - wt[n - 1], wt, val, n - 1), knapSack(W, wt, val, n - 1))
```

### Dynamic Programming Solution

It should be noted that the function above computes the same subproblems again and again; the time complexity of this naive recursive solution is exponential, \( O(2^n) \).
Since subproblems are evaluated repeatedly, this problem has the overlapping-subproblems property. As in other typical dynamic programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array \(K[][]\) in a bottom-up manner. Following is a dynamic programming based implementation.

```python
def knapSack(W, wt, val, n):
    K = [[0 for x in range(W + 1)] for x in range(n + 1)]

    # Build table K[][] in bottom-up manner
    for i in range(n + 1):
        for w in range(W + 1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif wt[i - 1] <= w:
                K[i][w] = max(val[i - 1] + K[i - 1][w - wt[i - 1]], K[i - 1][w])
            else:
                K[i][w] = K[i - 1][w]

    return K[n][W]
```

## Longest Common Substring (LCS) Problem

We are given two strings \( X \) and \( Y \), and our task is to find the length of their longest common substring.

### Sample Case:

- Input: \( X = \text{"inzvahackerspace"} \), \( Y = \text{"spoilerspoiler"} \)
- Output: 4

The longest common substring is "ersp", of length 4.

#### Approach:

Let \( m \) and \( n \) be the lengths of the first and second strings, respectively. A simple solution is to consider all substrings of the first string one by one and check whether each is a substring of the second string, keeping track of the maximum-length match. There are \( O(m^2) \) substrings, and checking one against the other string takes \( O(n) \) time, so the overall time complexity is \( O(n \cdot m^2) \).

Dynamic programming can reduce this to \( O(m \cdot n) \). The idea is to find the length of the longest common suffix for all prefix pairs of the two strings and store these lengths in a table. The longest common suffix has the following property:

\[
LCSuff(X, Y, m, n) = LCSuff(X, Y, m-1, n-1) + 1 \text{ if } X[m-1] = Y[n-1]
\]
Otherwise, \( LCSuff(X, Y, m, n) = 0 \).

The maximum of these longest-common-suffix lengths is the length of the longest common substring.

### DP - Iterative

```python
def LCSubStr(X, Y):
    m = len(X)
    n = len(Y)

    # Create a table to store lengths of
    # longest common suffixes of substrings.
    # Note that LCSuff[i][j] contains the
    # length of the longest common suffix of
    # X[0...i-1] and Y[0...j-1]. The first
    # row and first column entries have no
    # logical meaning; they are used only
    # for simplicity of the program.

    # LCSuff is the table, with zero
    # value initially in each cell
    LCSuff = [[0 for k in range(n + 1)] for l in range(m + 1)]

    # To store the length of the
    # longest common substring
    result = 0

    # Build LCSuff[m+1][n+1] in
    # bottom-up fashion
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 or j == 0:
                LCSuff[i][j] = 0
            elif X[i - 1] == Y[j - 1]:
                LCSuff[i][j] = LCSuff[i - 1][j - 1] + 1
                result = max(result, LCSuff[i][j])
            else:
                LCSuff[i][j] = 0
    return result
```

### DP - Recursive

```python
# X and Y are assumed to be global strings here;
# the initial call is lcs(len(X), len(Y), 0)
def lcs(i, j, count):
    if i == 0 or j == 0:
        return count

    if X[i - 1] == Y[j - 1]:
        count = lcs(i - 1, j - 1, count + 1)

    count = max(count, lcs(i, j - 1, 0), lcs(i - 1, j, 0))
    return count
```

## Longest Increasing Subsequence (LIS) Problem

The Longest Increasing Subsequence (LIS) problem is to find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

For example, given the array \([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]\), the longest increasing subsequence has a length of 6; one such subsequence is \{0, 2, 6, 9, 11, 15\}.

### Solution

A naive, brute-force approach is to generate every possible subsequence, check for monotonicity, and keep track of the longest one. However, this is prohibitively expensive, as there are \( O(2^N) \) subsequences to examine.

Instead, we can use recursion to solve this problem and then optimize it with dynamic programming. We assume that we have a function that gives us the length of the longest increasing subsequence ending at a given index.

The base cases are:
- The empty list, which returns 0.
- A list with one element, which returns 1.

For every index \( i \), calculate the longest increasing subsequence ending at that index. The result can only be extended with the array's last element if that element is greater than the element at index \( i \), as otherwise the sequence wouldn’t be increasing.

```python
# returns the length of the longest increasing subsequence
# that ends at the last element of arr
def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    if len(arr) == 1:
        return 1

    # the last element alone is an increasing subsequence of length 1
    max_ending_here = 1
    for i in range(1, len(arr)):
        # length of the LIS ending at arr[i - 1]
        ending_at_i = longest_increasing_subsequence(arr[:i])
        if arr[-1] > arr[i - 1] and ending_at_i + 1 > max_ending_here:
            max_ending_here = ending_at_i + 1
    return max_ending_here
```

This is really slow due to repeated subcomputations (exponential in time). So, let’s use dynamic programming to store values so that we don’t have to recompute them later.

We’ll keep an array \(A\) of length \(N\), where \(A[i]\) contains the length of the longest increasing subsequence ending at index \(i\). We can then use the same recurrence but look the values up in the array instead:

```python
def longest_increasing_subsequence(arr):
    if not arr:
        return 0
    cache = [1] * len(arr)
    for i in range(1, len(arr)):
        for j in range(i):
            if arr[i] > arr[j]:
                cache[i] = max(cache[i], cache[j] + 1)
    return max(cache)
```

This now runs in \( O(N^2) \) time and \( O(N) \) space.
\ No newline at end of file
diff --git a/docs/dynamic-programming/digit_dp.md b/docs/dynamic-programming/digit_dp.md
new file mode 100644
index 0000000..e5e99a5
--- /dev/null
+++ b/docs/dynamic-programming/digit_dp.md
@@ -0,0 +1,104 @@
# Digit DP

Problems that require counting how many numbers between two values (say, \( A \) and \( B \)) satisfy a particular property can be solved using digit dynamic programming (Digit DP).
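A useful observation, and the one the solution below relies on, is that a range query can be reduced to two prefix queries: if \( f(X) \) denotes the count of qualifying numbers in \([0, X]\), then

\[
\text{count}(A, B) = f(B) - f(A - 1)
\]

so it suffices to be able to count the qualifying numbers up to a single boundary value.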
---

## How to Work on Digits

While constructing our numbers recursively (from the left), we need a way to check that our number is still not larger than the given boundary number. To achieve this, we keep a variable called "strict" while branching, which limits our ability to select digits that are larger than the corresponding digit of the boundary number.

Let’s suppose the boundary number is \( A \). We start filling the number from the left (most significant digit) and set `strict` to `true`, meaning we cannot select any digit larger than the corresponding digit of \( A \). As we branch:

- For digits less than the corresponding digit of \( A \), the branch becomes non-strict (`strict = false`), because the number is now guaranteed to be smaller than \( A \) no matter what follows.
- For digits equal to the corresponding digit of \( A \), the strictness continues to be `true`.

---

## Counting Problem Example

**Problem:** How many numbers \( x \) are there in the range \( A \) to \( B \), such that the digit \( d \) occurs exactly \( k \) times in \( x \)?

**Constraints:** \( A, B < 10^{60}, k < 60 \).

### Brute Force Approach:

The brute-force solution would iterate over all the numbers in the range \([A, B]\) and count the occurrences of the digit \( d \) in each number one by one. This has a time complexity of \( O(N \log_{10}(N)) \), where \( N \) is the size of the range, which is far too large for such constraints; we need a more efficient approach.

### Recursive Approach:

We can recursively fill in the digits of our number, starting from the leftmost digit. At each step, we branch into 3 possibilities:

1. Pick a digit that is **not** \( d \) and smaller than the corresponding digit of the boundary number.
2. Pick the digit \( d \).
3. Pick the digit that is equal to the corresponding digit of the boundary number.

The depth of the recursion equals the number of digits in the decimal representation of the boundary number, leading to a time complexity of \( O(3^{\log_{10} N}) \). Although this is better than brute force, it is still not efficient enough.

### Recursive Approach with Memoization:

We can further optimize this approach using memoization. We represent a DP state by \((\text{current index}, \text{current strictness}, \text{number of } d\text{'s picked so far})\), and store for each state the number of valid ways to fill in the remaining digits. We use a \( dp[\log_{10} N][2][\log_{10} N] \) array, where each value is computed at most once. Therefore, the worst-case time complexity is \( O((\log_{10} N)^2) \).
Solution code (a reconstructed sketch; note that it reads \( A \) and \( B \) into 64-bit integers, so as written it handles bounds up to about \( 10^{18} \); for values as large as \( 10^{60} \), the boundary numbers must be read as strings instead):

```cpp
#include <bits/stdc++.h>
using namespace std;
#define ll long long

ll A, B, d, k, dg;  // dg: digit count of the boundary number
vector<ll> v;       // digit vector of the boundary number
ll dp[65][2][65];

void setup(ll a)
{
    memset(dp, 0, sizeof dp);
    v.clear();
    ll tmp = a;
    while (tmp)
    {
        v.push_back(tmp % 10);
        tmp /= 10;
    }
    dg = (ll)v.size();
    reverse(v.begin(), v.end());
}

ll rec(int idx, bool strict, int count)
{
    if (count > k) return 0;
    if (idx == dg) return (count == k);
    if (dp[idx][strict][count]) return dp[idx][strict][count];
    ll sum = 0;
    if (strict)
    {
        // digits smaller than v[idx] make the rest of the number non-strict
        for (int dig = 0; dig < v[idx]; ++dig)
            sum += rec(idx + 1, false, count + (dig == d));
        // picking exactly v[idx] keeps the branch strict
        sum += rec(idx + 1, true, count + (v[idx] == d));
    }
    else
    {
        // no restriction: any digit 0..9 can be picked
        for (int dig = 0; dig <= 9; ++dig)
            sum += rec(idx + 1, false, count + (dig == d));
    }
    return dp[idx][strict][count] = sum;
}

int main()
{
    cin >> A >> B >> d >> k;
    setup(B);
    ll countB = rec(0, 1, 0);  // countB is the answer for [0..B]
    setup(A - 1);
    ll countA = rec(0, 1, 0);  // countA is the answer for [0..A-1]
    cout << countB - countA << endl;  // the difference gives us [A..B]
}
```

(For simplicity, this sketch pads shorter numbers with leading zeros; for \( d = 0 \) those padding zeros would be counted as occurrences, so a fully correct solution needs an extra "number has started" flag in the state.)

---

## References

- [Digit DP on Codeforces](https://codeforces.com/blog/entry/53960)

- [Digit DP on HackerRank](https://www.hackerrank.com/topics/digit-dp)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dp_on_directed_acyclic_graphs.md b/docs/dynamic-programming/dp_on_directed_acyclic_graphs.md
new file mode 100644
index 0000000..0c986ec
--- /dev/null
+++ b/docs/dynamic-programming/dp_on_directed_acyclic_graphs.md
@@ -0,0 +1,84 @@
# DP on Directed Acyclic Graphs (DAGs)

As we know, the nodes of a directed acyclic graph (DAG) can be sorted topologically, and DP can be implemented efficiently using this topological order.

First, we find the topological order with a [topological sort](https://en.wikipedia.org/wiki/Topological_sorting) in \( O(N + M) \) time, where \( N \) is the number of nodes and \( M \) the number of edges. Then, we compute the \( dp(V) \) values in topological order, where \( V \) is a node in the DAG and \( dp(V) \) is the answer for node \( V \). The exact recurrence and implementation differ from problem to problem.

---

## Converting a DP Problem into a Directed Acyclic Graph

Many DP problems can be converted into a DAG. Let’s explore why this is the case.

While solving a DP problem, when we process a state, we evaluate it by considering all possible previous states. To do this, all of the previous states must be processed before the current state. From this perspective, some states depend on other states, forming a DAG structure.

However, note that some DP problems cannot be converted into a DAG and may require [hypergraphs](https://en.wikipedia.org/wiki/Hypergraph) (for more details, refer to **Advanced Dynamic Programming in Semiring and Hypergraph Frameworks**).

### Example Problem:

There are \( N \) stones numbered \( 1, 2, ..., N \). For each \( i \) ( \( 1 \leq i \leq N \) ), the height of the \( i \)-th stone is \( h_i \). A frog starts on stone 1. From stone \( i \), the frog can jump to stone \( i+1 \) or stone \( i+2 \), and a jump from stone \( i \) to stone \( j \) costs \( | h_i - h_j | \). Find the minimum possible total cost to reach stone \( N \).

### Solution:

We define \( dp[i] \) as the minimum cost to reach the \( i \)-th stone, with base cases \( dp[1] = 0 \) and \( dp[2] = |h_2 - h_1| \). The answer is \( dp[N] \). The recurrence relation is defined as:

\[
dp[i] = \min(dp[i-1] + |h_i - h_{i-1}|, dp[i-2] + |h_i - h_{i-2}|)
\]

For \( N = 5 \), we can see that to calculate \( dp[5] \), we need to calculate \( dp[4] \) and \( dp[3] \). Similarly (a short code sketch follows the list):

- \( dp[4] \) depends on \( dp[3] \) and \( dp[2] \),
- \( dp[3] \) depends on \( dp[2] \) and \( dp[1] \),
- \( dp[2] \) depends on \( dp[1] \).
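As a quick illustration, here is a minimal Python sketch of this recurrence (the function name and the 0-indexed list are our choices):

```python
def min_frog_cost(h):
    # h[i] is the height of stone i+1 (0-indexed list)
    n = len(h)
    dp = [0] * n  # dp[i]: minimum cost to reach stone i+1
    if n > 1:
        dp[1] = abs(h[1] - h[0])
    for i in range(2, n):
        dp[i] = min(dp[i - 1] + abs(h[i] - h[i - 1]),
                    dp[i - 2] + abs(h[i] - h[i - 2]))
    return dp[-1]

print(min_frog_cost([10, 30, 40, 20]))  # 30, via jumps 1 -> 2 -> 4
```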
The dependencies listed above form a DAG, where the nodes represent the stones and the edges represent the transitions between them based on the jumps:

```mermaid
graph LR
    A(dp_1) --> B(dp_2);
    A --> C(dp_3);
    B --> C;
    B --> D(dp_4);
    C --> D;
    C --> E(dp_5);
    D --> E;
```

## DP on Directed Acyclic Graph Problem

Given a DAG with \( N \) nodes and \( M \) weighted edges, find the **longest path** in the DAG.

### Complexity:

The time complexity for this problem is \( O(N + M) \), where \( N \) is the number of nodes and \( M \) is the number of edges.

Solution Code:

```cpp
// the topological sort is not written here; we assume tp is already sorted
// note that tp is in reverse topological order
// vector<int> tp
// n, m and vector<vector<pair<int,int>>> adj are given. The pair denotes {node, weight}.
// flag[] denotes whether a node has been processed. Initially all zero.
// dp[] is the DP array. Initially all zero.

for (int i = 0; i < (int)tp.size(); ++i) // process in order
{
    int curNode = tp[i];

    for (auto v : adj[curNode]) // iterate through all neighbours
        if (flag[v.first]) // if a neighbour is already processed
            dp[curNode] = max(dp[curNode], dp[v.first] + v.second);

    flag[curNode] = 1;
}
// the answer is max(dp[1..n])
```

---

## References

- [NOI IOI training week-5](https://noi.ph/training/weekly/week5.pdf)

- [DP on Graphs MIT](https://courses.csail.mit.edu/6.006/fall11/rec/rec19.pdf)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dp_on_rooted_trees.md b/docs/dynamic-programming/dp_on_rooted_trees.md
new file mode 100644
index 0000000..3a133e1
--- /dev/null
+++ b/docs/dynamic-programming/dp_on_rooted_trees.md
@@ -0,0 +1,67 @@
# DP on Rooted Trees

In dynamic programming (DP) on rooted trees, we define functions on the nodes of the tree, calculated recursively from the children of each node. A DP state is usually associated with a node \(i\) and represents the subtree rooted at node \(i\).

---

## Problem

Given a tree \( T \) of \( N \) (1-indexed) nodes, where each node \( i \) has \( C_i \) coins attached to it, the task is to choose a subset of nodes such that no two adjacent nodes (nodes directly connected by an edge) are chosen, and the sum of coins attached to the chosen subset is maximized.

### Approach:

We define two functions, \( dp1(V) \) and \( dp2(V) \), as follows:

- \( dp1(V) \): The optimal solution for the subtree of node \( V \) when node \( V \) **is included** in the answer.
- \( dp2(V) \): The optimal solution for the subtree of node \( V \) when node \( V \) **is not included** in the answer.

The final answer, taken at the root, is the maximum of these two cases:

\[
\text{max}(dp1(V), dp2(V))
\]

### Recursive Definitions:

- \( dp1(V) = C_V + \sum_{i=1}^{n} dp2(v_i) \), where \( n \) is the number of children of node \( V \), and \( v_i \) is the \( i \)-th child of node \( V \).
  This represents the scenario where node \( V \) is included in the chosen subset, so none of its children can be selected.

- \( dp2(V) = \sum_{i=1}^{n} \text{max}(dp1(v_i), dp2(v_i)) \).
  This represents the scenario where node \( V \) is not included, so the optimal solution for each child \( v_i \) can either include or exclude that child.

### Complexity:

The time complexity of this approach is \( O(N) \), where \( N \) is the number of nodes in the tree: the solution is a single depth-first search (DFS) traversal of the tree, and each node is visited only once. The following DFS computes both values for every node:
```cpp
// pV is the parent of V
void dfs(int V, int pV)
{
    // base case:
    // when the dfs reaches a leaf, it computes dp1 and dp2
    // directly and does not branch again.

    // for storing the sums of dp2 and max(dp1, dp2) over all children of V
    int sum1 = 0, sum2 = 0;

    // traverse over all children
    for (auto v : adj[V])
    {
        if (v == pV)
            continue;
        dfs(v, V);
        sum1 += dp2[v];
        sum2 += max(dp1[v], dp2[v]);
    }

    dp1[V] = C[V] + sum1;
    dp2[V] = sum2;
}
// Nodes are 1-indexed, so the answer is stored in dp1[1] and dp2[1]:
// after calling dfs(1, 0), we take max(dp1[1], dp2[1]).
```

---

## References

- [DP on Tree on CodeForces](https://codeforces.com/blog/entry/20935)
\ No newline at end of file
diff --git a/docs/dynamic-programming/dynamic_programming.md b/docs/dynamic-programming/dynamic_programming.md
new file mode 100644
index 0000000..1259a50
--- /dev/null
+++ b/docs/dynamic-programming/dynamic_programming.md
@@ -0,0 +1,111 @@
# Dynamic Programming

Dynamic programming (DP) is a technique used to avoid computing the same sub-solution multiple times in a recursive algorithm. A sub-solution of the problem is constructed from the previously found ones. DP solutions typically have polynomial complexity, which ensures a much faster running time than other techniques like backtracking or brute force.

## Memoization - Top Down

Memoization ensures that a method doesn’t run for the same inputs more than once by keeping a record of the results for the given inputs (usually in a hash map).

To avoid the duplicate work caused by recursion, we can use a cache that maps inputs to outputs. The approach involves:

- Checking the cache to see if we can avoid computing the answer for any given input.
- Saving the results of any calculations to the cache.

Memoization is a common strategy for dynamic programming problems where the solution is composed of solutions to the same problem with smaller inputs, such as the Fibonacci problem.

Another strategy for dynamic programming is the **bottom-up** approach, which is often cleaner and more efficient.

## Bottom-Up

The bottom-up approach avoids recursion, saving the memory cost associated with building up the call stack. It "starts from the beginning" and works towards the final solution, whereas a recursive algorithm often "starts from the end and works backwards."

## An Example - Fibonacci

Let’s start with a well-known example: finding the \(n\)-th Fibonacci number. The Fibonacci sequence is defined as:

\[
F_n = F_{n-1} + F_{n-2}, \quad \text{with } F_0 = 0 \text{ and } F_1 = 1
\]

There are several approaches to solving this problem:

### Recursion

In a recursive approach, the function calls itself to compute the previous two Fibonacci numbers until reaching the base cases.

```python
def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1

    return fibonacci(n - 1) + fibonacci(n - 2)
```

### Dynamic Programming

- **Top-Down - Memoization:**
  Plain recursion repeats many calculations unnecessarily. Memoization solves this by caching the results of previously computed Fibonacci numbers, so they don't have to be recalculated.

```python
cache = {}

def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n in cache:
        return cache[n]

    cache[n] = fibonacci(n - 1) + fibonacci(n - 2)

    return cache[n]
```
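As a side note, Python's standard library provides the same caching behavior out of the box via `functools.lru_cache`, so a hand-rolled dictionary is not strictly necessary:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every distinct call
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
```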
<figure markdown>
![Recursive vs Memoization](img/recursive_memoization.png){ width="90%" }
<figcaption>Visualization of Recursive Memoization</figcaption>
</figure>
- **Bottom-Up:**
  The bottom-up approach eliminates recursion by computing the Fibonacci numbers in order, starting from the base cases and building up to the desired value.

```python
cache = {}

def fibonacci(n):
    cache[0] = 0
    cache[1] = 1

    for i in range(2, n + 1):
        cache[i] = cache[i - 1] + cache[i - 2]

    return cache[n]
```

Additionally, this approach can be optimized further to use constant space, storing only the partial results needed along the way.

```python
def fibonacci(n):
    if n < 2:
        return n

    fib_minus_2 = 0
    fib_minus_1 = 1

    for i in range(2, n + 1):
        fib = fib_minus_1 + fib_minus_2
        fib_minus_1, fib_minus_2 = fib, fib_minus_1

    return fib
```

## How to Apply Dynamic Programming?

To apply dynamic programming, follow these steps:

- **Find the recursion in the problem:** Identify how the problem can be broken down into smaller subproblems.
- **Top-down approach:** Store the result of each subproblem in a table to avoid recomputation.
- **Bottom-up approach:** Find the correct order to evaluate the results so that partial results are available when needed.

Dynamic programming generally works for problems that have an inherent left-to-right order, such as strings, trees, or integer sequences. If the naive recursive algorithm does not compute the same subproblem multiple times, dynamic programming won't be useful.
\ No newline at end of file
diff --git a/docs/dynamic-programming/greedy_algorithms.md b/docs/dynamic-programming/greedy_algorithms.md
new file mode 100644
index 0000000..1063eee
--- /dev/null
+++ b/docs/dynamic-programming/greedy_algorithms.md
@@ -0,0 +1,182 @@
# Greedy Algorithms

A *greedy algorithm* is an algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage, with the hope of finding a global optimum. A greedy algorithm never takes back its choices, but directly constructs the final solution. For this reason, greedy algorithms are usually very efficient.

The difficulty in designing greedy algorithms is finding a greedy strategy that always produces an optimal solution: the locally optimal choices must also be globally optimal. It is often difficult to argue that a greedy algorithm works.

## Coin Problem

We are given a value \( V \). If we want to make change for \( V \) cents, and we have an infinite supply of coins with values \( C_1, C_2, \dots, C_m \) (sorted in descending order), what is the minimum number of coins needed to make the change?

### Solution

#### Approach:

1. Initialize the result as empty.
2. Find the largest denomination that is not larger than the remaining amount.
3. Add the found denomination to the result and subtract its value from the amount.
4. If the amount becomes 0, print the result. Otherwise, repeat steps 2 and 3 for the new value of the amount.

```python
def min_coins(coins, amount):
    n = len(coins)
    for i in range(n):
        while amount >= coins[i]:
            # the while loop is needed since one coin can be used multiple times
            amount -= coins[i]
            print(coins[i])
```

For example, if the coins are the euro coins (in cents) \(\{200, 100, 50, 20, 10, 5, 2, 1\}\) and the amount is 548, the optimal solution is to select coins \(200+200+100+20+20+5+2+1\), whose sum is \(548\).
<figure markdown>
![Coin Change Problem](img/coin_change.png){ width="90%" }
<figcaption>Visualization of the Coin Change Problem</figcaption>
</figure>
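A quick run of the `min_coins` function above reproduces this example (output shown as a comment):

```python
euro_coins = [200, 100, 50, 20, 10, 5, 2, 1]  # must be sorted in descending order
min_coins(euro_coins, 548)
# prints: 200, 200, 100, 20, 20, 5, 2, 1  (8 coins)
```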
In the general case, the coin set can contain arbitrary coins, and the greedy algorithm does not necessarily produce an optimal solution.

We can prove that a greedy algorithm does not work by showing a counterexample where the algorithm gives a wrong answer. In this problem, we can easily find a counterexample: if the coins are \(\{6, 5, 2\}\) and the target sum is \(10\), the greedy algorithm produces the solution \(6+2+2\), while the optimal solution is \(5+5\).

## Scheduling

Many scheduling problems can be solved using greedy algorithms. A classic problem is as follows:

We are given an array of jobs, where every job has a deadline and an associated profit that is earned if the job is finished before its deadline. Every job takes a single unit of time, so the minimum possible deadline for any job is 1. How do we maximize the total profit if only one job can be scheduled at a time?

### Solution

A simple solution is to generate all subsets of the given set of jobs, check each subset for feasibility, and keep track of the maximum profit among all feasible subsets. The time complexity of that solution is exponential. Instead, this is a standard greedy algorithm problem.

#### Approach:

1. Sort all jobs in decreasing order of profit.
2. Initialize the result sequence with the first job in the sorted order.
3. For each of the remaining \(n-1\) jobs:
   - If the current job can fit in the current result sequence without missing a deadline, add it to the result.
   - Otherwise, ignore the current job.

```python
# sample job: ['x', 4, 25] -> [job_id, deadline, profit]
# jobs: array of jobs
def print_job_scheduling(jobs, t):
    n = len(jobs)

    # Sort all jobs in decreasing order of profit
    for i in range(n):
        for j in range(n - 1 - i):
            if jobs[j][2] < jobs[j + 1][2]:
                jobs[j], jobs[j + 1] = jobs[j + 1], jobs[j]

    # To keep track of free time slots
    result = [False] * t
    # To store the result (sequence of jobs)
    job = ['-1'] * t

    # Iterate through all given jobs
    for i in range(len(jobs)):
        # Find a free slot for this job
        # (Note that we start from the last possible slot)
        for j in range(min(t - 1, jobs[i][1] - 1), -1, -1):
            # Free slot found
            if result[j] is False:
                result[j] = True
                job[j] = jobs[i][0]
                break
    print(job)
```

## Tasks and Deadlines

Let us now consider a problem where we are given \(n\) tasks with durations and deadlines, and our task is to choose an order in which to perform the tasks. For each task, we earn \(d - x\) points, where \(d\) is the task’s deadline and \(x\) is the moment when we finish the task. What is the largest possible total score we can obtain?

For example, suppose the tasks are as follows:

| Task | Duration | Deadline |
|------|----------|----------|
| A    | 4        | 2        |
| B    | 3        | 5        |
| C    | 2        | 7        |
| D    | 4        | 5        |

An optimal schedule for the tasks is \( C, B, A, D \). In this solution, \( C \) yields 5 points, \( B \) yields 0 points, \( A \) yields -7 points, and \( D \) yields -8 points, so the total score is -10.

Interestingly, the optimal solution does not depend on the deadlines at all: a correct greedy strategy is to simply perform the tasks sorted by their durations in increasing order.

### Solution

1. Sort all tasks in increasing order of duration.
2. Calculate the total points by iterating through all tasks, summing up the differences between the deadlines and the times at which the tasks are finished.
```python
# sample task: ['A', 4, 2] -> [task_id, duration, deadline]
def order_tasks(tasks):
    n = len(tasks)

    # Sort all tasks in increasing order of duration
    for i in range(n):
        for j in range(n - 1 - i):
            if tasks[j][1] > tasks[j + 1][1]:
                tasks[j], tasks[j + 1] = tasks[j + 1], tasks[j]

    point = 0
    current_time = 0
    # Iterate through all given tasks and calculate the score
    for i in range(len(tasks)):
        current_time = current_time + tasks[i][1]
        point = point + (tasks[i][2] - current_time)

    print(point)
```

## Minimizing Sums

We are given \(n\) numbers, and our task is to find a value \(x\) that minimizes the sum:

\[
|a_1 - x|^c + |a_2 - x|^c + ... + |a_n - x|^c
\]

We focus on the cases \(c = 1\) and \(c = 2\).

### Case \(c = 1\)

In this case, we should minimize the sum:

\[
|a_1 - x| + |a_2 - x| + ... + |a_n - x|
\]

For example, if the numbers are \([1, 2, 9, 2, 6]\), the best solution is to select \(x = 2\), which produces the sum:

\[
|1 - 2| + |2 - 2| + |9 - 2| + |2 - 2| + |6 - 2| = 12
\]

In the general case, the best choice for \(x\) is the median of the numbers. For instance, the list \([1, 2, 9, 2, 6]\) becomes \([1, 2, 2, 6, 9]\) after sorting, so the median is 2. The median is an optimal choice because if \(x\) is smaller than the median, the sum decreases by increasing \(x\), and if \(x\) is larger, the sum decreases by lowering \(x\). Hence, the optimal solution is \(x = \text{median}\).

### Case \(c = 2\)

In this case, we minimize the sum:

\[
(a_1 - x)^2 + (a_2 - x)^2 + ... + (a_n - x)^2
\]

For example, if the numbers are \([1, 2, 9, 2, 6]\), the best solution is to select \(x = 4\), which produces the sum:

\[
(1 - 4)^2 + (2 - 4)^2 + (9 - 4)^2 + (2 - 4)^2 + (6 - 4)^2 = 46
\]

In the general case, the best choice for \(x\) is the average of the numbers. For the given example, the average is:

\[
\frac{(1 + 2 + 9 + 2 + 6)}{5} = 4
\]

This result can be derived by expressing the sum as:

\[
n x^2 - 2x(a_1 + a_2 + ... + a_n) + (a_1^2 + a_2^2 + ... + a_n^2)
\]

The last part does not depend on \(x\), so we can ignore it. The remaining terms form a parabola opening upwards, whose minimum is at \(x = \frac{s}{n}\), where \(s\) is the sum of the numbers, i.e., the average of the numbers.
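As a quick numeric check of both claims (a minimal sketch using the example list above):

```python
arr = [1, 2, 9, 2, 6]

# c = 1: the median minimizes the sum of absolute differences
x = sorted(arr)[len(arr) // 2]  # median = 2
print(sum(abs(a - x) for a in arr))  # 12

# c = 2: the average minimizes the sum of squared differences
x = sum(arr) / len(arr)  # average = 4.0
print(sum((a - x) ** 2 for a in arr))  # 46.0
```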
\ No newline at end of file
diff --git a/docs/dynamic-programming/img/1st_power_matrix.png b/docs/dynamic-programming/img/1st_power_matrix.png
new file mode 100644
index 0000000..a26b31e
Binary files /dev/null and b/docs/dynamic-programming/img/1st_power_matrix.png differ
diff --git a/docs/dynamic-programming/img/3rd_power_matrix.png b/docs/dynamic-programming/img/3rd_power_matrix.png
new file mode 100644
index 0000000..5be7c79
Binary files /dev/null and b/docs/dynamic-programming/img/3rd_power_matrix.png differ
diff --git a/docs/dynamic-programming/img/coin_change.png b/docs/dynamic-programming/img/coin_change.png
new file mode 100644
index 0000000..e7d82b7
Binary files /dev/null and b/docs/dynamic-programming/img/coin_change.png differ
diff --git a/docs/dynamic-programming/img/left_childright_sibling.png b/docs/dynamic-programming/img/left_childright_sibling.png
new file mode 100644
index 0000000..7e4c01a
Binary files /dev/null and b/docs/dynamic-programming/img/left_childright_sibling.png differ
diff --git a/docs/dynamic-programming/img/recursive_memoization.png b/docs/dynamic-programming/img/recursive_memoization.png
new file mode 100644
index 0000000..514eb8a
Binary files /dev/null and b/docs/dynamic-programming/img/recursive_memoization.png differ
diff --git a/docs/dynamic-programming/index.md b/docs/dynamic-programming/index.md
index 264501c..d9ede23 100644
--- a/docs/dynamic-programming/index.md
+++ b/docs/dynamic-programming/index.md
@@ -3,3 +3,36 @@ title: Dynamic Programming
tags:
- "Dynamic Programming"
---

**Editor:** Halil Çetiner

**Reviewers:** Onur Yıldız

## Introduction

The next sections are about *Greedy Algorithms* and *Dynamic Programming*. They give a fairly generous introduction to the concepts, followed by some common problems.

- [Greedy Algorithms](./greedy_algorithms.md)

- [Dynamic Programming](./dynamic_programming.md)

- [Common DP Problems](./common_dp_problems.md)

- [Bitmask DP](./bitmask_dp.md)

- [DP on Rooted Trees](./dp_on_rooted_trees.md)

- [DP on Directed Acyclic Graphs](./dp_on_directed_acyclic_graphs.md)

- [Digit DP](./digit_dp.md)

- [Walk Counting using Matrix Exponentiation](./walk_counting_with_matrix.md)

- [Tree Child-Sibling Notation](./tree_child_sibling_notation.md)

## References

1. ["Competitive Programmer’s Handbook" by Antti Laaksonen - Draft July 3, 2018](https://cses.fi/book/book.pdf)
2. [Wikipedia - Dynamic Programming](https://en.wikipedia.org/wiki/Dynamic_programming)
3. [Topcoder - Competitive Programming Community / Dynamic Programming from Novice to Advanced](https://www.topcoder.com/community/competitive-programming/tutorials/dynamic-programming-from-novice-to-advanced/)
4. [Hacker Earth - Dynamic Programming](https://www.hackerearth.com/practice/algorithms/dynamic-programming/)
5. [Geeks for Geeks - Dynamic Programming](https://www.geeksforgeeks.org/dynamic-programming/)
diff --git a/docs/dynamic-programming/tree_child_sibling_notation.md b/docs/dynamic-programming/tree_child_sibling_notation.md
new file mode 100644
index 0000000..a6f7f9f
--- /dev/null
+++ b/docs/dynamic-programming/tree_child_sibling_notation.md
@@ -0,0 +1,46 @@
# Tree Child-Sibling Notation

In this method, we change the structure of the tree. In a standard tree, each parent node is connected to all of its children. In the **child-sibling notation**, however, a node stores a pointer to only one of its children. Additionally, the node stores a pointer to its immediate right sibling.
In this notation, every node stores at most 2 pointers:
- **Left child** (its first child),
- **Right sibling** (its next sibling to the right).

This structure is called the **LCRS (Left Child-Right Sibling)** notation. It effectively represents the tree as a binary tree, since every node has only two pointers (left and right).
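As a minimal sketch, an LCRS node can be declared with just these two links (Python; the field names are our own):

```python
class LCRSNode:
    def __init__(self, value):
        self.value = value
        self.left_child = None     # pointer to the first child
        self.right_sibling = None  # pointer to the next sibling

    def children(self):
        # walk the sibling chain to visit every child of this node
        child = self.left_child
        while child is not None:
            yield child
            child = child.right_sibling
```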
<figure markdown>
![Child-Sibling Notation](img/left_childright_sibling.png){ width="90%" }
<figcaption>A tree notated with child-sibling notation</figcaption>
</figure>
## Why You Would Use the LCRS Notation

The primary reason for using the LCRS notation is to save memory: each node stores only two pointers, regardless of how many children it has, so the LCRS structure uses less memory than the standard tree notation.

### When You Might Use the LCRS Notation:

- **Memory is extremely scarce.**
- **Random access to a node’s children is not required.**

### Possible Cases for Using LCRS:

1. **When storing a large multi-way tree in main memory:**
   For example, [a phylogenetic tree](https://en.wikipedia.org/wiki/Phylogenetic_tree).

2. **In specialized data structures where the tree is used in specific ways:**
   For example, in the [**heap data structure**](https://en.wikipedia.org/wiki/Heap_%28data_structure%29), the main operations are:

   - Removing the root of the tree and processing each of its children,
   - Joining two trees together by making one tree a child of the other.

These operations can be done efficiently using the LCRS structure, making it convenient for implementing heap data structures.

---

## References

- [LCRS article on Wikipedia](https://en.wikipedia.org/wiki/Left-child_right-sibling_binary_tree)

- [Link to the Figure used](https://contribute.geeksforgeeks.org/wp-content/uploads/new.jpeg)

- [LCRS possible uses Stackoverflow](https://stackoverflow.com/questions/14015525/what-is-the-left-child-right-sibling-representation-of-a-tree-why-would-you-us)
\ No newline at end of file
diff --git a/docs/dynamic-programming/walk_counting_with_matrix.md b/docs/dynamic-programming/walk_counting_with_matrix.md
new file mode 100644
index 0000000..a6d66fb
--- /dev/null
+++ b/docs/dynamic-programming/walk_counting_with_matrix.md
@@ -0,0 +1,55 @@
# Walk Counting using Matrix Exponentiation

Matrix exponentiation can be used to count the number of walks of a given length on a graph.

Let \( l \) be the desired walk length, and let \( A \) and \( B \) be nodes in a graph \( G \). If \( D \) is the adjacency matrix of \( G \), then \( D^l[A][B] \) is the number of walks from node \( A \) to node \( B \) of length \( l \), where \( D^k \) denotes the \( k \)-th power of the matrix \( D \).

---

## Explanation:

- **Adjacency Matrix \( D \):**
  In the adjacency matrix of a graph, each entry \( D[i][j] \) denotes whether there is a direct edge from node \( i \) to node \( j \). Specifically:
  - \( D[i][j] = 1 \) if there is an edge from \( i \) to \( j \),
  - \( D[i][j] = 0 \) otherwise.

- **Matrix Exponentiation:**
  To find the number of walks of length \( l \) between nodes \( A \) and \( B \), we compute \( D^l \), the \( l \)-th power of the adjacency matrix \( D \). The entry \( D^l[A][B] \) then gives the number of walks of length \( l \) from node \( A \) to node \( B \).

```mermaid
graph LR
    A(2) --> B(1);
    B --> C(3);
    C --> A;
    C --> D(4);
    D --> C;
```

| ![D, adjacency matrix of G](img/1st_power_matrix.png){ width="50%" } | ![D^3, 3rd power of the matrix D](img/3rd_power_matrix.png){ width="50%" } |
|:--------------------------------------------------------------------:|:--------------------------------------------------------------------------:|
| D, adjacency matrix of G | D^3, 3rd power of the matrix D |

From the matrix \( D^3 \) we can count the walks of length 3: summing all of its entries gives the total, which is 9 for this graph. As examples, let \( S \) be a set of four of these walks, where each walk \( w = \{n_1, n_2, ..., n_k\} \) is written as the sequence of its nodes and \( n_i \) is the \( i \)-th node of the walk.
Then:

\[
S = \{\{1, 3, 4, 3\}, \{3, 4, 3, 2\}, \{3, 4, 3, 4\}, \{4, 3, 4, 3\}\}
\]
and \( |S| = 4 \).

Using fast exponentiation on the adjacency matrix, we can efficiently find the number of walks of length \( k \) in \( O(N^3 \log k) \) time, where \( N \) is the number of nodes in the graph.

### Time Complexity Breakdown:
- [**Matrix Multiplication:**](https://en.wikipedia.org/wiki/Matrix_multiplication) The \( O(N^3) \) factor comes from multiplying two \( N \times N \) matrices.
- **Fast Exponentiation:** Fast exponentiation reduces the number of multiplications to \( O(\log k) \), resulting in the overall time complexity of \( O(N^3 \log k) \).

This method allows for efficiently calculating the number of walks of any length \( k \), even in large graphs.

---

## References

- [Walk Counting on Sciencedirect](https://www.sciencedirect.com/science/article/pii/S0012365X08002008)
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index 90036c7..47a84d3 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -66,7 +66,11 @@ markdown_extensions:
     generic: true
   - footnotes
   - pymdownx.details
-  - pymdownx.superfences
+  - pymdownx.superfences:
+      custom_fences:
+        - name: mermaid
+          class: mermaid
+          format: !!python/name:pymdownx.superfences.fence_code_format
   - pymdownx.mark
   - attr_list
   - pymdownx.emoji: