Find the longest subsequence of a given array such that all elements of the subsequence are sorted in increasing order – Dynamic Programming
The Longest Increasing Subsequence (LIS) problem is a classical problem in dynamic programming. Our goal is to find the length of the longest subsequence in an array where the elements are in strictly increasing order.
Problem Explanation
Given an array arr[]
of size n
, we need to find the length of the longest subsequence such that all elements of the subsequence are sorted in increasing order.
Example
- Input:
arr[] = [10, 22, 9, 33, 21, 50, 41, 60, 80]
- Output:
6
(The LIS is[10, 22, 33, 50, 60, 80]
)
Approach
We’ll solve this problem using dynamic programming. The key observation is that the solution for a given element depends on the solutions to the previous elements in the array.
Steps:
- Create a DP array (
dp[]
) where each elementdp[i]
represents the length of the longest increasing subsequence that ends at indexi
. - Initialize the DP array with
1
for all elements because the minimum subsequence length ending at any element is1
(the element itself). - For each element in the array (starting from the second element), check all previous elements to see if a valid increasing subsequence can be formed. If
arr[j] < arr[i]
, updatedp[i] = max(dp[i], dp[j] + 1)
because we can append the current elementarr[i]
to the increasing subsequence ending atarr[j]
. - After processing all elements, the maximum value in the
dp[]
array will be the length of the longest increasing subsequence.
Code
def longest_increasing_subsequence(arr): n = len(arr) # Base case: if array is empty if n == 0: return 0 # Step 1: Create a dp array and initialize it with 1 dp = [1] * n # Step 2: Fill dp array by comparing each element with previous ones for i in range(1, n): for j in range(0, i): if arr[i] > arr[j]: dp[i] = max(dp[i], dp[j] + 1) # Step 3: The length of the longest increasing subsequence is the maximum value in dp array lis_length = max(dp) # Step 4: Recovering the actual LIS sequence (Optional) lis_sequence = [] current_length = lis_length # Traverse the dp array from the end to the start to find the actual sequence for i in range(n - 1, -1, -1): if dp[i] == current_length: lis_sequence.append(arr[i]) current_length -= 1 # Reverse the sequence since we collected it in reverse order lis_sequence.reverse() # Returning both the length and the actual LIS return lis_length, lis_sequence # Test the function arr = [10, 22, 9, 33, 21, 50, 41, 60, 80] length, sequence = longest_increasing_subsequence(arr) print("Length of LIS:", length) print("LIS sequence:", sequence)
Explanation of the Code
- Input and Edge Case Handling:
- We first check if the array is empty, in which case the LIS length would be
0
.
- Initialization:
- We create a
dp[]
array initialized to1
for all elements. This is because every element on its own is an increasing subsequence of length1
.
- Filling the DP Array:
- We iterate over each element
arr[i]
starting from the second element. - For each element
arr[i]
, we check all previous elementsarr[j]
(wherej < i
). Ifarr[j] < arr[i]
, we update thedp[i]
asdp[i] = max(dp[i], dp[j] + 1)
. This means that we can append the current elementarr[i]
to the subsequence ending atarr[j]
and extend its length by1
.
- Finding the Maximum LIS Length:
- After filling the
dp[]
array, the maximum value indp[]
gives us the length of the LIS.
- Recovering the Actual Sequence (Optional):
- To reconstruct the LIS sequence, we start from the end of the
dp[]
array and trace back using the LIS length. We collect the elements that form the LIS in reverse order and then reverse the sequence at the end.
Time Complexity
The time complexity of this approach is O(n^2) because we have two nested loops: one to pick each element arr[i]
and another to compare it with all previous elements arr[j]
. The space complexity is O(n) due to the dp[]
array.
Output
For the input arr = [10, 22, 9, 33, 21, 50, 41, 60, 80]
, the output will be:
Length of LIS: 6 LIS sequence: [10, 22, 33, 50, 60, 80]
Optimized Solution (Binary Search Approach)
The optimized solution for the Longest Increasing Subsequence (LIS) problem uses a combination of binary search and a greedy approach. This algorithm runs in O(n log n), which is much faster than the dynamic programming approach with O(n²) complexity.
Key Idea
The key idea is to maintain a tail array (or list) where:
- Each element in
tail[]
represents the smallest possible ending value of an increasing subsequence of a particular length.
We can efficiently find the position where a new element should be inserted into this tail array using binary search.
Steps:
- Initialize an empty tail array (
tail[]
). - For each element in the input array
arr[]
, use binary search to find its correct position intail[]
. The position is determined by finding the smallest element intail[]
that is greater than or equal to the current element. - Update the tail array:
- If the current element can extend the largest subsequence so far, append it to
tail[]
. - Otherwise, replace the element in
tail[]
with the current element to maintain the smallest possible ending element for a subsequence of that length.
- The length of the
tail[]
array will give the length of the longest increasing subsequence.
Code (Optimized Solution)
import bisect def longest_increasing_subsequence_optimized(arr): # Step 1: Initialize an empty tail array tail = [] # Step 2: Iterate through the array for num in arr: # Step 3: Find the position where the current element should be placed pos = bisect.bisect_left(tail, num) # Step 4: If num is greater than all elements in tail, append it if pos == len(tail): tail.append(num) else: # Otherwise, replace the existing element to maintain the minimum possible value tail[pos] = num # Step 5: The length of the tail array is the length of the LIS return len(tail), tail # Test the function arr = [10, 22, 9, 33, 21, 50, 41, 60, 80] length, tail_sequence = longest_increasing_subsequence_optimized(arr) print("Length of LIS:", length) print("Final tail sequence (not necessarily the LIS):", tail_sequence)
Explanation
- Binary Search with
bisect_left
:
- We use
bisect_left(tail, num)
to find the index wherenum
should be placed intail[]
.bisect_left
finds the first position wherenum
is greater than or equal to the elements intail[]
.
- Updating the
tail[]
Array:
- If the current number is larger than all elements in
tail[]
, we append it (because it extends the LIS). - If the current number is smaller or equal to some elements in
tail[]
, we replace the first element intail[]
that is greater than or equal to the current number. This replacement ensures that thetail[]
array holds the smallest possible ending values for subsequences of different lengths.
- Length of LIS:
- The size of the
tail[]
array at the end of the process will be the length of the LIS.
- Important Note:
- The
tail[]
array does not store the actual LIS, but it helps to find the length of the LIS. The values intail[]
represent potential candidates for the smallest elements of increasing subsequences of different lengths.
Example Walkthrough
Let’s walk through the array arr = [10, 22, 9, 33, 21, 50, 41, 60, 80]
step by step:
- Start with an empty
tail[]
. - First element
10
: Insert at position0
→tail = [10]
- Next element
22
: Insert at position1
→tail = [10, 22]
- Next element
9
: Insert at position0
(replaces10
) →tail = [9, 22]
- Next element
33
: Insert at position2
→tail = [9, 22, 33]
- Next element
21
: Insert at position1
(replaces22
) →tail = [9, 21, 33]
- Next element
50
: Insert at position3
→tail = [9, 21, 33, 50]
- Next element
41
: Insert at position3
(replaces50
) →tail = [9, 21, 33, 41]
- Next element
60
: Insert at position4
→tail = [9, 21, 33, 41, 60]
- Last element
80
: Insert at position5
→tail = [9, 21, 33, 41, 60, 80]
At the end, the length of the tail[]
is 6
, which is the length of the LIS.
Time Complexity
- Binary search using
bisect_left
takes O(log n), and we perform it for each element in the array, so the overall time complexity is O(n log n). - The space complexity is O(n) because we maintain the
tail[]
array.
Output
For the input arr = [10, 22, 9, 33, 21, 50, 41, 60, 80]
, the output will be:
Length of LIS: 6 Final tail sequence (not necessarily the LIS): [9, 21, 33, 41, 60, 80]
Key Takeaways
- The
tail[]
array helps us to efficiently track potential LIS candidates, even though it doesn’t directly store the LIS. - The O(n log n) approach is much faster than the O(n²) dynamic programming approach, especially for large arrays.