Problem: Median of Two Sorted Arrays: Given two sorted arrays, find the median of the combined sorted array.
Topic: Divide and Conquer, Binary Search
Median of Two Sorted Arrays
Finding the median of two sorted arrays is a classic problem in computer science and can be efficiently solved using divide-and-conquer strategies and binary search. This problem falls into the realm of divide and conquer and binary search, leveraging these techniques to achieve an optimal solution.
Problem Description
Given two sorted arrays, A
and B
, of sizes m
and n
respectively, the task is to find the median of the combined sorted array. The challenge is to solve this problem efficiently, ideally with a time complexity of (O(\log(\min(m, n)))), where m
and n
are the lengths of the two arrays.
Approach
To find the median of two sorted arrays efficiently, we can use a binary search approach. This method involves:
- Ensuring the Smaller Array: Always perform the binary search on the smaller of the two arrays. This ensures that our approach is efficient and leverages the properties of binary search effectively.
- Binary Search to Partition Arrays: Partition both arrays into two halves such that the left halves contain elements less than or equal to the elements in the right halves. This allows us to find the median based on these partitions.
- Maintaining Balance: Ensure that the partitions are balanced so that the number of elements in the left half of the combined array is equal to the number of elements in the right half (or differs by one).
Detailed Steps and Code
Step 1: Define the Helper Function
To make our main function clearer, we’ll first define a helper function that performs binary search on the smaller array.
def findMedianSortedArrays(nums1, nums2): # Ensure nums1 is the smaller array if len(nums1) > len(nums2): nums1, nums2 = nums2, nums1 x, y = len(nums1), len(nums2) low, high = 0, x while low <= high: partitionX = (low + high) // 2 partitionY = (x + y + 1) // 2 - partitionX # If partitionX is 0 it means nothing is there on left side. Use -inf for maxLeftX # If partitionX is length of input then there is nothing on right side. Use +inf for minRightX maxLeftX = float('-inf') if partitionX == 0 else nums1[partitionX - 1] minRightX = float('inf') if partitionX == x else nums1[partitionX] maxLeftY = float('-inf') if partitionY == 0 else nums2[partitionY - 1] minRightY = float('inf') if partitionY == y else nums2[partitionY] if maxLeftX <= minRightY and maxLeftY <= minRightX: # We have partitioned array at correct place if (x + y) % 2 == 0: return (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2 else: return max(maxLeftX, maxLeftY) elif maxLeftX > minRightY: # We are too far on right side for partitionX. Go on left side. high = partitionX - 1 else: # We are too far on left side for partitionX. Go on right side. low = partitionX + 1
Detailed Explanation of the Algorithm
Step 1: Ensure the Smaller Array
First, ensure that nums1
is the smaller array. This simplifies the problem because we perform binary search on the smaller array, which minimizes the time complexity.
if len(nums1) > len(nums2): nums1, nums2 = nums2, nums1
Step 2: Initialize Variables
x
andy
are the lengths ofnums1
andnums2
respectively.low
andhigh
represent the range of indices where we perform the binary search.
x, y = len(nums1), len(nums2) low, high = 0, x
Step 3: Perform Binary Search
- Partitioning: Calculate the partition indices for both arrays. The goal is to partition
nums1
andnums2
such that the left part contains the same number of elements (or one more) than the right part.
partitionX = (low + high) // 2 partitionY = (x + y + 1) // 2 - partitionX
- Boundary Conditions: Handle edge cases where partitions might be at the boundaries of the arrays. Use
float('-inf')
andfloat('inf')
to represent the extremes.
maxLeftX = float('-inf') if partitionX == 0 else nums1[partitionX - 1] minRightX = float('inf') if partitionX == x else nums1[partitionX] maxLeftY = float('-inf') if partitionY == 0 else nums2[partitionY - 1] minRightY = float('inf') if partitionY == y else nums2[partitionY]
- Correct Partition Check: Check if the current partitions are correct. If
maxLeftX <= minRightY
andmaxLeftY <= minRightX
, then we have a valid partition.
if maxLeftX <= minRightY and maxLeftY <= minRightX: if (x + y) % 2 == 0: return (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2 else: return max(maxLeftX, maxLeftY)
- Adjust Partitions: If the partitions are not valid, adjust the
low
andhigh
indices to move the partition in the correct direction.
elif maxLeftX > minRightY: high = partitionX - 1 else: low = partitionX + 1
Example
Consider the following example arrays:
nums1 = [1, 3]
nums2 = [2]
The combined sorted array would be [1, 2, 3]
. The median is 2
, which is straightforward to compute as the middle element.
Code Execution:
nums1 = [1, 3] nums2 = [2] median = findMedianSortedArrays(nums1, nums2) print("Median:", median) # Output: Median: 2.0
Edge Cases
- Arrays of Different Lengths: The algorithm handles arrays of different lengths effectively due to its binary search nature.
- Empty Arrays: The algorithm assumes non-empty arrays. If either array can be empty, handle that case by directly computing the median of the non-empty array.
- All Elements Identical: If all elements are the same across both arrays, the median will be that repeated value.
In this solution, we have:
- Explained the Problem: Understanding the task of finding the median of two sorted arrays.
- Described the Approach: Using binary search on the smaller array to efficiently find the median.
- Provided Detailed Code: Step-by-step code implementation and explanation.
- Discussed Edge Cases: Considered various scenarios and how the algorithm handles them.
This approach ensures an efficient (O(\log(\min(m, n)))) solution by leveraging the properties of binary search and array partitioning, making it suitable for large datasets.