Tutorial: Remove Duplicates from Sorted Array - Python Solution

In this tutorial, we will delve into the “Remove Duplicates from Sorted Array” problem, a common interview question that tests your ability to manipulate arrays and pointers efficiently. We will go through the problem statement, break down the required steps, and provide a Python solution along with a detailed explanation. By the end of this tutorial, you will have a solid understanding of the problem and how to solve it effectively.

Problem Statement

Given a sorted array, your task is to remove the duplicate elements in-place such that each element appears only once, and the relative order of the elements remains unchanged. The function should return the new length of the modified array.

Here’s the formal problem statement:

Input: A sorted array of integers.

Output: The length of the modified array with duplicates removed.

Example:

Input: nums = [1, 1, 2, 2, 3, 4, 4, 4, 5]
Output: 5
Modified Array: [1, 2, 3, 4, 5]

Approach

To solve this problem, we can utilize the property of a sorted array. Since the array is sorted, duplicate elements will be adjacent to each other. Our goal is to iterate through the array, compare each element with the next element, and remove duplicates in-place.

Here’s a step-by-step approach:

Initialize a pointer slow at the beginning of the array and another pointer fast at slow + 1.
Iterate while fast is within the bounds of the array:

If nums[fast] is equal to nums[slow], increment fast to skip duplicates.
If nums[fast] is different from nums[slow], increment slow and assign nums[fast] to nums[slow].

Return slow + 1 as the length of the modified array (since indexing is 0-based).

Python Solution

Let’s implement the approach in Python code:

def removeDuplicates(nums):
    if not nums:
        return 0  # Empty array case

    slow = 0

    for fast in range(1, len(nums)):
        if nums[fast] != nums[slow]:
            slow += 1
            nums[slow] = nums[fast]

    return slow + 1

Explanation

We begin by checking if the input array nums is empty. If it is, we return 0, as the modified array will also be empty.
We initialize a pointer slow to keep track of the position where the next unique element should be placed.
We start a loop using a pointer fast, which iterates over the array starting from index 1. This pointer helps us traverse the array while skipping duplicates.
Inside the loop, we compare the element at nums[fast] with the element at nums[slow]. If they are equal, we’ve encountered a duplicate. In this case, we increment fast to move to the next element, effectively skipping the duplicates.
If the elements are not equal, we’ve found a new unique element. We increment the slow pointer to move to the next position where the unique element should be placed. Then, we assign the value of nums[fast] to nums[slow], effectively overwriting the duplicate element at slow with the new unique element at fast.
The loop continues until fast reaches the end of the array. During this process, duplicates are skipped, and only unique elements are copied to the modified portion of the array.
Finally, we return slow + 1 as the length of the modified array. Since slow represents the index of the last unique element, adding 1 gives us the count of unique elements.

Complexity Analysis

Time Complexity: The algorithm iterates through the input array exactly once, so the time complexity is O(n), where n is the number of elements in the array.
Space Complexity: The algorithm operates in-place, so the space complexity is O(1) as we are not using any additional data structures that depend on the input size.

Testing the Solution

Let’s test the solution with some example cases and custom cases:

# Example case
nums = [1, 1, 2, 2, 3, 4, 4, 4, 5]
print("Original Array:", nums)
new_length = removeDuplicates(nums)
print("Modified Array:", nums[:new_length])
print("New Length:", new_length)

# Custom cases
test_cases = [
    [1, 1, 2, 2, 2, 3, 4, 5, 5, 6, 6, 6, 6],
    [10, 20, 30, 40, 50],
    [1, 1, 1, 1, 1, 1],
    [],
]
for case in test_cases:
    print("Original Array:", case)
    new_length = removeDuplicates(case)
    print("Modified Array:", case[:new_length])
    print("New Length:", new_length)

Conclusion

In this tutorial, you have learned how to solve the “Remove Duplicates from Sorted Array” problem efficiently using a two-pointer approach. By carefully comparing elements and overwriting duplicates in-place, you can achieve a linear time complexity solution. This problem is a great example of how sorting and manipulating arrays can be optimized with the help of pointers, making it a valuable technique to have in your coding arsenal.

Tutorial: Remove Duplicates from Sorted Array – Python Solution