287. Find the Duplicate Number
The Pigeonhole principle

Hi! My name is Nhut Nguyen. I am a software engineer and a writer in Copenhagen, Denmark.
Learn more about me at nhutnguyen.com
Problem statement
Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.
There is only one repeated number in nums, return this repeated number.
You must solve the problem without modifying the array nums and uses only constant extra space.
Example 1
Input: nums = [1,3,4,2,2]
Output: 2
Example 2
Input: nums = [3,1,3,4,2]
Output: 3
Constraints
1 <= n <= 10^5.nums.length == n + 1.1 <= nums[i] <= n.All the integers in
numsappear only once except for precisely one integer which appears two or more times.
Follow up:
How can we prove that at least one duplicate number must exist in
nums?Can you solve the problem in linear runtime complexity?
Solution 1: Sorting
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;
int findDuplicate(vector<int>& nums) {
sort(nums.begin(), nums.end());
for (int i = 0; i < nums.size() - 1; i++) {
if (nums[i] == nums[i + 1]) {
return nums[i];
}
}
return 0;
}
int main() {
vector<int> nums{1,3,4,2,2};
cout << findDuplicate(nums) << endl;
nums = {3,1,3,4,2};
cout << findDuplicate(nums) << endl;
}
Output:
2
3
Complexity
Runtime:
O(NlogN), whereN = nums.length.Extra space:
O(1).
Follow up
How can we prove that at least one duplicate number must exist in nums?
Due to Pigeonhole principle:
Here there are n + 1 pigeons in n holes. The pigeonhole principle says that at least one hole has more than one pigeon.
Can you solve the problem in linear runtime complexity?
Here are a few solutions.
Solution 2: Marking the visited numbers
Code
#include <vector>
#include <iostream>
using namespace std;
int findDuplicate(vector<int>& nums) {
vector<bool> visited(nums.size());
for (int a : nums) {
if (visited[a]) {
return a;
}
visited[a] = true;
}
return 0;
}
int main() {
vector<int> nums{1,3,4,2,2};
cout << findDuplicate(nums) << endl;
nums = {3,1,3,4,2};
cout << findDuplicate(nums) << endl;
}
Output:
2
3
Complexity
Runtime:
O(N), whereN = nums.length.Extra space:
O(logN).std::vector<bool>is optimized for space-efficient.
Solution 3: Marking with std::bitset
Since n <= 10^5, you can use this size for a std::bitset to do the marking.
Code
#include <vector>
#include <iostream>
#include <bitset>
using namespace std;
int findDuplicate(vector<int>& nums) {
bitset<100001> visited;
for (int a : nums) {
if (visited[a]) {
return a;
}
visited[a] = 1;
}
return 0;
}
int main() {
vector<int> nums{1,3,4,2,2};
cout << findDuplicate(nums) << endl;
nums = {3,1,3,4,2};
cout << findDuplicate(nums) << endl;
}
Output:
2
3
Complexity
Runtime:
O(N), whereN = nums.length.Extra space:
O(1).
References
Thanks for reading. Feel free to share your thought about my content and check out my FREE book “10 Classic Coding Challenges”.






