CTCI in Python: Implement an algorithm to determine if a string has all unique characters.
What if you cannot use additional data structures?
This is the first question of the famous book Cracking the Coding Interview.
The book offers only a Java solution.
Here are my implementation in Python assuming the string is in ASCII (since it is ASCII the alphabet has 128 characters)
Solution 1 — First using a Set
We are evaluating each character of the string. If we already visited the character we stop otherwise we add it to a set and continue
def unique_with_set(str_input):
if len(str_input) > 128:
return False
data = set()
for c in str_input:
if c in data:
return False
else:
data.add(c)
return True
complexity: adding or getting in a set (or a dictionary) has O(1) complexity in average and O(n) in worse case
Solution 2 — Without using extra structure
We use each bit of “data” to save the visited characters. This is a close implementation of the method used in the solution of the book.
def unique_no_struct(str_input):
if len(str_input) > 128:
return False
data = 0
for c in str_input:
offset_tmp = ord(c) - ord('a')
mask = 1 << offset_tmp
if mask & data:
return False
else:
data = data | mask
return True
Complexity of len() is O(1) (function of an object)
data is an int, in python it’s hold on 24 bytes and you can check it:
Solution 3 — python style
we are adding each character of the string in a set (so removing duplicate) and comparing the size of the original string
def unique_3(s):
return len(set(s)) == len(s)
complexity is O(n).
Here is the benchmark
The last implementation is the fastest.
Conclusion
Python has some optimization in its build in methods which change totally the approach of implementation.
Please add a comment for more pythonic/optimize solution