Problem
This problem was asked by Facebook.
Given an array of strings (all lowercase letters), the task is to group them so that all strings in a group are shifted versions of each other. Two strings S and T are called shifted if,
S.length = T.length
and
S[i] = T[i] + K for
1 <= i <= S.length for a constant integer K
For example, strings, {acd, dfg, wyz, yab, mop} are shifted versions of each other.
Join 150K+ tech leaders and get insights on the most important ideas in AI straight to your inbox through my free newsletter- AI Made Simple
Step 1: Understanding the Ask
Problems such as this can be very confusing at first. This is because it’s very easy to view them as comparison problems (where we have to compare two or more strings with each other). This is more of a grouping problem. We want to group shifted strings together. This leads to a natural problem. How do we know what the strings are shifted by?
An easy is to loop through the string and look at the difference b/w consecutive characters. Each difference can be appened to a string. For example, the string “fgd” would have the corresponding string, “21”. We’ll call this string the difference string. When implementing this, remember to consider the case where we have a wraparound. Would the difference for “za” be 1 or -25 (or something else entirely)? Whatever you choose, make sure to spell it out clearly to show your ability to consider the edge cases.
We have now reframed this as: group the sets by the difference string. Assuming that we could create a difference string for any string input, is there a data structure that can group them by their respective difference string? Furthermore, we want a data structure that will be able to have quick retrieval, since we want to look at the difference string multiple times.
Step 2: Enter the HashSet
Anytime your problem involves a lot of looking, and grouping by a unique key, think of a hashset. They have constant lookup, so it’s cheap to see if a key exists or to modify the values associated with a key.
So what would the hashset look like? The key will be the difference string. The value will be a list of all the strings with that difference string. See how close that gets us to the solution? Our solution will conceptually look like this:
groups=dict()
for str in strings:
diff=diffString(str) #to be implemented later
if diff in groups:
groups[diff].append(str)
else:
groups[diff]=[str]
return groups
Now the final piece of the puzzle is implementing the diffString protocol. How do we do that?
Step 3: Diff String
As a general rule, you want to use helper functions when you can. The benefits are many:
- They let you put off implementing harder/different things till later while solving the problem. This is crucial to smoothly interviewing/coding.
- Shows your ability to organize and be methodical in your answers
- Makes your solutions much neater.
What should our diffString solution look like? We know it takes a string as input, and returns another one. Since we have to consider the difference between consecutive pairs of chars, we loop through the string. Putting things together it looks like:
def getDiffString(s):
shift=""
for i in range(1, len(s)): # we start at 1 not 0
dif = (ord(s[i]) -
ord(s[i - 1]))
if(dif < 0):
dif += 26 #"za" gives 1
# Representing the difference
# as char
shift += str(dif) #Remember to cast your difference to strign
# This string will be 1 less
# length than str
return shift
Step 4: Combining it all
To finally combine the solution we get:
def getDiffString(s):
shift=""
for i in range(1, len(s)): # we start at 1 not 0
dif = (ord(s[i]) -
ord(s[i - 1]))
if(dif < 0):
dif += 26 #"za" gives 1
# Representing the difference
# as char
shift += str(dif) #Remember to cast your difference to strign
# This string will be 1 less
# length than str
return shift
def groupShiftedString(str,n):
groupMap = dict()
for i in range(n):
diffStr = getDiffString(str[i])
if diffStr not in groupMap:
groupMap[diffStr] = [i]
else:
groupMap[diffStr].append(i)
return groupMap
What are the time and space complexities of this solution:
Time
We loop through the list of input strings. Additionally each string is looped through. If the length of the list is n and the average length of the strings in the list is m, we have the complexity O(n*m). Since lookup and insertion is constant time in hashets, they don’t add too much more.
Space
We store the list of input strings. Additionally each string has a diff string that is stored as a key. If the length of the list is n and the average number of the diff-strings in the list is m, we have the complexity O(n+m).
Final: Time- O(n*m) and Space- O(n+m)
Save the time, energy, and money you would burn by going through all those videos, courses, products, and ‘coaches’ and easily find all your needs met in one place at ‘Tech Made Simple’! Stay ahead of the curve in AI, software engineering, and the tech industry with expert insights, tips, and resources. 20% off for new subscribers by clicking this link. Subscribe now and simplify your tech journey!
Using this discount will drop the prices-
800 INR (10 USD) → 640 INR (8 USD) per Month
8000 INR (100 USD) → 6400INR (80 USD) per year (533 INR /month)
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
AI Newsletter- https://artificialintelligencemadesimple.substack.com/
My grandma’s favorite Tech Newsletter- https://codinginterviewsmadesimple.substack.com/
Check out my other articles on Medium. : https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819