30. Substring with Concatenation of All Words
Description
You are given a string s and an array of strings words. All the strings of words are of the same length.
A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.
- For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated string because it is not the concatenation of any permutation of words.
Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.
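The definition above can be checked directly. The following is a minimal brute-force sketch, not the solution presented later; the helper names `is_concatenated` and `brute_force` are hypothetical. It splits a candidate substring into chunks of the word length and compares the multiset of chunks with the multiset of words.

```python
from collections import Counter
from typing import List


def is_concatenated(sub: str, words: List[str]) -> bool:
    """Return True if sub is a concatenation of some permutation of words."""
    k = len(words[0])  # all words share the same length
    if len(sub) != k * len(words):
        return False
    # Split sub into consecutive chunks of length k and compare multisets.
    chunks = [sub[i : i + k] for i in range(0, len(sub), k)]
    return Counter(chunks) == Counter(words)


def brute_force(s: str, words: List[str]) -> List[int]:
    """Check every start index; simple, but slower than a sliding window."""
    total = len(words) * len(words[0])
    return [
        i
        for i in range(len(s) - total + 1)
        if is_concatenated(s[i : i + total], words)
    ]
```

A checker like this is mainly useful as a reference for testing a faster solution against small inputs.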
Example 1:
Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation:
The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.
Example 2:
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation:
There is no concatenated substring.
Example 3:
Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation:
The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].
Constraints:
1 <= s.length <= 10^4
1 <= words.length <= 5000
1 <= words[i].length <= 30
s and words[i] consist of lowercase English letters.
Solutions
Solution 1: Hash Table + Sliding Window
We use a hash table $cnt$ to count the number of times each word appears in $words$, and use a hash table $cnt1$ to count the number of times each word appears in the current sliding window. We denote the length of the string $s$ as $m$, the number of words in the string array $words$ as $n$, and the length of each word as $k$.
We enumerate the starting point $i$ of the sliding window, where $0 \le i < k$. For each starting point, we maintain a sliding window with left boundary $l$ and right boundary $r$, both initialized to $i$ and always moved in steps of $k$, and we track the number of words in the window as $t$. The hash table $cnt1$ starts empty for each starting point.
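To see why $k$ starting points are enough, note that each starting point fixes one alignment of word boundaries, and every index of $s$ belongs to exactly one alignment. The sketch below prints the word-sized chunks seen at each alignment for Example 1; the helper name `chunks_at_offset` is hypothetical and is not part of the solution.

```python
from typing import List


def chunks_at_offset(s: str, k: int, i: int) -> List[str]:
    """Split s into consecutive length-k chunks, starting at index i."""
    return [s[r : r + k] for r in range(i, len(s) - k + 1, k)]


s, k = "barfoothefoobarman", 3
for i in range(k):
    print(i, chunks_at_offset(s, k, i))
# 0 ['bar', 'foo', 'the', 'foo', 'bar', 'man']
# 1 ['arf', 'oot', 'hef', 'oob', 'arm']
# 2 ['rfo', 'oth', 'efo', 'oba', 'rma']
```

Only the alignment at offset 0 produces chunks that occur in words here, which is why both answers for Example 1 (0 and 9) are multiples of 3.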
Each time, we extract the word $s[r:r+k]$ and then advance $r$ by $k$. If the word is not in the hash table $cnt$, no valid window can contain it, so we move the left boundary $l$ to the new $r$, clear the hash table $cnt1$, and reset the word count $t$ to $0$.
Otherwise, the word is a candidate: we increase the word count $t$ by $1$ and increase the word's count in $cnt1$ by $1$. If its count in $cnt1$ now exceeds its count in $cnt$, the word appears too many times in the current window, so we move the left boundary $l$ to the right one word at a time, decreasing the count of each removed word in $cnt1$ and decreasing $t$ by $1$, until the two counts are equal.
If $t = n$, then no word in the window exceeds its required count and the totals match, so the window consists of exactly the words in $words$, and we add the left boundary $l$ to the answer array. Because every word has length $k$, the window always holds $(r - l) / k$ words, so this condition is equivalent to $r - l = n \times k$.
The time complexity is $O(m \times k)$, and the space complexity is $O(n \times k)$. Here, $m$ and $n$ are the lengths of the string $s$ and the string array $words$ respectively, and $k$ is the length of the words in the string array $words$.
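The time bound can be read as a rough product of three factors, assuming that extracting and hashing one word of length $k$ costs $O(k)$:

```latex
\underbrace{k}_{\text{starting points}}
\times
\underbrace{O\!\left(\frac{m}{k}\right)}_{\text{words per starting point}}
\times
\underbrace{O(k)}_{\text{cost per word}}
= O(m \times k)
```

For a fixed starting point, each word enters the window once and leaves it at most once, so moving the left boundary adds only $O(m / k)$ further word operations and does not change the bound. The space bound follows because $cnt$ and $cnt1$ each hold at most $n$ distinct words of length $k$.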
Python3
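from collections import Counter
from typing import List


class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        cnt = Counter(words)  # required count of each word
        m, n = len(s), len(words)
        k = len(words[0])
        ans = []
        # Try each of the k possible alignments of word boundaries.
        for i in range(k):
            l = r = i
            cnt1 = Counter()  # word counts inside the window [l, r)
            while r + k <= m:
                t = s[r : r + k]
                r += k
                if cnt[t] == 0:
                    # t is not in words: restart the window after it.
                    l = r
                    cnt1.clear()
                    continue
                cnt1[t] += 1
                # Shrink from the left while t appears too many times.
                while cnt1[t] > cnt[t]:
                    rem = s[l : l + k]
                    l += k
                    cnt1[rem] -= 1
                # The window holds exactly n words, so it is a match.
                if r - l == n * k:
                    ans.append(l)
        return ans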
Java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        Map<String, Integer> cnt = new HashMap<>();
        for (var w : words) {
            cnt.merge(w, 1, Integer::sum);
        }
        List<Integer> ans = new ArrayList<>();
        int m = s.length(), n = words.length, k = words[0].length();
        for (int i = 0; i < k; ++i) {
            int l = i, r = i;
            Map<String, Integer> cnt1 = new HashMap<>();
            while (r + k <= m) {
                var t = s.substring(r, r + k);
                r += k;
                if (!cnt.containsKey(t)) {
                    cnt1.clear();
                    l = r;
                    continue;
                }
                cnt1.merge(t, 1, Integer::sum);
                while (cnt1.get(t) > cnt.get(t)) {
                    String w = s.substring(l, l + k);
                    if (cnt1.merge(w, -1, Integer::sum) == 0) {
                        cnt1.remove(w);
                    }
                    l += k;
                }
                if (r - l == n * k) {
                    ans.add(l);
                }
            }
        }
        return ans;
    }
}
C++
#include <string>
#include <unordered_map>
#include <vector>

using namespace std;

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        unordered_map<string, int> cnt;
        for (const auto& w : words) {
            cnt[w]++;
        }
        vector<int> ans;
        int m = s.length(), n = words.size(), k = words[0].length();
        for (int i = 0; i < k; ++i) {
            int l = i, r = i;
            unordered_map<string, int> cnt1;
            while (r + k <= m) {
                string t = s.substr(r, k);
                r += k;
                // unordered_map::contains requires C++20.
                if (!cnt.contains(t)) {
                    cnt1.clear();
                    l = r;
                    continue;
                }
                cnt1[t]++;
                while (cnt1[t] > cnt[t]) {
                    string w = s.substr(l, k);
                    if (--cnt1[w] == 0) {
                        cnt1.erase(w);
                    }
                    l += k;
                }
                if (r - l == n * k) {
                    ans.push_back(l);
                }
            }
        }
        return ans;
    }
};
Go
func findSubstring(s string, words []string) (ans []int) {
	cnt := make(map[string]int)
	for _, w := range words {
		cnt[w]++
	}
	m, n, k := len(s), len(words), len(words[0])
	for i := 0; i < k; i++ {
		l, r := i, i
		cnt1 := make(map[string]int)
		for r+k <= m {
			t := s[r : r+k]
			r += k
			if _, exists := cnt[t]; !exists {
				cnt1 = make(map[string]int)
				l = r
				continue
			}
			cnt1[t]++
			for cnt1[t] > cnt[t] {
				w := s[l : l+k]
				cnt1[w]--
				if cnt1[w] == 0 {
					delete(cnt1, w)
				}
				l += k
			}
			if r-l == n*k {
				ans = append(ans, l)
			}
		}
	}
	return
}
TypeScript
function findSubstring(s: string, words: string[]): number[] {
    const cnt: Map<string, number> = new Map();
    for (const w of words) {
        cnt.set(w, (cnt.get(w) || 0) + 1);
    }
    const ans: number[] = [];
    const [m, n, k] = [s.length, words.length, words[0].length];
    for (let i = 0; i < k; i++) {
        let [l, r] = [i, i];
        const cnt1: Map<string, number> = new Map();
        while (r + k <= m) {
            const t = s.substring(r, r + k);
            r += k;
            if (!cnt.has(t)) {
                cnt1.clear();
                l = r;
                continue;
            }
            cnt1.set(t, (cnt1.get(t) || 0) + 1);
            while (cnt1.get(t)! > cnt.get(t)!) {
                const w = s.substring(l, l + k);
                cnt1.set(w, cnt1.get(w)! - 1);
                if (cnt1.get(w) === 0) {
                    cnt1.delete(w);
                }
                l += k;
            }
            if (r - l === n * k) {
                ans.push(l);
            }
        }
    }
    return ans;
}
C#
using System.Collections.Generic;

public class Solution {
    public IList<int> FindSubstring(string s, string[] words) {
        var cnt = new Dictionary<string, int>();
        foreach (var w in words) {
            if (cnt.ContainsKey(w)) {
                cnt[w]++;
            } else {
                cnt[w] = 1;
            }
        }
        var ans = new List<int>();
        int m = s.Length, n = words.Length, k = words[0].Length;
        for (int i = 0; i < k; ++i) {
            int l = i, r = i;
            var cnt1 = new Dictionary<string, int>();
            while (r + k <= m) {
                var t = s.Substring(r, k);
                r += k;
                if (!cnt.ContainsKey(t)) {
                    cnt1.Clear();
                    l = r;
                    continue;
                }
                if (cnt1.ContainsKey(t)) {
                    cnt1[t]++;
                } else {
                    cnt1[t] = 1;
                }
                while (cnt1[t] > cnt[t]) {
                    var w = s.Substring(l, k);
                    cnt1[w]--;
                    if (cnt1[w] == 0) {
                        cnt1.Remove(w);
                    }
                    l += k;
                }
                if (r - l == n * k) {
                    ans.Add(l);
                }
            }
        }
        return ans;
    }
}
PHP
class Solution {
    /**
     * @param String $s
     * @param String[] $words
     * @return Integer[]
     */
    function findSubstring($s, $words) {
        $cnt = [];
        foreach ($words as $w) {
            if (isset($cnt[$w])) {
                $cnt[$w]++;
            } else {
                $cnt[$w] = 1;
            }
        }
        $ans = [];
        $m = strlen($s);
        $n = count($words);
        $k = strlen($words[0]);
        for ($i = 0; $i < $k; $i++) {
            $l = $i;
            $r = $i;
            $cnt1 = [];
            while ($r + $k <= $m) {
                $t = substr($s, $r, $k);
                $r += $k;
                if (!isset($cnt[$t])) {
                    $cnt1 = [];
                    $l = $r;
                    continue;
                }
                if (isset($cnt1[$t])) {
                    $cnt1[$t]++;
                } else {
                    $cnt1[$t] = 1;
                }
                while ($cnt1[$t] > $cnt[$t]) {
                    $w = substr($s, $l, $k);
                    $cnt1[$w]--;
                    if ($cnt1[$w] == 0) {
                        unset($cnt1[$w]);
                    }
                    $l += $k;
                }
                if ($r - $l == $n * $k) {
                    $ans[] = $l;
                }
            }
        }
        return $ans;
    }
}