Repeated DNA Sequences - Medium

less than 1 minute read

Published:

Question

  • https://leetcode.com/problems/repeated-dna-sequences/

The DNA sequence is composed of a series of nucleotides abbreviated as ‘A’, ‘C’, ‘G’, and ‘T’.

For example, “ACGAATTCCG” is a DNA sequence. When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any

Approach

  • First thought, store the 10-length strings as maps, return those for which the count > 1.

Solution

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        int n = s.size();
        map<string, int> m;
        vector<string> ans;
        if (n<=10) {
            return ans;
        }
        string t = "";
        for(int i=0; i<10; i++) {
           t = t + s[i]; 
        }
        m.insert(make_pair(t, 1));
        for(int i=10; i<n; i++) {
            t.erase(t.begin());
            t = t + s[i];
            if (m.find(t)==m.end()) {
                m.insert(make_pair(t, 1));
            } else if (m[t]==1) {
                m[t]=2;
                ans.push_back(t);
            }
        }
        return ans;
    }
};