Sunday, 13 August 2017

Using the STL to run-length encode a string with std::adjacent_find

I am trying to perform run-length compression on a string for a special protocol that I am using. A run is considered efficient to encode when the run length of a particular character in the string is >= 3. Can someone help me achieve this? I am fairly sure it is possible with the standard library's std::adjacent_find, using std::not_equal_to<> as the binary predicate to search for run boundaries, and probably std::equal_to<> once I find a boundary. Here is what I have so far, but I am having trouble with the details.

Given the following input text string containing runs of spaces and other characters (in this case, runs of the letter 's'):

"   thisssss   is a   test  "

If I can convert this text into a vector of the following text elements, I can easily run length encode the results:

"   ", "thi", "sssss", "   ", "is", " ", "a", "   ", "test", "  "

The code I have is as follows. I am having problems in the inner loop where I am trying to :

#include <algorithm>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    // I want to convert this string containing adjacent runs of characters
    std::string testString("   thisssss   is a   test  ");

    // to the following 
    std::vector<std::string> idealResults = {
        "   ", "thi", "sssss",
        "   ", "is", " ", "a",
        "   ", "test", "  "
    };

    std::vector<std::string> tokenizedStrings;
    auto adjIter = testString.begin();
    auto lastIter = adjIter;
    while ((adjIter = std::adjacent_find(adjIter, testString.end(), std::not_equal_to<>())) != testString.end()) {
        tokenizedStrings.push_back(std::string(lastIter, adjIter + 1));
        for (auto next = std::next(adjIter); next != testString.end() &&
            std::not_equal_to<>()(*adjIter, *next); ++next) {
            // Need help here to 
        }
        // lastIter points to the end of the run
        lastIter = adjIter + 1;
    }
}
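
One possible way to get the tokenization listed above is sketched below. It is not a definitive answer. The splitting rule is inferred from the ideal results and is an assumption: a run of spaces is always its own token, a run of any other character is its own token only when it is at least 3 long, and shorter runs are merged into the preceding literal. The function name tokenizeRuns and the minRun parameter are hypothetical names chosen for illustration. std::not_equal_to<> needs C++14 or later.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into run tokens and literal tokens.
// Assumed rule (inferred from the ideal results, not stated by the protocol):
//   - a run of spaces is always a token of its own;
//   - a run of another character is a token of its own when its length >= minRun;
//   - any shorter run is appended to the literal token before it, if there is one.
std::vector<std::string> tokenizeRuns(const std::string& text, std::size_t minRun = 3)
{
    std::vector<std::string> tokens;
    bool lastIsLiteral = false; // true while the last token can still absorb short runs

    auto runBegin = text.begin();
    while (runBegin != text.end()) {
        // With not_equal_to, adjacent_find returns the first element that differs
        // from its successor, i.e. the last character of the current run, or
        // end() when the run extends to the end of the string.
        auto boundary = std::adjacent_find(runBegin, text.end(), std::not_equal_to<>());
        auto runEnd = (boundary == text.end()) ? boundary : std::next(boundary);

        const auto runLength = static_cast<std::size_t>(std::distance(runBegin, runEnd));
        const bool standalone = (*runBegin == ' ') || (runLength >= minRun);

        if (!standalone && lastIsLiteral) {
            tokens.back().append(runBegin, runEnd); // extend the current literal
        } else {
            tokens.emplace_back(runBegin, runEnd);  // start a new token
        }
        lastIsLiteral = !standalone;
        runBegin = runEnd; // continue right after this run
    }
    return tokens;
}
```

The key difference from the loop in the question is that the search always restarts one past the boundary that adjacent_find reports, so every iteration consumes exactly one run and the loop cannot find the same boundary twice. Calling tokenizeRuns(testString) and comparing the result with idealResults is a quick way to check whether the assumed rule matches what the protocol needs.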
