Saturday, February 25, 2017

Count unique words in a string in C++

I want to count how many unique words are in a string 's', where punctuation and the newline character (\n) separate the words. So far I've used the logical OR operator to count how many word separators are in the string, and added 1 to the result to get the number of words in s.

My current code returns 12 as the number of words. Since 'ab', 'AB', 'aB', 'Ab' (and likewise the variants of 'zzzz') are all the same word and not unique, how can I ignore the case variants of a word? I followed this link: http://ift.tt/1ycarDX, but that reference counts unique items in a vector, whereas I am using a string and not a vector.

Here is my code:

#include <iostream>
#include <string>
using namespace std;

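// Returns true if c is one of the characters that separate words.
bool isWordSeparator(char c) {
    return c == ' ' || c == '-' || c == '\n' || c == '?' || c == '.' || c == ','
        || c == '!' || c == ':' || c == ';';
}

// Counts words by counting separators and adding 1.
// Case variants and repeated words are still counted separately.
int countWords(const string& s) {
    int wordCount = 0;

    if (s.empty()) {
        return 0;
    }

    for (size_t x = 0; x < s.length(); x++) {
        if (isWordSeparator(s.at(x))) {
            wordCount++;
        }
    }

    return wordCount + 1;
}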

int main() {
    string s = "ab\nAb!aB?AB:ab.AB;ab\nAB\nZZZZ zzzz Zzzz\nzzzz";
    int number_of_words = countWords(s);

    cout << "Number of Words: " << number_of_words << endl;

    return 0;
}
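
For what it's worth, one possible way to ignore the case variants is to split the string on the same separators, lowercase each word, and collect the words in a std::set, whose size is then the number of unique words. The following is only a sketch under those assumptions: countUniqueWords and isSeparator are hypothetical names that are not part of the code above, and empty pieces between adjacent separators are skipped.

```cpp
#include <cctype>
#include <set>
#include <string>

// Same separator characters as in the question (hypothetical helper name).
bool isSeparator(char c) {
    return std::string(" -\n?.,!:;").find(c) != std::string::npos;
}

// Hypothetical helper: lowercases each word and stores it in a std::set,
// so that 'ab', 'AB', 'aB' and 'Ab' collapse into a single entry.
int countUniqueWords(const std::string& s) {
    std::set<std::string> seen;
    std::string word;

    for (char c : s) {
        if (isSeparator(c)) {
            // End of a word: record it unless it is empty.
            if (!word.empty()) {
                seen.insert(word);
                word.clear();
            }
        } else {
            // Cast to unsigned char first; std::tolower expects that range.
            word += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
    }

    // Record the last word if the string does not end with a separator.
    if (!word.empty()) {
        seen.insert(word);
    }

    return static_cast<int>(seen.size());
}
```

Calling countUniqueWords(s) from main with the string from the question should return 2, since only 'ab' and 'zzzz' remain after lowercasing.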
