Saturday, 29 August 2015

Boost Spirit: Sub-grammar appending to string?

I am toying with Boost.Spirit. As part of a larger work I am trying to construct a grammar for parsing C/C++ style string literals. I encountered a problem:

How do I create a sub-grammar that appends a std::string() result to the calling grammar's std::string() attribute (instead of just a char)?

Here is my code, which works so far. (I actually have much more than this, including handling for escapes like '\n', but I cut it down to the essentials.)

#define BOOST_SPIRIT_UNICODE

#include <iostream>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>

using namespace boost;
using namespace boost::spirit;
using namespace boost::spirit::qi;

template < typename Iterator >
struct EscapedUnicode : grammar< Iterator, char() > // <-- should be std::string
{
    EscapedUnicode() : EscapedUnicode::base_type( escaped_unicode )
    {
        escaped_unicode %= "\\" > ( ( "u" >> uint_parser< char, 16, 4, 4 >() )
                                  | ( "U" >> uint_parser< char, 16, 8, 8 >() ) );
    }

    rule< Iterator, char() > escaped_unicode;  // <-- should be std::string
};

template < typename Iterator >
struct QuotedString : grammar< Iterator, std::string() >
{
    QuotedString() : QuotedString::base_type( quoted_string )
    {
        quoted_string %= '"' >> *( escaped_unicode | ( char_ - ( '"' | eol ) ) ) >> '"';
    }

    EscapedUnicode< Iterator > escaped_unicode;
    rule< Iterator, std::string() > quoted_string;
};

int main()
{
    // Backslash doubled so the escape reaches the parser instead of being resolved by the compiler.
    std::string input = "\"foo\\u0041\"";
    typedef std::string::const_iterator iterator_type;
    QuotedString< iterator_type > qs;
    std::string result;
    bool r = parse( input.cbegin(), input.cend(), qs, result );
    std::cout << result << std::endl;
}

This prints fooA -- the QuotedString grammar calls the EscapedUnicode grammar, which results in a char being added to the std::string attribute of QuotedString (the A, 0x41).

But of course I would need to generate a sequence of chars (bytes) for anything beyond 0x7f. EscapedUnicode would need to produce a std::string, which would then have to be appended to the string generated by QuotedString.

And that is where I have hit a roadblock. I do not understand how Boost.Spirit works in concert with Boost.Phoenix, and every attempt I have made has resulted in lengthy and pretty much undecipherable template-related compiler errors.

So, how can I do this? The answer need not actually do the proper Unicode conversion; it's the std::string issue I need a solution for.
