azalea says

boost xpressive named captures in nested regexes

I need to port a lot of regular expressions from Python to C++. Boost.Xpressive supports named captures, so I decided to give it a try.

Boost.Xpressive has static regexes, which look like the following:


#include <iostream>
#include <string>
#include <boost/xpressive/xpressive.hpp>
using namespace boost::xpressive;

int main()
{
    std::string str = "Hello123";
    smatch what;

    mark_tag word(1), number(2);
    sregex ex = (word=+alpha) >> (number=+_d);

    if(regex_search(str, what, ex))
    {
        std::cout << what[0] << std::endl;
        std::cout << what.size() << std::endl;
        std::cout << what[word] << std::endl;
        std::cout << what[number] << std::endl;

    }
    return 0;
}

The program outputs:

Hello123
3
Hello
123

Static named captures are created with mark_tag word(1) and accessed with what[word], where what is a match_results<> object (smatch is the typedef for matches over std::string).

The code above works as expected. However, the result changes when I split the pattern into nested regexes:


#include <iostream>
#include <string>
#include <boost/xpressive/xpressive.hpp>
using namespace boost::xpressive;

int main()
{
    std::string str = "Hello123";
    smatch what;

    mark_tag word(1), number(2);
    sregex WORD = (word=+alpha);
    sregex NUMBER = (number=+_d);
    
    sregex ex = WORD >> NUMBER;

    if(regex_search(str, what, ex))
    {
        std::cout << what[0] << std::endl;
        std::cout << what.size() << std::endl;
        std::cout << what[word] << std::endl;
        std::cout << what[number] << std::endl;
    }

    return 0;
}

The program now outputs:

Hello123
1


Note that what.size() is now 1, and the named captures in the nested regexes can no longer be accessed through what[word] or what[number]: both print empty lines.

The reason is that “each invocation of a nested regex gets its own scope” (quote).

So far, I have not figured out a way to access the named captures in nested regexes. Any suggestions?

This discussion seems relevant, but I don’t understand the solution.
