Quick and robust C++ CSV reader with boost

This is quick and simple CSV reader based on Boost regular expression token iterator. Parser splits the input with a regular expressions and returns the result as a collection of vectors of strings.
Regular expression handles neatly lot of the complicated edge cases such as empty columns, quoted text, etc..

Parser code

#include <boost/regex.hpp>

// used to split the file in lines
const boost::regex linesregx("\\r\\n|\\n\\r|\\n|\\r");

// used to split each line to tokens, assuming ',' as column separator
const boost::regex fieldsregx(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

typedef std::vector<std::string> Row;

std::vector<Row> parse(const char* data, unsigned int length)
    std::vector<Row> result;

    // iterator splits data to lines
    boost::cregex_token_iterator li(data, data + length, linesregx, -1);
    boost::cregex_token_iterator end;

    while (li != end) {
        std::string line = li->str();

        // Split line to tokens
        boost::sregex_token_iterator ti(line.begin(), line.end(), fieldsregx, -1);
        boost::sregex_token_iterator end2;

        std::vector<std::string> row;
        while (ti != end2) {
            std::string token = ti->str();
        if (line.back() == ',') {
            // last character was a separator
    return result;


CSV data with common problem cases, such as empty quotes, commas inside quotes and empty last column.

3,a b,5
7,x,long story no commas
8,y,"some, commas, here,"

Read and parse the CSV data above and output the parsed result

int main(int argc, char*argv[])
	// read example file
	std::ifstream infile;
	char buffer[1024];
	infile.read(buffer, sizeof(buffer));
	buffer[infile.tellg()] = '\0';

	// parse file
	std::vector<Row> result  = parse(buffer, strlen(buffer));

	// print out result
	for(size_t r=0; r < result.size(); r++) {
		Row& row = result[r];
		for(size_t c=0; c < row.size() - 1; c++) {
			std::cout << row[c] << "\t";
		std::cout << row.back() << std::endl;


$ ./reader
a      	b      	c
1      	"cat"  	3
",2"   	dog    	4
3      	a b    	5
4      	empty
5      	        empty
6      	""      empty2
7      	x      	long story no commas
8      	y      	"some, commas, here,"

See full example code in Github: https://github.com/tikonen/blog/tree/master/boostcsvreader