csv¶
For CSV support, include the header <rfl/csv.hpp> and link to the Apache Arrow library. Furthermore, when compiling reflect-cpp, you need to pass -DREFLECTCPP_CSV=ON to cmake.
CSV is a tabular text format. Like other tabular formats in reflect-cpp, CSV is designed for collections of flat records and has limitations for nested or variant types.
Reading and writing¶
Suppose you have a struct like this:
struct Person {
std::string first_name;
std::string last_name = "Simpson";
rfl::Timestamp<"%Y-%m-%d"> birthday;
unsigned int age;
rfl::Email email;
};
Important: CSV is a tabular format that requires collections of records. You cannot serialize individual structs - you must use containers like std::vector<Person>, std::deque<Person>, etc.
Write a collection to a string (CSV bytes) like this:
const auto people = std::vector<Person>{
Person{.first_name = "Bart", .birthday = "1987-04-19", .age = 10, .email = "bart@simpson.com"},
Person{.first_name = "Lisa", .birthday = "1987-04-19", .age = 8, .email = "lisa@simpson.com"}
};
const std::string csv_text = rfl::csv::write(people);
Parse from a string or bytes view:
const rfl::Result<std::vector<Person>> result = rfl::csv::read<std::vector<Person>>(csv_text);
Settings¶
CSV behavior can be configured using rfl::csv::Settings:
const auto settings = rfl::csv::Settings{}
.with_delimiter(';')
.with_quoting(true)
.with_quote_char('"')
.with_null_string("n/a")
.with_double_quote(true)
.with_escaping(false)
.with_escape_char('\\')
.with_newlines_in_values(false)
.with_ignore_empty_lines(true)
.with_batch_size(1024);
const std::string csv_text = rfl::csv::write(people, settings);
Key options:
- batch_size: Maximum number of rows processed per batch (performance tuning)
- delimiter: Field delimiter character
- quoting: Whether to use quoting when writing
- quote_char: Quote character used when reading
- null_string: String representation for null values
- double_quote: Whether a quote inside a value is double-quoted (reading)
- escaping: Whether escaping is used (reading)
- escape_char: Escape character (reading)
- newlines_in_values: Whether CR/LF are allowed inside values (reading)
- ignore_empty_lines: Whether empty lines are ignored (reading)
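To make the quoting-related options above more concrete, here is a minimal sketch of how a single field is conventionally rendered when quoting and double_quote are enabled. The helper quote_field is hypothetical and not part of reflect-cpp. The real output is produced by Apache Arrow, which may quote values in cases this sketch does not.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not a reflect-cpp or Arrow function): renders one CSV
// field using the common "quote only when needed" convention.
std::string quote_field(std::string_view value, char delimiter = ',',
                        char quote_char = '"') {
  // The delimiter and the quote character must differ, otherwise the
  // output would be ambiguous.
  assert(delimiter != quote_char);

  // A field needs quoting if it contains the delimiter, the quote
  // character, or a line break.
  const bool needs_quotes =
      value.find(delimiter) != std::string_view::npos ||
      value.find(quote_char) != std::string_view::npos ||
      value.find_first_of("\r\n") != std::string_view::npos;
  if (!needs_quotes) {
    return std::string(value);
  }

  std::string out;
  out.push_back(quote_char);
  for (const char c : value) {
    if (c == quote_char) {
      out.push_back(quote_char);  // double_quote: a quote is written twice
    }
    out.push_back(c);
  }
  out.push_back(quote_char);
  return out;
}
```

With ';' as the delimiter, a value such as Simpson; Bart has to be wrapped in quotes, whereas with ',' it can be written unchanged.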
Loading and saving¶
You can load from and save to disk:
const rfl::Result<std::vector<Person>> result = rfl::csv::load<std::vector<Person>>("/path/to/file.csv");
const auto people = std::vector<Person>{...};
rfl::csv::save("/path/to/file.csv", people);
With custom settings:
const auto settings = rfl::csv::Settings{}.with_delimiter(';');
rfl::csv::save("/path/to/file.csv", people, settings);
Reading from and writing into streams¶
You can read from any std::istream and write to any std::ostream:
const rfl::Result<std::vector<Person>> result = rfl::csv::read<std::vector<Person>>(my_istream);
const auto people = std::vector<Person>{...};
rfl::csv::write(people, my_ostream);
With custom settings:
const auto settings = rfl::csv::Settings{}.with_delimiter(';');
rfl::csv::write(people, my_ostream, settings);
Field name transformations¶
Like other formats, CSV supports field name transformations via processors, e.g. SnakeCaseToCamelCase:
const auto people = std::vector<Person>{...};
const auto result = rfl::csv::read<std::vector<Person>, rfl::SnakeCaseToCamelCase>(csv_text);
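As an illustration of the renaming such a processor implies, the sketch below converts a snake_case field name into the camelCase column name it would correspond to. The function snake_to_camel is a hypothetical stand-in written for this example. It is not the library's implementation and only covers plain ASCII names.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical stand-in for the snake_case to camelCase mapping applied to
// column names; not taken from reflect-cpp.
std::string snake_to_camel(std::string_view name) {
  std::string out;
  out.reserve(name.size());
  bool upper_next = false;
  for (const char c : name) {
    if (c == '_') {
      upper_next = true;  // drop the underscore, capitalize what follows
    } else if (upper_next) {
      out.push_back(static_cast<char>(
          std::toupper(static_cast<unsigned char>(c))));
      upper_next = false;
    } else {
      out.push_back(c);
    }
  }
  return out;
}
```

Under this mapping, the struct field first_name would line up with a CSV column named firstName, while birthday stays as it is.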
Enums and validation¶
CSV supports enums and validated types. Enums are written/read as strings:
enum class FirstName { Bart, Lisa, Maggie, Homer };
struct Person {
rfl::Rename<"firstName", FirstName> first_name;
rfl::Rename<"lastName", std::string> last_name;
rfl::Timestamp<"%Y-%m-%d"> birthday;
rfl::Validator<unsigned int, rfl::Minimum<0>, rfl::Maximum<130>> age;
rfl::Email email;
};
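The following sketch shows, in plain C++, what these two guarantees amount to for a single cell: an enum value travels as its name, and an age outside the allowed range is rejected. All helper names (kFirstNames, to_cell, from_cell, age_in_range) are hypothetical. reflect-cpp derives the enum names and runs the checks itself, presumably reporting failures through the returned rfl::Result.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>

enum class FirstName { Bart, Lisa, Maggie, Homer };

// Hypothetical lookup table, listed in the same order as the enumerators.
constexpr std::array<std::string_view, 4> kFirstNames = {"Bart", "Lisa",
                                                         "Maggie", "Homer"};

// Enum value to the text stored in the CSV cell.
constexpr std::string_view to_cell(FirstName name) {
  return kFirstNames[static_cast<std::size_t>(name)];
}

// CSV cell text back to the enum; an unknown name yields std::nullopt.
std::optional<FirstName> from_cell(std::string_view cell) {
  for (std::size_t i = 0; i < kFirstNames.size(); ++i) {
    if (kFirstNames[i] == cell) {
      return static_cast<FirstName>(i);
    }
  }
  return std::nullopt;
}

// Mirrors the Minimum<0> / Maximum<130> constraint on the age column.
// The lower bound is always satisfied for an unsigned value.
constexpr bool age_in_range(unsigned int age) { return age <= 130; }
```

A cell containing Marge would therefore not map to any FirstName, and an age of 131 would fail the range check.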
Limitations of tabular formats¶
CSV, like other tabular formats, has limitations compared to hierarchical formats such as JSON or XML:
Collections requirement¶
You must serialize collections, not individual objects:
std::vector<Person> people = {...}; // ✅ Correct
Person person = {...}; // ❌ Wrong - must be in a container
No nested objects¶
Each field must be a primitive type, enum, or a simple validated type. Nested objects are not automatically flattened:
// This would NOT work as expected - nested objects are not automatically flattened
struct Address {
std::string street;
std::string city;
};
struct Person {
std::string first_name;
std::string last_name;
Address address; // ❌ Will cause compilation errors for CSV
};
Using rfl::Flatten for nested objects¶
If you need to include nested objects, use rfl::Flatten to explicitly flatten them:
struct Address {
std::string street;
std::string city;
};
struct Person {
std::string first_name;
std::string last_name;
rfl::Flatten<Address> address; // ✅ This will flatten the Address fields
};
// The resulting CSV will have columns: first_name, last_name, street, city
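To show what flattening means for the table layout, this sketch builds the header and one row by hand from plain structs. The functions flattened_header and flattened_row are hypothetical and ignore quoting. With reflect-cpp the same column layout is expected to follow from the rfl::Flatten member shown above.

```cpp
#include <string>

struct Address {
  std::string street;
  std::string city;
};

// Plain nested struct for illustration; with reflect-cpp the address member
// would be declared as rfl::Flatten<Address>.
struct Person {
  std::string first_name;
  std::string last_name;
  Address address;
};

// Hypothetical manual equivalent of the flattened header line.
std::string flattened_header() { return "first_name,last_name,street,city"; }

// The nested fields are written as ordinary columns of the parent row.
// Quoting is omitted here to keep the sketch short.
std::string flattened_row(const Person& p) {
  return p.first_name + "," + p.last_name + "," + p.address.street + "," +
         p.address.city;
}
```

The nested Address does not get a column of its own. Its members street and city simply become further columns next to first_name and last_name.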
No variant types¶
Variant types like std::variant, rfl::Variant, or rfl::TaggedUnion cannot be serialized to CSV as separate columns:
// ❌ This will NOT work
struct Person {
std::string first_name;
std::variant<std::string, int> status; // Variant - not supported
rfl::Variant<std::string, int> type; // rfl::Variant - not supported
rfl::TaggedUnion<"type", std::string, int> category; // TaggedUnion - not supported
};
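One possible workaround, sketched here as an assumption rather than a documented reflect-cpp feature, is to collapse the variant into a plain string before building the rows. The names Status, status_to_cell and PersonRow are hypothetical.

```cpp
#include <string>
#include <type_traits>
#include <variant>

using Status = std::variant<std::string, int>;

// Hypothetical conversion: turn whichever alternative is active into text so
// that it fits into an ordinary string column.
std::string status_to_cell(const Status& status) {
  return std::visit(
      [](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, int>) {
          return std::to_string(v);
        } else {
          return v;  // already a std::string
        }
      },
      status);
}

// Flat row type that only contains CSV-friendly fields.
struct PersonRow {
  std::string first_name;
  std::string status;  // filled via status_to_cell
};
```

The information about which alternative was active is lost in this conversion, so reading the column back yields strings only.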
No arrays (except bytestrings)¶
reflect-cpp's CSV support cannot store arrays (lists) of values in a single column. The only array-like field supported is binary data represented as bytestrings:
// ❌ This will NOT work
struct Person {
std::string first_name;
std::vector<std::string> hobbies; // Array of strings - not supported
std::vector<int> scores; // Array of integers - not supported
std::vector<Address> addresses; // Array of objects - not supported
};
// ✅ This works
struct Blob {
std::vector<char> binary_data; // Binary data supported as bytestring
};
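If list-like data has to travel through CSV anyway, one option is to pack it into a single text column yourself. The sketch below assumes a separator character that never occurs inside the values. join_values and split_values are hypothetical helpers, not reflect-cpp functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: packs a list of strings into one text cell.
std::string join_values(const std::vector<std::string>& values,
                        char separator = '|') {
  std::string out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) {
      out.push_back(separator);
    }
    out += values[i];
  }
  return out;
}

// Inverse of join_values. An empty cell is treated as an empty list.
std::vector<std::string> split_values(const std::string& cell,
                                      char separator = '|') {
  std::vector<std::string> out;
  if (cell.empty()) {
    return out;
  }
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = cell.find(separator, start);
    if (pos == std::string::npos) {
      out.push_back(cell.substr(start));  // last (or only) element
      break;
    }
    out.push_back(cell.substr(start, pos - start));
    start = pos + 1;
  }
  return out;
}
```

The packed string can then be stored in an ordinary std::string field. Note that an empty list and a list holding one empty string both map to an empty cell, so this encoding is not fully reversible.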
Use cases¶
CSV is ideal for:
- Data exchange and interoperability
- Simple, flat data structures with consistent types
- Human-readable datasets
CSV is less suitable for:
- Complex nested data structures
- Data with arrays or variant types
- Strict schemas with evolving types
- Very large datasets where binary columnar formats are preferred