Autonomy Software C++ 24.5.1
Welcome to the Autonomy Software repository of the Mars Rover Design Team (MRDT) at Missouri University of Science and Technology (Missouri S&T)! The API reference contains the source code and other resources for the development of the autonomy software for our Mars rover. The Autonomy Software project aims to compete in the University Rover Challenge (URC) by demonstrating advanced autonomous capabilities and robust navigation algorithms.
duckdb::CSVReaderOptions Struct Reference

Public Member Functions

 CSVReaderOptions (CSVOption< char > single_byte_delimiter, const CSVOption< string > &multi_byte_delimiter)
 
string GetUserDefinedParameters () const
 Returns a list of user-defined parameters in string format.
 
void SetCompression (const string &compression)
 
bool GetHeader () const
 
void SetHeader (bool has_header)
 
string GetEscape () const
 
void SetEscape (const string &escape)
 
idx_t GetSkipRows () const
 
void SetSkipRows (int64_t rows)
 
void SetQuote (const string &quote)
 
string GetQuote () const
 
void SetComment (const string &comment)
 
string GetComment () const
 
void SetDelimiter (const string &delimiter)
 
string GetDelimiter () const
 
bool IgnoreErrors () const
 Whether errors can be safely ignored (i.e., they are being ignored and not stored in a rejects table).
 
string GetNewline () const
 
void SetNewline (const string &input)
 
bool GetRFC4180 () const
 
void SetRFC4180 (bool rfc4180)
 
CSVOption< char > GetSingleByteDelimiter () const
 
CSVOption< string > GetMultiByteDelimiter () const
 
bool SetBaseOption (const string &loption, const Value &value, bool write_option=false)
 
void SetReadOption (const string &loption, const Value &value, vector< string > &expected_names)
 
void SetWriteOption (const string &loption, const Value &value)
 
void SetDateFormat (LogicalTypeId type, const string &format, bool read_format)
 
void ToNamedParameters (named_parameter_map_t &out) const
 
void FromNamedParameters (const named_parameter_map_t &in, ClientContext &context, MultiFileOptions &file_options)
 
void ParseOption (ClientContext &context, const string &key, const Value &val)
 
void Verify (MultiFileOptions &file_options)
 Verify options are not conflicting.
 
string ToString (const String &current_file_path) const
 
bool WasTypeManuallySet (idx_t i) const
 If the type for column with idx i was manually set.
 
string NewLineIdentifierToString () const
 

Public Attributes

DialectOptions dialect_options
 See struct above.
 
CSVOption< bool > ignore_errors = false
 Whether we should ignore InvalidInput errors.
 
CSVOption< bool > store_rejects = false
 Whether we store CSV Errors in the rejects table or not.
 
CSVOption< string > rejects_table_name = {"reject_errors"}
 Rejects table name (name of the table that stores reject errors)
 
CSVOption< string > rejects_scan_name = {"reject_scans"}
 Rejects scan name (name of the table that stores reject scans)
 
idx_t rejects_limit = 0
 Rejects table entry limit (0 = no limit)
 
idx_t buffer_sample_size = static_cast<idx_t>(STANDARD_VECTOR_SIZE * 50)
 Number of samples to buffer.
 
vector< string > null_str = {""}
 Specifies the strings that represent a null value.
 
FileCompressionType compression = FileCompressionType::AUTO_DETECT
 
bool allow_quoted_nulls = true
 Option to convert quoted values to NULL values.
 
char comment = '\0'
 
char thousands_separator = '\0'
 Thousands separator option (to be able to accept "100,000.220" as a double or decimal).
 
case_insensitive_map_t< idx_t > sql_types_per_column
 SQL Type list mapping of name to SQL type index in sql_type_list.
 
vector< LogicalType > sql_type_list
 User-defined SQL type list.
 
vector< string > name_list
 User-defined name list.
 
bool columns_set = false
 If the names and types were set by the columns parameter.
 
vector< LogicalType > auto_type_candidates
 Types considered as candidates for auto-detection ordered by ascending specificity (~ from low to high)
 
string sniffer_user_mismatch_error
 In case the sniffer found a mismatch error from user defined types or dialect.
 
vector< bool > was_type_manually_set
 Whether the type of the column at each index was manually set (see WasTypeManuallySet).
 
CSVOption< idx_t > maximum_line_size = max_line_size_default
 
bool normalize_names = false
 Whether header names shall be normalized.
 
unordered_set< string > force_not_null_names
 Names of the columns that must skip the null check.
 
vector< bool > force_not_null
 True, if column with that index must skip null check.
 
int64_t files_to_sniff = 10
 
idx_t sample_size_chunks = 20480 / sniff_size
 Number of sample chunks used in auto-detection.
 
bool all_varchar = false
 Consider all columns to be of type varchar.
 
bool auto_detect = true
 Whether to automatically detect dialect and datatypes.
 
string file_path
 The file path of the CSV file to read.
 
CSVOption< idx_t > buffer_size_option = CSVBuffer::ROWS_PER_BUFFER * max_line_size_default
 Buffer Size (Parallel Scan)
 
string decimal_separator = "."
 Decimal separator when reading as numeric.
 
bool null_padding = false
 Whether to pad rows that do not have enough columns with NULL values.
 
bool parallel = true
 If we should attempt to run parallel scanning over one file.
 
string encoding = "utf-8"
 By default, our encoding is always UTF-8.
 
map< string, string > user_defined_parameters
 User defined parameters.
 
vector< bool > force_quote
 True, if column with that index must be quoted.
 
string prefix
 Prefix, suffix, and custom newline, each written once for the entire file (enables writing files as JSON arrays)
 
string suffix
 
string write_newline
 
map< LogicalTypeId, Value > write_date_format = {{LogicalTypeId::DATE, Value()}, {LogicalTypeId::TIMESTAMP, Value()}}
 The date format to use for writing (if any is specified)
 
map< LogicalTypeId, bool > has_format = {{LogicalTypeId::DATE, false}, {LogicalTypeId::TIMESTAMP, false}}
 Whether a type format is specified.
 
bool multi_file_reader = false
 If this reader is a multifile reader.
 

Static Public Attributes

static constexpr idx_t max_line_size_default = 2000000
 
static constexpr idx_t sniff_size = 2048
 Result size of sniffing phases.
 

Constructor & Destructor Documentation

◆ CSVReaderOptions()

duckdb::CSVReaderOptions::CSVReaderOptions ( )
inline
{
}

Member Function Documentation

◆ SetBaseOption()

bool duckdb::CSVReaderOptions::SetBaseOption ( const string &  loption,
const Value &  value,
bool  write_option = false 
)

Sets an option that is supported by both the reading and writing functions. Called by the SetReadOption and SetWriteOption methods.

◆ SetReadOption()

void duckdb::CSVReaderOptions::SetReadOption ( const string &  loption,
const Value &  value,
vector< string > &  expected_names 
)

loption - lowercase string
set - argument(s) to the option
expected_names - names expected if the option is "columns"
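The relationship between SetBaseOption and SetReadOption can be illustrated with a small stand-alone sketch. Everything here is hypothetical: OptionsSketch is not the DuckDB struct, values are plain strings instead of Value, the option names are illustrative, and trying the shared options first is an assumption about the call order rather than something this page states.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in, not duckdb::CSVReaderOptions.
struct OptionsSketch {
    std::map<std::string, std::string> base; // options shared by reading and writing
    std::map<std::string, std::string> read; // options that only apply to reading

    // Returns true if `loption` (already lowercase) is a shared option.
    bool SetBaseOption(const std::string &loption, const std::string &value) {
        if (loption == "delimiter" || loption == "quote" || loption == "header") {
            base[loption] = value;
            return true;
        }
        return false;
    }

    // Tries the shared options first (assumed order), then read-only ones.
    // Returns false for an unknown option; the documented method returns void.
    bool SetReadOption(const std::string &loption, const std::string &value) {
        if (SetBaseOption(loption, value)) {
            return true;
        }
        if (loption == "columns" || loption == "sample_size") {
            read[loption] = value;
            return true;
        }
        return false;
    }
};
```

With this sketch, a call such as SetReadOption("delimiter", ";") is stored through SetBaseOption, while SetReadOption("columns", ...) is handled by the read-specific branch.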

◆ NewLineIdentifierToString()

string duckdb::CSVReaderOptions::NewLineIdentifierToString ( ) const
inline
{
    switch (dialect_options.state_machine_options.new_line.GetValue()) {
    case NewLineIdentifier::SINGLE_N:
        return "\\n";
    case NewLineIdentifier::SINGLE_R:
        return "\\r";
    case NewLineIdentifier::CARRY_ON:
        return "\\r\\n";
    default:
        return "";
    }
}
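The mapping in this function can be tried in isolation. The following is a minimal, self-contained sketch: the enum is a stand-in that mirrors only the three enumerator names shown in the listing plus a hypothetical NOT_SET value, and the free function NewLineToString is an illustrative name, not part of DuckDB.

```cpp
#include <string>

// Stand-in for duckdb::NewLineIdentifier (assumption: the three enumerators
// named in the listing, plus a hypothetical NOT_SET catch-all).
enum class NewLineIdentifier { SINGLE_N, SINGLE_R, CARRY_ON, NOT_SET };

// Mirrors the switch in NewLineIdentifierToString(): returns the escaped,
// printable form of the newline, not the raw control characters.
std::string NewLineToString(NewLineIdentifier id) {
    switch (id) {
    case NewLineIdentifier::SINGLE_N:
        return "\\n";
    case NewLineIdentifier::SINGLE_R:
        return "\\r";
    case NewLineIdentifier::CARRY_ON:
        return "\\r\\n";
    default:
        return "";
    }
}
```

Note that the returned strings contain a literal backslash followed by a letter, so the result for SINGLE_N is two characters long and is suitable for display in messages.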

Member Data Documentation

◆ rejects_table_name

CSVOption<string> duckdb::CSVReaderOptions::rejects_table_name = {"reject_errors"}

Rejects table name (name of the table that stores reject errors)

◆ rejects_scan_name

CSVOption<string> duckdb::CSVReaderOptions::rejects_scan_name = {"reject_scans"}

Rejects scan name (name of the table that stores reject scans)

◆ null_str

vector<string> duckdb::CSVReaderOptions::null_str = {""}

Specifies the strings that represent a null value.

◆ compression

FileCompressionType duckdb::CSVReaderOptions::compression = FileCompressionType::AUTO_DETECT

Whether the file is compressed and, if so, which compression type is used. AUTO_DETECT (the default) infers the type from the file extension.
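Extension-based inference of this kind might look like the sketch below. The enum CompressionSketch, the helper names, and the extension-to-type mapping (.gz and .zst) are assumptions made for illustration; this page only names AUTO_DETECT and does not list the other enumerators or the extensions DuckDB recognizes.

```cpp
#include <string>

// Stand-in enum (assumption): only AUTO_DETECT is named on this page.
enum class CompressionSketch { AUTO_DETECT, UNCOMPRESSED, GZIP, ZSTD };

// True if `s` ends with `suffix`.
static bool EndsWith(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Resolves AUTO_DETECT from the file extension; an explicit choice is
// returned unchanged.
CompressionSketch ResolveCompression(CompressionSketch requested, const std::string &path) {
    if (requested != CompressionSketch::AUTO_DETECT) {
        return requested;
    }
    if (EndsWith(path, ".gz")) {
        return CompressionSketch::GZIP;
    }
    if (EndsWith(path, ".zst")) {
        return CompressionSketch::ZSTD;
    }
    return CompressionSketch::UNCOMPRESSED;
}
```

An explicitly requested type takes precedence over the extension, so the inference only runs when the option is left at its default.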

◆ auto_type_candidates

vector<LogicalType> duckdb::CSVReaderOptions::auto_type_candidates
Initial value:
= {
LogicalType::VARCHAR, LogicalType::DOUBLE, LogicalType::BIGINT,
LogicalType::TIMESTAMP_TZ, LogicalType::TIMESTAMP, LogicalType::DATE,
LogicalType::TIME, LogicalType::BOOLEAN, LogicalType::SQLNULL}

Types considered as candidates for auto-detection ordered by ascending specificity (~ from low to high)
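To show why the ordering matters, here is a simplified sketch that walks a candidate list from most specific to least specific and returns the first type that accepts every sampled value. It is not DuckDB's sniffer: TypeSketch covers only four of the types listed above, and the parsing rules (strtoll, strtod, and the literals "true" and "false") are assumptions chosen to keep the example short.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Reduced stand-in for the candidate types, kept in the same relative order.
enum class TypeSketch { VARCHAR, DOUBLE, BIGINT, BOOLEAN };

// True if the whole string `v` is a valid value of type `t` (simplified rules).
static bool Parses(const std::string &v, TypeSketch t) {
    switch (t) {
    case TypeSketch::BOOLEAN:
        return v == "true" || v == "false";
    case TypeSketch::BIGINT: {
        if (v.empty()) {
            return false;
        }
        char *end = nullptr;
        errno = 0;
        std::strtoll(v.c_str(), &end, 10);
        return errno != ERANGE && end == v.c_str() + v.size();
    }
    case TypeSketch::DOUBLE: {
        if (v.empty()) {
            return false;
        }
        char *end = nullptr;
        std::strtod(v.c_str(), &end);
        return end == v.c_str() + v.size();
    }
    default:
        return true; // VARCHAR accepts any string
    }
}

// Returns the most specific candidate that accepts all samples.
TypeSketch DetectType(const std::vector<std::string> &samples) {
    const std::vector<TypeSketch> candidates = {TypeSketch::VARCHAR, TypeSketch::DOUBLE,
                                                TypeSketch::BIGINT, TypeSketch::BOOLEAN};
    for (auto it = candidates.rbegin(); it != candidates.rend(); ++it) {
        bool all_ok = true;
        for (const auto &v : samples) {
            if (!Parses(v, *it)) {
                all_ok = false;
                break;
            }
        }
        if (all_ok) {
            return *it;
        }
    }
    return TypeSketch::VARCHAR;
}
```

Because VARCHAR sits at the low-specificity end and accepts anything, the search always terminates with a valid type even when the samples are heterogeneous.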


◆ max_line_size_default

constexpr idx_t duckdb::CSVReaderOptions::max_line_size_default = 2000000
static constexpr

Maximum CSV line size. It is specified because reaching this amount most likely means the delimiters are wrong (default: 2MB). Note that this is the guaranteed line length that will succeed; slightly longer lines may still be accepted.
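The default values on this page can be checked with a little compile-time arithmetic. The sketch below copies the constants from this page; the idx_t alias, the reading of the line size as bytes, and the idea that each sample chunk holds sniff_size rows are assumptions made for illustration.

```cpp
#include <cstdint>

using idx_t = std::uint64_t; // assumption: idx_t is a 64-bit unsigned index type

// Values copied from this page.
constexpr idx_t max_line_size_default = 2000000;
constexpr idx_t sniff_size = 2048;

// sample_size_chunks = 20480 / sniff_size
constexpr idx_t sample_size_chunks = 20480 / sniff_size;
static_assert(sample_size_chunks == 10, "20480 / 2048 chunks");

// Rows covered by the default sample, assuming each chunk holds sniff_size rows.
constexpr idx_t default_sample_rows = sample_size_chunks * sniff_size;
static_assert(default_sample_rows == 20480, "10 chunks of 2048 rows");

// "2MB" is a decimal figure here: 2,000,000, not 2 * 1024 * 1024.
static_assert(max_line_size_default == 2 * 1000 * 1000, "decimal megabytes");
static_assert(max_line_size_default < 2 * 1024 * 1024, "smaller than 2 MiB");
```

Under these assumptions the default sample covers 10 chunks, and the default line limit is slightly below 2 MiB.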

◆ files_to_sniff

int64_t duckdb::CSVReaderOptions::files_to_sniff = 10

If this is a glob or a list of multiple files, the number of files used for sniffing. -1 means all.
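One way to read this option is as a cap on the number of files examined. The helper below is hypothetical (FilesToSniff is not a DuckDB function) and treats any negative value like -1, which is an assumption beyond what this page states.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: how many files of a multi-file scan get sniffed.
// A negative files_to_sniff (documented value: -1) means "all files";
// otherwise the count is capped at the number of files available.
std::size_t FilesToSniff(std::int64_t files_to_sniff, std::size_t total_files) {
    if (files_to_sniff < 0) {
        return total_files;
    }
    return std::min(static_cast<std::size_t>(files_to_sniff), total_files);
}
```

With the default of 10, a glob matching 3 files would sniff all 3, while a glob matching 25 files would sniff 10.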

◆ write_date_format

map<LogicalTypeId, Value> duckdb::CSVReaderOptions::write_date_format = {{LogicalTypeId::DATE, Value()}, {LogicalTypeId::TIMESTAMP, Value()}}

The date format to use for writing (if any is specified)


◆ has_format

map<LogicalTypeId, bool> duckdb::CSVReaderOptions::has_format = {{LogicalTypeId::DATE, false}, {LogicalTypeId::TIMESTAMP, false}}

Whether a type format is specified.


The documentation for this struct was generated from the following file: