Autonomy Software C++ 24.5.1
Welcome to the Autonomy Software repository of the Mars Rover Design Team (MRDT) at Missouri University of Science and Technology (Missouri S&T)! This API reference documents the source code and other resources for developing the autonomy software for our Mars rover. The Autonomy Software project aims to compete in the University Rover Challenge (URC) by demonstrating advanced autonomous capabilities and robust navigation algorithms.
duckdb::HivePartitionedColumnData Class Reference

Public Member Functions

 HivePartitionedColumnData (ClientContext &context, vector< LogicalType > types, vector< idx_t > partition_by_cols, shared_ptr< GlobalHivePartitionState > global_state=nullptr)
 
void ComputePartitionIndices (PartitionedColumnDataAppendState &state, DataChunk &input) override
 
std::map< idx_t, const HivePartitionKey * > GetReverseMap ()
 Reverse lookup map to reconstruct keys from a partition id.
 
- Public Member Functions inherited from duckdb::PartitionedColumnData
unique_ptr< PartitionedColumnData > CreateShared ()
 
void InitializeAppendState (PartitionedColumnDataAppendState &state) const
 Initializes a local state for parallel partitioning that can be merged into this PartitionedColumnData.
 
void Append (PartitionedColumnDataAppendState &state, DataChunk &input)
 Appends a DataChunk to this PartitionedColumnData.
 
void FlushAppendState (PartitionedColumnDataAppendState &state)
 Flushes any remaining data in the append state into this PartitionedColumnData.
 
void Combine (PartitionedColumnData &other)
 Combine another PartitionedColumnData into this PartitionedColumnData.
 
vector< unique_ptr< ColumnDataCollection > > & GetPartitions ()
 Get the partitions in this PartitionedColumnData.
 
template<class TARGET >
TARGET & Cast ()
 
template<class TARGET >
const TARGET & Cast () const
 

Protected Member Functions

idx_t RegisterNewPartition (HivePartitionKey key, PartitionedColumnDataAppendState &state)
 Register a newly discovered partition.
 
void AddNewPartition (HivePartitionKey key, idx_t partition_id, PartitionedColumnDataAppendState &state)
 Add a new partition with the given partition id.
 
- Protected Member Functions inherited from duckdb::PartitionedColumnData
virtual idx_t BufferSize () const
 Size of the buffers in the append states for this type of partitioning (default 128)
 
virtual void InitializeAppendStateInternal (PartitionedColumnDataAppendState &state) const
 Initialize a PartitionedColumnDataAppendState for this type of partitioning (optional)
 
virtual idx_t MaxPartitionIndex () const
 Maximum partition index (optional)
 
 PartitionedColumnData (PartitionedColumnDataType type, ClientContext &context, vector< LogicalType > types)
 PartitionedColumnData can only be instantiated by derived classes.
 
 PartitionedColumnData (const PartitionedColumnData &other)
 
idx_t HalfBufferSize () const
 If the buffer is half full, we append to the partition.
 
void CreateAllocator ()
 Create a new shared allocator.
 
bool UseFixedSizeMap () const
 Whether to use fixed size map or regular map.
 
void BuildPartitionSel (PartitionedColumnDataAppendState &state, const idx_t append_count) const
 
template<bool fixed>
void AppendInternal (PartitionedColumnDataAppendState &state, DataChunk &input)
 Appends a DataChunk to this PartitionedColumnData.
 
unique_ptr< ColumnDataCollection > CreatePartitionCollection (idx_t partition_index) const
 Create a collection for a specific partition.
 
unique_ptr< DataChunk > CreatePartitionBuffer () const
 Create a DataChunk used for buffering appends to the partition.
 

Protected Attributes

shared_ptr< GlobalHivePartitionState > global_state
 Shared HivePartitionedColumnData should always have a global state to allow parallel key discovery.
 
hive_partition_map_t local_partition_map
 Thread-local copy of the partition map.
 
vector< idx_t > group_by_columns
 The columns that make up the key.
 
Vector hashes_v
 Thread-local pre-allocated vector for hashes.
 
vector< HivePartitionKey > keys
 Thread-local pre-allocated HivePartitionKeys.
 
- Protected Attributes inherited from duckdb::PartitionedColumnData
PartitionedColumnDataType type
 
ClientContext & context
 
vector< LogicalType > types
 
mutex lock
 
shared_ptr< PartitionColumnDataAllocators > allocators
 
vector< unique_ptr< ColumnDataCollection > > partitions
 

Private Member Functions

void InitializeKeys ()
 

Additional Inherited Members

- Static Protected Member Functions inherited from duckdb::PartitionedColumnData
template<bool fixed>
static void BuildPartitionSel (PartitionedColumnDataAppendState &state, const idx_t append_count)
 

Constructor & Destructor Documentation

◆ HivePartitionedColumnData()

duckdb::HivePartitionedColumnData::HivePartitionedColumnData ( ClientContext & context,
vector< LogicalType > types,
vector< idx_t > partition_by_cols,
shared_ptr< GlobalHivePartitionState > global_state = nullptr 
)
    : PartitionedColumnData(PartitionedColumnDataType::HIVE, context, std::move(types)),
      global_state(std::move(global_state)), group_by_columns(std::move(partition_by_cols)),
      hashes_v(LogicalType::HASH) {
    InitializeKeys();
    CreateAllocator();
}

Member Function Documentation

◆ ComputePartitionIndices()

void duckdb::HivePartitionedColumnData::ComputePartitionIndices ( PartitionedColumnDataAppendState & state,
DataChunk & input 
)
override virtual

Compute the partition indices for this type of partitioning for the input DataChunk and store them in the partition_data of the local state. If this type creates partitions on the fly (as Hive partitioning does), this function is also in charge of creating new partitions and mapping the input data to a partition index.

Reimplemented from duckdb::PartitionedColumnData.

{
    const auto count = input.size();

    input.Hash(group_by_columns, hashes_v);
    hashes_v.Flatten(count);

    for (idx_t col_idx = 0; col_idx < group_by_columns.size(); col_idx++) {
        auto &group_by_col = input.data[group_by_columns[col_idx]];
        GetHivePartitionValuesTypeSwitch(group_by_col, keys, col_idx, count);
    }

    const auto hashes = FlatVector::GetData<hash_t>(hashes_v);
    const auto partition_indices = FlatVector::GetData<idx_t>(state.partition_indices);
    for (idx_t i = 0; i < count; i++) {
        auto &key = keys[i];
        key.hash = hashes[i];
        auto lookup = local_partition_map.find(key);
        if (lookup == local_partition_map.end()) {
            idx_t new_partition_id = RegisterNewPartition(key, state);
            partition_indices[i] = new_partition_id;
        } else {
            partition_indices[i] = lookup->second;
        }
    }
}
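
The lookup-or-register loop above can be illustrated outside DuckDB. The following is a minimal sketch, not the library's implementation: `PartitionMap` and `AssignPartitionIndices` are hypothetical names, plain strings stand in for `HivePartitionKey`, and new ids are taken from the map size, which corresponds to the case without a global state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for hive_partition_map_t: maps a key to its partition id.
using PartitionMap = std::unordered_map<std::string, std::size_t>;

// Assigns a partition index to every key, registering unseen keys on the fly.
std::vector<std::size_t> AssignPartitionIndices(PartitionMap &map, const std::vector<std::string> &keys) {
    std::vector<std::size_t> indices;
    indices.reserve(keys.size());
    for (const auto &key : keys) {
        auto lookup = map.find(key);
        if (lookup == map.end()) {
            // Unseen key: the next free id is the current map size.
            const std::size_t new_id = map.size();
            map.emplace(key, new_id);
            indices.push_back(new_id);
        } else {
            // Known key: reuse the id assigned earlier.
            indices.push_back(lookup->second);
        }
    }
    assert(indices.size() == keys.size());
    return indices;
}
```

Calling `AssignPartitionIndices` repeatedly with the same map keeps ids stable across calls, which mirrors how the thread-local map is reused for every appended chunk.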

◆ GetReverseMap()

std::map< idx_t, const HivePartitionKey * > duckdb::HivePartitionedColumnData::GetReverseMap ( )

Reverse lookup map to reconstruct keys from a partition id.

{
    std::map<idx_t, const HivePartitionKey *> ret;
    for (const auto &pair : local_partition_map) {
        ret[pair.second] = &(pair.first);
    }
    return ret;
}
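
Because the returned map stores pointers into the forward map rather than copies of the keys, its entries are only usable while the forward map is alive and the corresponding entries have not been erased. A minimal sketch of the same inversion, assuming the forward map is a `std::unordered_map` and using strings as hypothetical stand-ins for `HivePartitionKey`:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the forward key -> partition id map.
using PartitionMap = std::unordered_map<std::string, std::size_t>;

// Builds id -> pointer-to-key. The pointers refer to keys owned by `forward`,
// so they are valid only while `forward` exists and the entries are not erased.
std::map<std::size_t, const std::string *> BuildReverseMap(const PartitionMap &forward) {
    std::map<std::size_t, const std::string *> reverse;
    for (const auto &pair : forward) {
        reverse[pair.second] = &pair.first;
    }
    // Holds as long as partition ids are unique, which the sketch assumes.
    assert(reverse.size() <= forward.size());
    return reverse;
}
```

Using an ordered `std::map` for the result means iteration visits partitions in ascending id order, regardless of the forward map's hash order.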

◆ RegisterNewPartition()

idx_t duckdb::HivePartitionedColumnData::RegisterNewPartition ( HivePartitionKey  key,
PartitionedColumnDataAppendState & state 
)
protected

Register a newly discovered partition.

{
    idx_t partition_id;
    if (global_state) {
        // Synchronize the global state with our local state for the newly discovered partition
        unique_lock<mutex> lck_gstate(global_state->lock);

        // Insert into global map, or return partition if already present
        auto res = global_state->partition_map.emplace(std::make_pair(key, global_state->partition_map.size()));
        partition_id = res.first->second;
    } else {
        partition_id = local_partition_map.size();
    }
    AddNewPartition(std::move(key), partition_id, state);
    return partition_id;
}
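
The global branch relies on `emplace` leaving an existing entry untouched and returning an iterator to it, so every thread that discovers the same key receives the same id. A minimal sketch of that pattern, assuming a hypothetical `GlobalState` struct with string keys in place of `GlobalHivePartitionState` and `HivePartitionKey`:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for GlobalHivePartitionState.
struct GlobalState {
    std::mutex lock;
    std::unordered_map<std::string, std::size_t> partition_map;
};

// Returns the id for `key`, inserting it with the next free id if it is new.
std::size_t RegisterGlobally(GlobalState &global, const std::string &key) {
    std::lock_guard<std::mutex> guard(global.lock);
    // emplace does nothing if the key already exists; in both cases
    // res.first points at the entry that holds the authoritative id.
    auto res = global.partition_map.emplace(key, global.partition_map.size());
    assert(res.first->second < global.partition_map.size());
    return res.first->second;
}
```

Holding the lock across both the `size()` read and the `emplace` is what keeps ids dense and unique; reading the size outside the lock could hand the same id to two different keys.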

◆ AddNewPartition()

void duckdb::HivePartitionedColumnData::AddNewPartition ( HivePartitionKey  key,
idx_t  partition_id,
PartitionedColumnDataAppendState & state 
)
protected

Add a new partition with the given partition id.

{
    local_partition_map.emplace(std::move(key), partition_id);

    if (state.partition_append_states.size() <= partition_id) {
        state.partition_append_states.resize(partition_id + 1);
        state.partition_buffers.resize(partition_id + 1);
        partitions.resize(partition_id + 1);
    }
    state.partition_append_states[partition_id] = make_uniq<ColumnDataAppendState>();
    state.partition_buffers[partition_id] = CreatePartitionBuffer();
    partitions[partition_id] = CreatePartitionCollection(0);
    partitions[partition_id]->InitializeAppend(*state.partition_append_states[partition_id]);
}
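
The resize-then-assign step matters because a partition id handed out by the global state may be larger than anything this thread has seen so far, which presumably leaves earlier slots empty until their keys are encountered locally. A minimal sketch of the grow-on-demand pattern, with a hypothetical `Buffer` type and `AddSlot` helper standing in for the per-partition buffers:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-partition buffer.
struct Buffer {
    std::vector<int> rows;
};

// Ensures slot `partition_id` exists, then installs a fresh buffer there.
void AddSlot(std::vector<std::unique_ptr<Buffer>> &buffers, std::size_t partition_id) {
    if (buffers.size() <= partition_id) {
        // Slots created by resize hold null pointers until they are initialized.
        buffers.resize(partition_id + 1);
    }
    buffers[partition_id] = std::unique_ptr<Buffer>(new Buffer());
    assert(buffers[partition_id] != nullptr);
}
```

Code that later iterates over such a vector has to tolerate null slots for partitions that were never registered locally.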

◆ InitializeKeys()

void duckdb::HivePartitionedColumnData::InitializeKeys ( )
private
{
    keys.resize(STANDARD_VECTOR_SIZE);
    for (idx_t i = 0; i < STANDARD_VECTOR_SIZE; i++) {
        keys[i].values.resize(group_by_columns.size());
    }
}
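
Pre-sizing the keys once lets the per-chunk loop overwrite values and hashes in place instead of allocating. The sketch below shows one way such a key could carry a precomputed hash and plug into an unordered map. The `Key` layout, its `Hash` and `Equality` functors, and `MakeKeys` are assumptions for illustration, not the actual `HivePartitionKey` definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical key: one value per partition column plus a precomputed hash.
struct Key {
    std::vector<std::string> values;
    std::size_t hash = 0;

    // Reuses the stored hash instead of rehashing the values.
    struct Hash {
        std::size_t operator()(const Key &k) const { return k.hash; }
    };
    // Two keys are equal when all of their column values match.
    struct Equality {
        bool operator()(const Key &a, const Key &b) const { return a.values == b.values; }
    };
};

using KeyMap = std::unordered_map<Key, std::size_t, Key::Hash, Key::Equality>;

// Pre-allocates `count` keys with `columns` value slots each.
std::vector<Key> MakeKeys(std::size_t count, std::size_t columns) {
    std::vector<Key> keys(count);
    for (auto &key : keys) {
        key.values.resize(columns);
    }
    assert(keys.size() == count);
    return keys;
}
```

With this layout, a lookup costs one read of the stored hash plus a value comparison on collision, which is why the hash is filled in before `find` is called.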

The documentation for this class was generated from the following files: