Gandiva Expression Compiler¶
TreeExprBuilder Class¶
-
class
gandiva
::
TreeExprBuilder
¶ Tree Builder for a nested expression.
Public Static Functions
-
static NodePtr
MakeLiteral
(bool value)¶ create a node on a literal.
-
static NodePtr
MakeNull
(DataTypePtr data_type)¶ create a node on a null literal.
returns null if data_type is null or if it’s not a supported datatype.
-
static NodePtr
MakeField
(FieldPtr field)¶ create a node on arrow field.
returns null if input is null.
-
static NodePtr
MakeFunction
(const std::string &name, const NodeVector ¶ms, DataTypePtr return_type)¶ create a node with a function.
returns null if return_type is null
-
static NodePtr
MakeIf
(NodePtr condition, NodePtr then_node, NodePtr else_node, DataTypePtr result_type)¶ create a node with an if-else expression.
returns null if any of the inputs is null.
-
static NodePtr
MakeAnd
(const NodeVector &children)¶ create a node with a boolean AND expression.
-
static NodePtr
MakeOr
(const NodeVector &children)¶ create a node with a boolean OR expression.
-
static ExpressionPtr
MakeExpression
(NodePtr root_node, FieldPtr result_field)¶ create an expression with the specified root_node, and the result written to result_field.
returns null if the result_field is null.
-
static ExpressionPtr
MakeExpression
(const std::string &function, const FieldVector &in_fields, FieldPtr out_field)¶ convenience function for simple function expressions.
returns null if the out_field is null.
-
static ConditionPtr
MakeCondition
(NodePtr root_node)¶ create a condition with the specified root_node
-
static ConditionPtr
MakeCondition
(const std::string &function, const FieldVector &in_fields)¶ convenience function for simple function conditions.
-
static NodePtr
MakeInExpressionInt32
(NodePtr node, const std::unordered_set<int32_t> &constants)¶ creates an in expression
-
static NodePtr
MakeInExpressionFloat
(NodePtr node, const std::unordered_set<float> &constants)¶ creates an in expression for float
-
static NodePtr
MakeInExpressionDouble
(NodePtr node, const std::unordered_set<double> &constants)¶ creates an in expression for double
-
static NodePtr
MakeInExpressionDate32
(NodePtr node, const std::unordered_set<int32_t> &constants)¶ Date as s/millis since epoch.
-
static NodePtr
MakeInExpressionDate64
(NodePtr node, const std::unordered_set<int64_t> &constants)¶ Date as millis/us/ns since epoch.
-
static NodePtr
MakeInExpressionTime32
(NodePtr node, const std::unordered_set<int32_t> &constants)¶ Time as s/millis of day.
-
static NodePtr
MakeInExpressionTime64
(NodePtr node, const std::unordered_set<int64_t> &constants)¶ Time as millis/us/ns of day.
-
static NodePtr
MakeInExpressionTimeStamp
(NodePtr node, const std::unordered_set<int64_t> &constants)¶ Timestamp as millis since epoch.
-
static NodePtr
-
class
gandiva
::
Node
¶ Represents a node in the expression tree.
Validity and value are in a joined state.
Subclassed by gandiva::BooleanNode, gandiva::FieldNode, gandiva::FunctionNode, gandiva::IfNode, gandiva::InExpressionNode< Type >, gandiva::InExpressionNode< gandiva::DecimalScalar128 >, gandiva::LiteralNode
Public Functions
-
virtual Status
Accept
(NodeVisitor &visitor) const = 0¶ Derived classes should simply invoke the Visit api of the visitor.
-
virtual Status
-
class
Expression
¶ An expression tree with a root node, and a result field.
Subclassed by gandiva::Condition
-
class
Condition
: public gandiva::Expression¶ A condition expression.
Function registry¶
Configuration¶
-
class
Configuration
¶ runtime config for gandiva
It contains elements to customize gandiva execution at run time.
-
class
ConfigurationBuilder
¶ configuration builder for gandiva
Provides a default configuration and convenience methods to override specific values and build a custom instance
Projector¶
-
class
gandiva
::
Projector
¶ projection using expressions.
A projector is built for a specific schema and vector of expressions. Once the projector is built, it can be used to evaluate many row batches.
Public Functions
-
Status
Evaluate
(const arrow::RecordBatch &batch, arrow::MemoryPool *pool, arrow::ArrayVector *output)¶ Evaluate the specified record batch, and return the allocated and populated output arrays.
The output arrays will be allocated from the memory pool ‘pool’, and added to the vector ‘output’.
- Parameters
batch – [in] the record batch. schema should be the same as the one in ‘Make’
pool – [in] memory pool used to allocate output arrays (if required).
output – [out] the vector of allocated/populated arrays.
-
Status
Evaluate
(const arrow::RecordBatch &batch, const ArrayDataVector &output)¶ Evaluate the specified record batch, and populate the output arrays.
The output arrays of sufficient capacity must be allocated by the caller.
- Parameters
batch – [in] the record batch. schema should be the same as the one in ‘Make’
output – [inout] vector of arrays, the arrays are allocated by the caller and populated by Evaluate.
-
Status
Evaluate
(const arrow::RecordBatch &batch, const SelectionVector *selection_vector, arrow::MemoryPool *pool, arrow::ArrayVector *output)¶ Evaluate the specified record batch, and return the allocated and populated output arrays.
The output arrays will be allocated from the memory pool ‘pool’, and added to the vector ‘output’.
- Parameters
batch – [in] the record batch. schema should be the same as the one in ‘Make’
selection_vector – [in] selection vector which has filtered row positions.
pool – [in] memory pool used to allocate output arrays (if required).
output – [out] the vector of allocated/populated arrays.
-
Status
Evaluate
(const arrow::RecordBatch &batch, const SelectionVector *selection_vector, const ArrayDataVector &output)¶ Evaluate the specified record batch, and populate the output arrays at the filtered positions.
The output arrays of sufficient capacity must be allocated by the caller.
- Parameters
batch – [in] the record batch. schema should be the same as the one in ‘Make’
selection_vector – [in] selection vector which has the filtered row positions
output – [inout] vector of arrays, the arrays are allocated by the caller and populated by Evaluate.
Public Static Functions
Build a default projector for the given schema to evaluate the vector of expressions.
- Parameters
schema – [in] schema for the record batches, and the expressions.
exprs – [in] vector of expressions.
projector – [out] the returned projector object
Build a projector for the given schema to evaluate the vector of expressions.
Customize the projector with runtime configuration.
- Parameters
schema – [in] schema for the record batches, and the expressions.
exprs – [in] vector of expressions.
configuration – [in] run time configuration.
projector – [out] the returned projector object
Build a projector for the given schema to evaluate the vector of expressions.
Customize the projector with runtime configuration.
- Parameters
schema – [in] schema for the record batches, and the expressions.
exprs – [in] vector of expressions.
selection_vector_mode – [in] mode of selection vector
configuration – [in] run time configuration.
projector – [out] the returned projector object
-
Status
Filter¶
-
class
gandiva
::
Filter
¶ filter records based on a condition.
A filter is built for a specific schema and condition. Once the filter is built, it can be used to evaluate many row batches.
Public Functions
Evaluate the specified record batch, and populate output selection vector.
- Parameters
batch – [in] the record batch. schema should be the same as the one in ‘Make’
out_selection – [inout] the selection array with indices of rows that match the condition.
Public Static Functions
Build a filter for the given schema and condition, with the default configuration.
- Parameters
schema – [in] schema for the record batches, and the condition.
condition – [in] filter condition.
filter – [out] the returned filter object
Build a filter for the given schema and condition.
Customize the filter with runtime configuration.
- Parameters
schema – [in] schema for the record batches, and the condition.
condition – [in] filter conditions.
config – [in] run time configuration.
filter – [out] the returned filter object
-
class
gandiva
::
SelectionVector
¶ Selection Vector : vector of indices in a row-batch for a selection, backed by an arrow-array.
Subclassed by gandiva::SelectionVectorImpl< C_TYPE, A_TYPE, mode >
Public Functions
-
virtual uint64_t
GetIndex
(int64_t index) const = 0¶ Get the value at a given index.
-
virtual void
SetIndex
(int64_t index, uint64_t value) = 0¶ Set the value at a given index.
-
virtual int64_t
GetMaxSlots
() const = 0¶ The maximum slots (capacity) of the selection vector.
-
virtual int64_t
GetNumSlots
() const = 0¶ The number of slots (size) of the selection vector.
-
virtual void
SetNumSlots
(int64_t num_slots) = 0¶ Set the number of slots in the selection vector.
-
virtual ArrayPtr
ToArray
() const = 0¶ Convert to arrow-array.
-
virtual Mode
GetMode
() const = 0¶ Mode of SelectionVector.
-
Status
PopulateFromBitMap
(const uint8_t *bitmap, int64_t bitmap_size, int64_t max_bitmap_index)¶ populate selection vector for all the set bits in the bitmap.
- Parameters
bitmap – [in] the bitmap
bitmap_size – [in] size of the bitmap in bytes
max_bitmap_index – [in] max valid index in bitmap (can be lesser than capacity in the bitmap, due to alignment/padding).
Public Static Functions
make selection vector with int16 type records.
- Parameters
max_slots – [in] max number of slots
buffer – [in] buffer sized to accommodate max_slots
selection_vector – [out] selection vector backed by ‘buffer’
- Parameters
max_slots – [in] max number of slots
pool – [in] memory pool to allocate buffer
selection_vector – [out] selection vector backed by a buffer allocated from the pool.
creates a selection vector with pre populated buffer.
- Parameters
num_slots – [in] size of the selection vector
buffer – [in] pre-populated buffer
selection_vector – [out] selection vector backed by ‘buffer’
make selection vector with int32 type records.
- Parameters
max_slots – [in] max number of slots
buffer – [in] buffer sized to accommodate max_slots
selection_vector – [out] selection vector backed by ‘buffer’
make selection vector with int32 type records.
- Parameters
max_slots – [in] max number of slots
pool – [in] memory pool to allocate buffer
selection_vector – [out] selection vector backed by a buffer allocated from the pool.
creates a selection vector with pre populated buffer.
- Parameters
num_slots – [in] size of the selection vector
buffer – [in] pre-populated buffer
selection_vector – [out] selection vector backed by ‘buffer’
make selection vector with int64 type records.
- Parameters
max_slots – [in] max number of slots
buffer – [in] buffer sized to accommodate max_slots
selection_vector – [out] selection vector backed by ‘buffer’
make selection vector with int64 type records.
- Parameters
max_slots – [in] max number of slots
pool – [in] memory pool to allocate buffer
selection_vector – [out] selection vector backed by a buffer allocated from the pool.
-
virtual uint64_t