Package org.apache.orc
Class Reader.Options
java.lang.Object
org.apache.orc.Reader.Options
- All Implemented Interfaces:
Cloneable
- Enclosing interface:
- Reader
Options for creating a RecordReader.
- Since:
- 1.1.0
-
Constructor Summary
-
Method Summary
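Modifier and Type | Method | Description
--- | --- | ---
boolean | allowPluginFilters() |
Reader.Options | allowPluginFilters(boolean allowPluginFilters) |
Reader.Options | allowSARGToFilter(boolean allowSARGToFilter) | Set allowSARGToFilter.
Reader.Options | clone() |
Reader.Options | dataReader(DataReader value) | Set dataReader.
Reader.Options | forcePositionalEvolution(boolean value) | Set whether to force schema evolution to be positional instead of based on the column names.
String[] | getColumnNames() |
DataReader | getDataReader() |
Consumer<OrcFilterContext> | getFilterCallback() |
boolean | getForcePositionalEvolution() |
boolean[] | getInclude() |
boolean | getIncludeAcidColumns() |
boolean | getIsSchemaEvolutionCaseAware() |
long | getLength() |
long | getMaxOffset() |
long | getOffset() |
int | getPositionalEvolutionLevel() |
String[] | getPreFilterColumnNames() |
int | getRowBatchSize() |
TypeDescription | getSchema() |
org.apache.hadoop.hive.ql.io.sarg.SearchArgument | getSearchArgument() |
Boolean | getSkipCorruptRecords() |
boolean | getTolerateMissingSchema() |
Boolean | getUseZeroCopy() |
Reader.Options | include(boolean[] include) | Set the list of columns to read.
Reader.Options | includeAcidColumns(boolean includeAcidColumns) | true if acid metadata columns should be decoded; otherwise they will be set to null.
boolean | isAllowSARGToFilter() | Get allowSARGToFilter value.
Reader.Options | isSchemaEvolutionCaseAware(boolean value) | Set boolean flag to determine if the comparison of field names in schema evolution is case sensitive.
int | minSeekSize() |
Reader.Options | minSeekSize(int minSeekSize) |
double | minSeekSizeTolerance() |
Reader.Options | minSeekSizeTolerance(double value) |
List<String> | pluginAllowListFilters() |
Reader.Options | pluginAllowListFilters(String... allowLists) |
Reader.Options | positionalEvolutionLevel(int value) | Set number of levels to force schema evolution to be positional instead of based on the column names.
Reader.Options | range(long offset, long length) | Set the range of bytes to read.
Reader.Options | rowBatchSize(int value) |
Reader.Options | schema(TypeDescription schema) | Set the schema on read type description.
Reader.Options | searchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument sarg, String[] columnNames) | Set search argument for predicate push down.
Reader.Options | setRowFilter(String[] filterColumnNames, Consumer<OrcFilterContext> filterCallback) | Set a row level filter.
Reader.Options | skipCorruptRecords(boolean value) | Set whether to skip corrupt records.
Reader.Options | tolerateMissingSchema(boolean value) | Set whether to make a best effort to tolerate schema evolution for files which do not have an embedded schema because they were written with a pre-HIVE-4243 writer.
String | toString() |
boolean | useSelected() |
Reader.Options | useSelected(boolean newValue) |
Reader.Options | useZeroCopy(boolean value) | Set whether to use zero copy from HDFS.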
-
Constructor Details
-
Options
public Options()
- Since:
- 1.1.0
-
Options
- Since:
- 1.1.0
-
-
Method Details
-
include
public Reader.Options include(boolean[] include)
Set the list of columns to read.
- Parameters:
include - a list of columns to read
- Returns:
- this
- Since:
- 1.1.0
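As a rough illustration of how such a list might be built, the sketch below assumes the array is indexed by ORC column id, with the root struct at id 0 and the remaining ids assigned in pre-order. It is plain Java with no ORC dependency: the IncludeMask class and its build helper are hypothetical names introduced here, and the column ids are written by hand rather than looked up from a TypeDescription.

```java
import java.util.Arrays;

class IncludeMask {
    // Builds a boolean[] of length columnCount with true at each selected column id.
    // Assumption: index 0 is the root struct, so it is always marked as included.
    static boolean[] build(int columnCount, int... selectedIds) {
        if (columnCount <= 0) {
            throw new IllegalArgumentException("columnCount must be positive");
        }
        boolean[] include = new boolean[columnCount];
        include[0] = true;
        for (int id : selectedIds) {
            if (id < 0 || id >= columnCount) {
                throw new IllegalArgumentException("column id out of range: " + id);
            }
            include[id] = true;
        }
        return include;
    }

    public static void main(String[] args) {
        // Hypothetical schema struct<a:int,b:string,c:double>: root=0, a=1, b=2, c=3.
        boolean[] include = build(4, 1, 3);
        System.out.println(Arrays.toString(include)); // prints [true, true, false, true]
    }
}
```

The resulting array is the kind of value that would be passed to include(boolean[]).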
-
range
public Reader.Options range(long offset, long length)
Set the range of bytes to read.
- Parameters:
offset - the starting byte offset
length - the number of bytes to read
- Returns:
- this
- Since:
- 1.1.0
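One way byte ranges might be used is to divide a file among several readers, each given its own offset and length. The sketch below only computes such ranges. It is plain Java with no ORC dependency; the RangeSplitter class, its Range type, and the maxOffset helper are hypothetical names, and how a reader maps a byte range onto stripes is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

class RangeSplitter {
    // A hypothetical (offset, length) pair, matching the two arguments of range().
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        // End of the range, clamped to Long.MAX_VALUE if offset + length overflows.
        long maxOffset() {
            long end = offset + length;
            return end < 0 ? Long.MAX_VALUE : end;
        }
    }

    // Splits [0, fileLength) into `parts` contiguous ranges whose lengths differ by at most one byte.
    static List<Range> split(long fileLength, int parts) {
        if (fileLength < 0 || parts <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and parts must be > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long base = fileLength / parts;
        long remainder = fileLength % parts;
        long offset = 0;
        for (int i = 0; i < parts; i++) {
            long length = base + (i < remainder ? 1 : 0);
            ranges.add(new Range(offset, length));
            offset += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : split(1000, 3)) {
            System.out.println(r.offset + " " + r.length);
        }
    }
}
```

Each computed pair could then be handed to a separate Options instance through range(offset, length).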
-
schema
public Reader.Options schema(TypeDescription schema)
Set the schema on read type description.
- Since:
- 1.1.0
-
setRowFilter
public Reader.Options setRowFilter(String[] filterColumnNames, Consumer<OrcFilterContext> filterCallback)
Set a row level filter. This is an advanced feature that allows the caller to specify a list of columns that are read first and then a filter that is called to determine which rows, if any, should be read. Callers should expect the batches that come from the reader to use the selected array set by their filter. Use cases for this are predicates that SearchArgs can't represent, such as relationships between columns (e.g. columnA == columnB).
- Parameters:
filterColumnNames - the names of the columns that are read before the filter is applied. Only top level columns in the reader's schema can be used here and they must not be duplicated.
filterCallback - a callback function that performs filtering during the call to RecordReader.nextBatch. This function should not reference any static fields or modify the passed in ColumnVectors, but should set the filter output using the selected array.
- Returns:
- this
- Since:
- 1.7.0
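The selected-array contract described above can be sketched without the ORC classes. In the example below, MiniBatch is a hypothetical stand-in for the batch that the real callback receives through OrcFilterContext, and RowFilterSketch and aEqualsB are likewise names invented for illustration. It assumes a fresh batch in which no rows have been filtered yet, and implements the columnA == columnB predicate by writing only the selected array and the row count, leaving the column data untouched.

```java
import java.util.function.Consumer;

class RowFilterSketch {
    // Hypothetical stand-in for a row batch with two long columns and a selected array.
    static final class MiniBatch {
        final long[] columnA;
        final long[] columnB;
        final int[] selected;   // indices of the rows that passed the filter
        boolean selectedInUse;  // true once `selected` is meaningful
        int size;               // number of live rows

        MiniBatch(long[] columnA, long[] columnB) {
            if (columnA.length != columnB.length) {
                throw new IllegalArgumentException("column lengths differ");
            }
            this.columnA = columnA;
            this.columnB = columnB;
            this.selected = new int[columnA.length];
            this.size = columnA.length;
        }
    }

    // Returns a filter that keeps rows where columnA == columnB.
    // It records the surviving row indices in `selected` and does not modify the columns.
    static Consumer<MiniBatch> aEqualsB() {
        return batch -> {
            int kept = 0;
            for (int row = 0; row < batch.size; row++) {
                if (batch.columnA[row] == batch.columnB[row]) {
                    batch.selected[kept++] = row;
                }
            }
            batch.selectedInUse = true;
            batch.size = kept;
        };
    }

    public static void main(String[] args) {
        MiniBatch batch = new MiniBatch(new long[]{1, 2, 3, 4}, new long[]{1, 0, 3, 9});
        aEqualsB().accept(batch);
        for (int i = 0; i < batch.size; i++) {
            System.out.println("row " + batch.selected[i]); // prints "row 0" then "row 2"
        }
    }
}
```

A consumer of such a batch would read only the rows listed in the first size entries of selected, which is what "use the selected array" means for the caller.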
-
searchArgument
public Reader.Options searchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument sarg, String[] columnNames)
Set search argument for predicate push down.
- Parameters:
sarg - the search argument
columnNames - the column names for
- Returns:
- this
- Since:
- 1.1.0
-
allowSARGToFilter
public Reader.Options allowSARGToFilter(boolean allowSARGToFilter)
Set allowSARGToFilter.
- Parameters:
allowSARGToFilter -
- Returns:
- this
- Since:
- 1.7.0
-
isAllowSARGToFilter
public boolean isAllowSARGToFilter()
Get allowSARGToFilter value.
- Returns:
- allowSARGToFilter
- Since:
- 1.7.0
-
useZeroCopy
public Reader.Options useZeroCopy(boolean value)
Set whether to use zero copy from HDFS.
- Parameters:
value - the new zero copy flag
- Returns:
- this
- Since:
- 1.1.0
-
dataReader
public Reader.Options dataReader(DataReader value)
Set dataReader.
- Parameters:
value - the new dataReader.
- Returns:
- this
- Since:
- 1.1.0
-
skipCorruptRecords
public Reader.Options skipCorruptRecords(boolean value)
Set whether to skip corrupt records.
- Parameters:
value - the new skip corrupt records flag
- Returns:
- this
- Since:
- 1.1.0
-
tolerateMissingSchema
public Reader.Options tolerateMissingSchema(boolean value)
Set whether to make a best effort to tolerate schema evolution for files which do not have an embedded schema because they were written with a pre-HIVE-4243 writer.
- Parameters:
value - the new tolerance flag
- Returns:
- this
- Since:
- 1.2.0
-
forcePositionalEvolution
public Reader.Options forcePositionalEvolution(boolean value)
Set whether to force schema evolution to be positional instead of based on the column names.
- Parameters:
value - force positional evolution
- Returns:
- this
- Since:
- 1.3.0
-
positionalEvolutionLevel
public Reader.Options positionalEvolutionLevel(int value)
Set number of levels to force schema evolution to be positional instead of based on the column names.
- Parameters:
value - number of levels of positional schema evolution
- Returns:
- this
- Since:
- 1.5.11
-
isSchemaEvolutionCaseAware
public Reader.Options isSchemaEvolutionCaseAware(boolean value)
Set boolean flag to determine if the comparison of field names in schema evolution is case sensitive.
- Parameters:
value - whether the comparison of field names in schema evolution is case sensitive
- Returns:
- this
- Since:
- 1.5.0
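The difference between name-based matching (case sensitive or not) and positional matching can be shown with a toy example. This is not ORC's schema evolution code: FieldMatchSketch and both helpers are hypothetical, the sketch handles only a flat list of top level field names, and it ignores types and nesting levels entirely.

```java
import java.util.Arrays;
import java.util.List;

class FieldMatchSketch {
    // For each reader field, returns the index of the file field with the same name, or -1.
    // caseAware=true compares names exactly; caseAware=false ignores case.
    static int[] matchByName(List<String> readerFields, List<String> fileFields, boolean caseAware) {
        int[] match = new int[readerFields.size()];
        for (int i = 0; i < match.length; i++) {
            match[i] = -1;
            String wanted = readerFields.get(i);
            for (int j = 0; j < fileFields.size(); j++) {
                String candidate = fileFields.get(j);
                boolean same = caseAware ? wanted.equals(candidate) : wanted.equalsIgnoreCase(candidate);
                if (same) {
                    match[i] = j;
                    break;
                }
            }
        }
        return match;
    }

    // Positional matching: reader field i maps to file field i when it exists, else -1.
    static int[] matchByPosition(int readerCount, int fileCount) {
        int[] match = new int[readerCount];
        for (int i = 0; i < readerCount; i++) {
            match[i] = i < fileCount ? i : -1;
        }
        return match;
    }

    public static void main(String[] args) {
        List<String> reader = Arrays.asList("name", "id");
        List<String> file = Arrays.asList("ID", "Name");
        System.out.println(Arrays.toString(matchByName(reader, file, true)));  // prints [-1, -1]
        System.out.println(Arrays.toString(matchByName(reader, file, false))); // prints [1, 0]
        System.out.println(Arrays.toString(matchByPosition(2, 2)));            // prints [0, 1]
    }
}
```

With these inputs the three strategies give three different mappings, which is why the choice of forcePositionalEvolution and isSchemaEvolutionCaseAware can change which file column a reader column is populated from.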
-
includeAcidColumns
public Reader.Options includeAcidColumns(boolean includeAcidColumns)
true if acid metadata columns should be decoded; otherwise they will be set to null.
- Since:
- 1.5.3
-
getInclude
public boolean[] getInclude()
- Since:
- 1.1.0
-
getOffset
public long getOffset()
- Since:
- 1.1.0
-
getLength
public long getLength()
- Since:
- 1.1.0
-
getSchema
public TypeDescription getSchema()
- Since:
- 1.1.0
-
getSearchArgument
public org.apache.hadoop.hive.ql.io.sarg.SearchArgument getSearchArgument()
- Since:
- 1.1.0
-
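getFilterCallback
public Consumer<OrcFilterContext> getFilterCallback()
- Since:
- 1.7.0
-
getPreFilterColumnNames
public String[] getPreFilterColumnNames()
- Since:
- 1.7.0
-
getColumnNames
public String[] getColumnNames()
- Since: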
- 1.1.0
-
getMaxOffset
public long getMaxOffset()
- Since:
- 1.1.0
-
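getUseZeroCopy
public Boolean getUseZeroCopy()
- Since:
- 1.1.0
-
getSkipCorruptRecords
public Boolean getSkipCorruptRecords()
- Since:
- 1.1.0
-
getDataReader
public DataReader getDataReader()
- Since: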
- 1.1.0
-
getForcePositionalEvolution
public boolean getForcePositionalEvolution()
- Since:
- 1.3.0
-
getPositionalEvolutionLevel
public int getPositionalEvolutionLevel()
- Since:
- 1.5.11
-
getIsSchemaEvolutionCaseAware
public boolean getIsSchemaEvolutionCaseAware()
- Since:
- 1.5.0
-
getIncludeAcidColumns
public boolean getIncludeAcidColumns()
- Since:
- 1.5.3
-
clone
public Reader.Options clone()
-
toString
public String toString()
-
getTolerateMissingSchema
public boolean getTolerateMissingSchema()
- Since:
- 1.2.0
-
useSelected
public boolean useSelected()
- Since:
- 1.7.0
-
useSelected
public Reader.Options useSelected(boolean newValue)
- Since:
- 1.7.0
-
allowPluginFilters
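public boolean allowPluginFilters()
-
allowPluginFilters
public Reader.Options allowPluginFilters(boolean allowPluginFilters)
-
pluginAllowListFilters
public List<String> pluginAllowListFilters()
-
pluginAllowListFilters
public Reader.Options pluginAllowListFilters(String... allowLists)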
-
minSeekSize
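public int minSeekSize()
- Since:
- 1.8.0
-
minSeekSize
public Reader.Options minSeekSize(int minSeekSize)
- Since:
- 1.8.0
-
minSeekSizeTolerance
public double minSeekSizeTolerance()
- Since:
- 1.8.0
-
minSeekSizeTolerance
public Reader.Options minSeekSizeTolerance(double value)
- Since: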
- 1.8.0
-
getRowBatchSize
public int getRowBatchSize()
- Since:
- 1.9.0
-
rowBatchSize
public Reader.Options rowBatchSize(int value)
- Since:
- 1.9.0
-