java.lang.Object

org.apache.orc.Reader.Options

All Implemented Interfaces: Cloneable

Enclosing interface: Reader

public static class Reader.Options extends Object implements Cloneable

Options for creating a RecordReader.

Since: 1.1.0

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| Options() | |
| Options(Configuration conf) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| boolean | allowPluginFilters() | |
| Reader.Options | allowPluginFilters(boolean allowPluginFilters) | |
| Reader.Options | allowSARGToFilter(boolean allowSARGToFilter) | Set allowSARGToFilter. |
| Reader.Options | clone() | |
| Reader.Options | dataReader(DataReader value) | Set dataReader. |
| Reader.Options | forcePositionalEvolution(boolean value) | Set whether to force schema evolution to be positional instead of based on the column names. |
| String[] | getColumnNames() | |
| DataReader | getDataReader() | |
| Consumer<OrcFilterContext> | getFilterCallback() | |
| boolean | getForcePositionalEvolution() | |
| boolean[] | getInclude() | |
| boolean | getIncludeAcidColumns() | |
| boolean | getIsSchemaEvolutionCaseAware() | |
| long | getLength() | |
| long | getMaxOffset() | |
| long | getOffset() | |
| int | getPositionalEvolutionLevel() | |
| String[] | getPreFilterColumnNames() | |
| int | getRowBatchSize() | |
| TypeDescription | getSchema() | |
| org.apache.hadoop.hive.ql.io.sarg.SearchArgument | getSearchArgument() | |
| Boolean | getSkipCorruptRecords() | |
| boolean | getTolerateMissingSchema() | |
| Boolean | getUseZeroCopy() | |
| Reader.Options | include(boolean[] include) | Set the list of columns to read. |
| Reader.Options | includeAcidColumns(boolean includeAcidColumns) | Set to true if ACID metadata columns should be decoded; otherwise they will be set to null. |
| boolean | isAllowSARGToFilter() | Get allowSARGToFilter value. |
| Reader.Options | isSchemaEvolutionCaseAware(boolean value) | Set the flag that determines whether the comparison of field names in schema evolution is case sensitive. |
| int | minSeekSize() | |
| Reader.Options | minSeekSize(int minSeekSize) | |
| double | minSeekSizeTolerance() | |
| Reader.Options | minSeekSizeTolerance(double value) | |
| List<String> | pluginAllowListFilters() | |
| Reader.Options | pluginAllowListFilters(String... allowLists) | |
| Reader.Options | positionalEvolutionLevel(int value) | Set the number of levels for which schema evolution is forced to be positional instead of based on the column names. |
| Reader.Options | range(long offset, long length) | Set the range of bytes to read. |
| Reader.Options | rowBatchSize(int value) | |
| Reader.Options | schema(TypeDescription schema) | Set the schema on read type description. |
| Reader.Options | searchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument sarg, String[] columnNames) | Set search argument for predicate push down. |
| Reader.Options | setRowFilter(String[] filterColumnNames, Consumer<OrcFilterContext> filterCallback) | Set a row level filter. |
| Reader.Options | skipCorruptRecords(boolean value) | Set whether to skip corrupt records. |
| Reader.Options | tolerateMissingSchema(boolean value) | Set whether to make a best effort to tolerate schema evolution for files which do not have an embedded schema because they were written with a pre-HIVE-4243 writer. |
| String | toString() | |
| boolean | useSelected() | |
| Reader.Options | useSelected(boolean newValue) | |
| Reader.Options | useZeroCopy(boolean value) | Set whether to use zero copy from HDFS. |

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Constructor Details
- Options
  
  public Options()
  
  Since:
  
  1.1.0
- Options
  
  public Options(Configuration conf)
  
  Since:
  
  1.1.0
Method Details
- include
  
  public Reader.Options include(boolean[] include)
  
  Set the list of columns to read.
  
  Parameters:
  
  include - a list of columns to read
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
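
As a rough illustration of how such an array might be built: ORC numbers columns by a pre-order walk of the schema, with the root struct as id 0, and the `include` array is understood to be indexed by those ids. The sketch below assumes a flat schema whose top-level fields are all primitive, so field `i` has column id `i + 1`. `IncludeMask` and `forFlatSchema` are hypothetical names, not part of ORC.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ORC: builds an include array for a flat schema.
final class IncludeMask {
    private IncludeMask() {}

    /**
     * Assumes every top-level field is primitive, so field i has column id i + 1
     * and the root struct has column id 0.
     */
    static boolean[] forFlatSchema(List<String> fieldNames, List<String> wanted) {
        boolean[] include = new boolean[fieldNames.size() + 1];
        include[0] = true; // always keep the root struct
        for (String name : wanted) {
            int index = fieldNames.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown column: " + name);
            }
            include[index + 1] = true;
        }
        return include;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "name", "price");
        boolean[] include = forFlatSchema(fields, Arrays.asList("id", "price"));
        System.out.println(Arrays.toString(include)); // [true, true, false, true]
        // The array would then be passed to the options, e.g. options.include(include).
    }
}
```

For schemas with nested types the ids no longer line up with field positions, so a real implementation would need to walk the type tree instead.
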
- range
  
  public Reader.Options range(long offset, long length)
  
  Set the range of bytes to read.
  
  Parameters:
  
  offset - the starting byte offset
  
  length - the number of bytes to read
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
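
A typical use of `range` is to give each parallel reader its own slice of a file. The sketch below only does the arithmetic: it cuts a file into contiguous `{offset, length}` pairs and computes the exclusive end of each slice. It assumes that `getMaxOffset()` corresponds to `offset + length`, clamped on overflow. `ByteRanges`, `split`, and `maxOffset` are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ORC: computes byte ranges for parallel readers.
final class ByteRanges {
    private ByteRanges() {}

    /** Exclusive end of a range; clamps to Long.MAX_VALUE if offset + length overflows. */
    static long maxOffset(long offset, long length) {
        long end = offset + length;
        return end < 0 ? Long.MAX_VALUE : end;
    }

    /** Splits [0, fileLength) into contiguous {offset, length} pairs of at most splitSize bytes. */
    static List<long[]> split(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            ranges.add(new long[] {offset, Math.min(splitSize, fileLength - offset)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(250, 100)) {
            // Each pair would be passed along as options.range(r[0], r[1]).
            System.out.println(r[0] + " + " + r[1] + " -> " + maxOffset(r[0], r[1]));
        }
    }
}
```
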
- schema
  
  public Reader.Options schema(TypeDescription schema)
  
  Set the schema on read type description.
  
  Since:
  
  1.1.0
- setRowFilter
  
  public Reader.Options setRowFilter(String[] filterColumnNames, Consumer<OrcFilterContext> filterCallback)
  
  Set a row level filter. This is an advanced feature that lets the caller specify a list of columns that are read first and a filter that is then called to determine which rows, if any, should be read. Callers should expect the batches that come from the reader to use the selected array set by their filter. Use cases include predicates that SearchArguments cannot represent, such as relationships between columns (e.g. columnA == columnB).
  
  Parameters:
  
  filterColumnNames - the names of the columns that are read before the filter is applied. Only top-level columns in the reader's schema can be used here, and they must not be duplicated.
  
  filterCallback - a function callback to perform filtering during the call to RecordReader.nextBatch. This function should neither reference any static fields nor modify the passed-in ColumnVectors; it should set the filter output using the selected array.
  
  Returns:
  
  this
  
  Since:
  
  1.7.0
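
To make the "selected array" idea concrete, here is a minimal stand-alone sketch of the `columnA == columnB` case. It uses plain `long[]` arrays as stand-ins for the ColumnVectors a real callback would receive, and `SelectedArrayDemo` and `selectEqual` are hypothetical names. It does not call the ORC API.

```java
import java.util.Arrays;

// Hypothetical stand-in for a row filter: plain arrays replace ColumnVectors.
final class SelectedArrayDemo {
    private SelectedArrayDemo() {}

    /**
     * Writes the indices of rows where columnA equals columnB into selected
     * and returns how many rows were kept. The input columns are not modified.
     */
    static int selectEqual(long[] columnA, long[] columnB, int rowCount, int[] selected) {
        int size = 0;
        for (int row = 0; row < rowCount; row++) {
            if (columnA[row] == columnB[row]) {
                selected[size++] = row;
            }
        }
        return size;
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3, 4};
        long[] b = {1, 5, 3, 0};
        int[] selected = new int[a.length];
        int size = selectEqual(a, b, a.length, selected);
        // Only the first `size` entries of selected are meaningful.
        System.out.println(Arrays.toString(Arrays.copyOf(selected, size))); // [0, 2]
        // In a real callback, the result would be published through the
        // OrcFilterContext (selected array, selected size, selected-in-use flag).
    }
}
```

Code consuming such a batch would iterate over `selected[0..size)` rather than over every row index.
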
- searchArgument
  
  public Reader.Options searchArgument(org.apache.hadoop.hive.ql.io.sarg.SearchArgument sarg, String[] columnNames)
  
  Set search argument for predicate push down.
  
  Parameters:
  
  sarg - the search argument
  
  columnNames - the column names for
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
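
Predicate push down generally works by skipping chunks of data whose statistics rule out a match. The following is a conceptual sketch only, not ORC's implementation: it assumes each group of rows carries a minimum and maximum for one column and keeps the groups that could contain a value equal to a target. `MinMaxPruning` and `groupsToRead` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Conceptual illustration of statistics-based pruning; not ORC code.
final class MinMaxPruning {
    private MinMaxPruning() {}

    /** Returns the indices of groups whose [min, max] range could contain target. */
    static List<Integer> groupsToRead(long[] mins, long[] maxes, long target) {
        if (mins.length != maxes.length) {
            throw new IllegalArgumentException("mins and maxes must have the same length");
        }
        List<Integer> keep = new ArrayList<>();
        for (int group = 0; group < mins.length; group++) {
            if (mins[group] <= target && target <= maxes[group]) {
                keep.add(group);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        long[] mins = {0, 100, 200};
        long[] maxes = {99, 199, 299};
        System.out.println(groupsToRead(mins, maxes, 150)); // [1]
    }
}
```

A surviving group can still contain rows that do not match, so pruning of this kind narrows what is read rather than filtering individual rows.
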
- allowSARGToFilter
  
  public Reader.Options allowSARGToFilter(boolean allowSARGToFilter)
  
  Set allowSARGToFilter.
  
  Parameters:
  
  allowSARGToFilter -
  
  Returns:
  
  this
  
  Since:
  
  1.7.0
- isAllowSARGToFilter
  
  public boolean isAllowSARGToFilter()
  
  Get allowSARGToFilter value.
  
  Returns:
  
  allowSARGToFilter
  
  Since:
  
  1.7.0
- useZeroCopy
  
  public Reader.Options useZeroCopy(boolean value)
  
  Set whether to use zero copy from HDFS.
  
  Parameters:
  
  value - the new zero copy flag
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
- dataReader
  
  public Reader.Options dataReader(DataReader value)
  
  Set dataReader.
  
  Parameters:
  
  value - the new dataReader.
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
- skipCorruptRecords
  
  public Reader.Options skipCorruptRecords(boolean value)
  
  Set whether to skip corrupt records.
  
  Parameters:
  
  value - the new skip corrupt records flag
  
  Returns:
  
  this
  
  Since:
  
  1.1.0
- tolerateMissingSchema
  
  public Reader.Options tolerateMissingSchema(boolean value)
  
  Set whether to make a best effort to tolerate schema evolution for files which do not have an embedded schema because they were written with a pre-HIVE-4243 writer.
  
  Parameters:
  
  value - the new tolerance flag
  
  Returns:
  
  this
  
  Since:
  
  1.2.0
- forcePositionalEvolution
  
  public Reader.Options forcePositionalEvolution(boolean value)
  
  Set whether to force schema evolution to be positional instead of based on the column names.
  
  Parameters:
  
  value - force positional evolution
  
  Returns:
  
  this
  
  Since:
  
  1.3.0
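
The difference between the two matching strategies can be sketched without the ORC API. The example assumes a file whose fields carry placeholder names such as `_col0`, so matching by name finds nothing while matching by position still lines the columns up. `EvolutionMapping` and `mapColumns` are hypothetical names, and the logic is a simplification for top-level fields only.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration of name-based versus positional matching; not ORC code.
final class EvolutionMapping {
    private EvolutionMapping() {}

    /**
     * For each reader field, returns the index of the matching file field,
     * or -1 if the file has no counterpart.
     */
    static int[] mapColumns(List<String> fileFields, List<String> readerFields, boolean positional) {
        int[] mapping = new int[readerFields.size()];
        for (int i = 0; i < mapping.length; i++) {
            if (positional) {
                mapping[i] = i < fileFields.size() ? i : -1;
            } else {
                mapping[i] = fileFields.indexOf(readerFields.get(i));
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("_col0", "_col1");
        List<String> reader = Arrays.asList("id", "name", "email");
        System.out.println(Arrays.toString(mapColumns(file, reader, false))); // [-1, -1, -1]
        System.out.println(Arrays.toString(mapColumns(file, reader, true)));  // [0, 1, -1]
    }
}
```
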
- positionalEvolutionLevel
  
  public Reader.Options positionalEvolutionLevel(int value)
  
  Set number of levels to force schema evolution to be positional instead of based on the column names.
  
  Parameters:
  
  value - number of levels of positional schema evolution
  
  Returns:
  
  this
  
  Since:
  
  1.5.11
- isSchemaEvolutionCaseAware
  
  public Reader.Options isSchemaEvolutionCaseAware(boolean value)
  
  Set the flag that determines whether the comparison of field names in schema evolution is case sensitive.
  
  Parameters:
  
  value - whether field names are compared case sensitively during schema evolution
  
  Returns:
  
  this
  
  Since:
  
  1.5.0
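
As a small sketch of what the flag changes, the lookup below compares a reader field name against the file's field names either exactly or ignoring case. `FieldLookup` and `find` are hypothetical names, and the code only illustrates the comparison, not how ORC resolves fields internally.

```java
import java.util.Arrays;
import java.util.List;

// Illustration of case-aware versus case-insensitive field lookup; not ORC code.
final class FieldLookup {
    private FieldLookup() {}

    /** Returns the index of the first file field matching readerField, or -1 if none matches. */
    static int find(List<String> fileFields, String readerField, boolean caseAware) {
        for (int i = 0; i < fileFields.size(); i++) {
            String candidate = fileFields.get(i);
            boolean match = caseAware
                ? candidate.equals(readerField)
                : candidate.equalsIgnoreCase(readerField);
            if (match) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("ID", "Name");
        System.out.println(find(file, "id", true));  // -1
        System.out.println(find(file, "id", false)); // 0
    }
}
```
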
- includeAcidColumns
  
  public Reader.Options includeAcidColumns(boolean includeAcidColumns)
  
  Set to true if ACID metadata columns should be decoded; otherwise they will be set to null.
  
  Since:
  
  1.5.3
- getInclude
  
  public boolean[] getInclude()
  
  Since:
  
  1.1.0
- getOffset
  
  public long getOffset()
  
  Since:
  
  1.1.0
- getLength
  
  public long getLength()
  
  Since:
  
  1.1.0
- getSchema
  
  public TypeDescription getSchema()
  
  Since:
  
  1.1.0
- getSearchArgument
  
  public org.apache.hadoop.hive.ql.io.sarg.SearchArgument getSearchArgument()
  
  Since:
  
  1.1.0
- getFilterCallback
  
  public Consumer<OrcFilterContext> getFilterCallback()
  
  Since:
  
  1.7.0
- getPreFilterColumnNames
  
  public String[] getPreFilterColumnNames()
  
  Since:
  
  1.7.0
- getColumnNames
  
  public String[] getColumnNames()
  
  Since:
  
  1.1.0
- getMaxOffset
  
  public long getMaxOffset()
  
  Since:
  
  1.1.0
- getUseZeroCopy
  
  public Boolean getUseZeroCopy()
  
  Since:
  
  1.1.0
- getSkipCorruptRecords
  
  public Boolean getSkipCorruptRecords()
  
  Since:
  
  1.1.0
- getDataReader
  
  public DataReader getDataReader()
  
  Since:
  
  1.1.0
- getForcePositionalEvolution
  
  public boolean getForcePositionalEvolution()
  
  Since:
  
  1.3.0
- getPositionalEvolutionLevel
  
  public int getPositionalEvolutionLevel()
  
  Since:
  
  1.5.11
- getIsSchemaEvolutionCaseAware
  
  public boolean getIsSchemaEvolutionCaseAware()
  
  Since:
  
  1.5.0
- getIncludeAcidColumns
  
  public boolean getIncludeAcidColumns()
  
  Since:
  
  1.5.3
- clone
  
  public Reader.Options clone()
  
  Overrides:
  
  clone in class Object
  
  Since:
  
  1.1.0
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
  
  Since:
  
  1.1.0
- getTolerateMissingSchema
  
  public boolean getTolerateMissingSchema()
  
  Since:
  
  1.2.0
- useSelected
  
  public boolean useSelected()
  
  Since:
  
  1.7.0
- useSelected
  
  public Reader.Options useSelected(boolean newValue)
  
  Since:
  
  1.7.0
- allowPluginFilters
  
  public boolean allowPluginFilters()
- allowPluginFilters
  
  public Reader.Options allowPluginFilters(boolean allowPluginFilters)
- pluginAllowListFilters
  
  public List<String> pluginAllowListFilters()
- pluginAllowListFilters
  
  public Reader.Options pluginAllowListFilters(String... allowLists)
- minSeekSize
  
  public int minSeekSize()
  
  Since:
  
  1.8.0
- minSeekSize
  
  public Reader.Options minSeekSize(int minSeekSize)
  
  Since:
  
  1.8.0
- minSeekSizeTolerance
  
  public double minSeekSizeTolerance()
  
  Since:
  
  1.8.0
- minSeekSizeTolerance
  
  public Reader.Options minSeekSizeTolerance(double value)
  
  Since:
  
  1.8.0
- getRowBatchSize
  
  public int getRowBatchSize()
  
  Since:
  
  1.9.0
- rowBatchSize
  
  public Reader.Options rowBatchSize(int value)
  
  Since:
  
  1.9.0
