Package org.apache.orc.impl
Class RecordReaderImpl
java.lang.Object
org.apache.orc.impl.RecordReaderImpl
- All Implemented Interfaces:
Closeable, AutoCloseable, RecordReader
-
Nested Class Summary
Nested Classes:
static final class
static class
static final class
-
Field Summary
Fields:
static final OrcProto.ColumnStatistics EMPTY_COLUMN_STATISTICS
protected final Path path
protected final TypeDescription schema
-
Constructor Summary
Constructors:
protected RecordReaderImpl(ReaderImpl fileReader, Reader.Options options)
-
Method Summary
Methods:
void close()
    Release the resources associated with the given reader.
static String encodeTranslatedSargColumn(int rootColumn, Integer indexInSourceTable)
static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter)
    Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter, boolean useUTCTimestamp)
    Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
int getMaxDiskRangeChunkLimit()
float getProgress()
    Return the fraction of rows that have been read from the selected section of the file.
long getRowNumber()
    Get the row number of the row that will be returned by the following call to next().
static int[] mapSargColumnsToOrcInternalColIdx(List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves, SchemaEvolution evolution)
    Find the mapping from predicate leaves to columns.
static int[] mapTranslatedSargColumns(List<OrcProto.Type> types, List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves)
boolean nextBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch)
    Read the next row batch.
protected boolean[] pickRowGroups()
    Pick the row groups that we need to load from the current stripe.
OrcIndex readRowIndex(int stripeIndex, boolean[] included, boolean[] readCols)
readStripeFooter(StripeInformation stripe)
void seekToRow(long rowNumber)
    Seek to a particular row number.
-
Field Details
-
EMPTY_COLUMN_STATISTICS
static final OrcProto.ColumnStatistics EMPTY_COLUMN_STATISTICS
-
path
protected final Path path
-
schema
protected final TypeDescription schema
-
-
Constructor Details
-
RecordReaderImpl
protected RecordReaderImpl(ReaderImpl fileReader, Reader.Options options) throws IOException
- Throws:
IOException
-
-
Method Details
-
mapSargColumnsToOrcInternalColIdx
public static int[] mapSargColumnsToOrcInternalColIdx(List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves, SchemaEvolution evolution)
Find the mapping from predicate leaves to columns.
- Parameters:
sargLeaves - the search argument that we need to map
evolution - the mapping from reader to file schema
- Returns:
- an array mapping the sarg leaves to concrete column numbers in the file
-
evaluatePredicate
public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter)
Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
- Parameters:
stats - the statistics for the column mentioned in the predicate
predicate - the leaf predicate we need to evaluate
- Returns:
- the set of truth values that may be returned for the given predicate.
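A minimal usage sketch follows. It assumes a hypothetical file example.orc with a LONG column named x at file column index 1, and passes null for the bloom filter on the assumption that none is available; a result of NO or NO_NULL means no row can satisfy the predicate.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
    import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
    import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;
    import org.apache.orc.ColumnStatistics;
    import org.apache.orc.OrcFile;
    import org.apache.orc.Reader;
    import org.apache.orc.impl.RecordReaderImpl;

    public class PredicateCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical file with a LONG column "x" at file column index 1.
        Reader reader = OrcFile.createReader(new Path("example.orc"),
            OrcFile.readerOptions(conf));
        ColumnStatistics[] stats = reader.getStatistics();

        // Build a single-leaf search argument: x < 100.
        SearchArgument sarg = SearchArgumentFactory.newBuilder()
            .startAnd()
            .lessThan("x", PredicateLeaf.Type.LONG, 100L)
            .end()
            .build();
        PredicateLeaf leaf = sarg.getLeaves().get(0);

        // No bloom filter available, so pass null.
        SearchArgument.TruthValue result =
            RecordReaderImpl.evaluatePredicate(stats[1], leaf, null);

        // NO / NO_NULL means no row in the file can satisfy the predicate.
        boolean skippable = result == SearchArgument.TruthValue.NO
            || result == SearchArgument.TruthValue.NO_NULL;
        System.out.println("Predicate can be skipped: " + skippable);
      }
    }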
-
evaluatePredicate
public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter, boolean useUTCTimestamp)
Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate. Includes an option to specify whether timestamp column statistics values should be in UTC.
- Parameters:
stats - the statistics for the column mentioned in the predicate
predicate - the leaf predicate we need to evaluate
bloomFilter -
useUTCTimestamp -
- Returns:
- the set of truth values that may be returned for the given predicate.
-
pickRowGroups
protected boolean[] pickRowGroups() throws IOException
Pick the row groups that we need to load from the current stripe.
- Returns:
- an array with a boolean for each row group or null if all of the row groups must be read.
- Throws:
IOException
-
nextBatch
public boolean nextBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) throws IOException
Description copied from interface: RecordReader
Read the next row batch. The size of the batch to read cannot be controlled by the callers. Callers need to look at VectorizedRowBatch.size of the returned object to know the batch size read.
- Specified by:
nextBatch in interface RecordReader
- Parameters:
batch - a row batch object to read into
- Returns:
- were more rows available to read?
- Throws:
IOException
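A minimal read-loop sketch, assuming a hypothetical file example.orc: each iteration checks VectorizedRowBatch.size for the number of rows actually read, and also shows where getRowNumber() and getProgress() (documented below) fit.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
    import org.apache.orc.OrcFile;
    import org.apache.orc.Reader;
    import org.apache.orc.RecordReader;

    public class ReadLoop {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Reader reader = OrcFile.createReader(new Path("example.orc"),
            OrcFile.readerOptions(conf));
        try (RecordReader rows = reader.rows()) {
          VectorizedRowBatch batch = reader.getSchema().createRowBatch();
          while (rows.nextBatch(batch)) {
            // batch.size is the number of rows actually read into this batch.
            for (int r = 0; r < batch.size; ++r) {
              // process row r of the batch
            }
            // Optional bookkeeping: next row to be returned and fraction consumed.
            long nextRow = rows.getRowNumber();
            float progress = rows.getProgress();
          }
        }
      }
    }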
-
close
public void close() throws IOException
Description copied from interface: RecordReader
Release the resources associated with the given reader.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in interface RecordReader
- Throws:
IOException
-
getRowNumber
public long getRowNumber()
Description copied from interface: RecordReader
Get the row number of the row that will be returned by the following call to next().
- Specified by:
getRowNumber in interface RecordReader
- Returns:
- the row number from 0 to the number of rows in the file
-
getProgress
public float getProgress()
Return the fraction of rows that have been read from the selected section of the file.
- Specified by:
getProgress in interface RecordReader
- Returns:
- fraction between 0.0 and 1.0 of rows consumed
-
readRowIndex
public OrcIndex readRowIndex(int stripeIndex, boolean[] included, boolean[] readCols) throws IOException
- Throws:
IOException
-
seekToRow
public void seekToRow(long rowNumber) throws IOException
Description copied from interface: RecordReader
Seek to a particular row number.
- Specified by:
seekToRow in interface RecordReader
- Throws:
IOException
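Continuing the read-loop sketch under nextBatch above, a caller might position the reader before reading; the target row 1000 is only an illustrative value.

    // Position the reader so the next batch starts at row 1000 (hypothetical value).
    rows.seekToRow(1000L);
    VectorizedRowBatch batch = reader.getSchema().createRowBatch();
    if (rows.nextBatch(batch)) {
      // batch now holds up to batch.size rows beginning at row 1000
    }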
-
encodeTranslatedSargColumn
public static String encodeTranslatedSargColumn(int rootColumn, Integer indexInSourceTable)
-
mapTranslatedSargColumns
public static int[] mapTranslatedSargColumns(List<OrcProto.Type> types, List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves)
-
getCompressionCodec
-
getMaxDiskRangeChunkLimit
public int getMaxDiskRangeChunkLimit()
-