Package org.apache.orc.impl
Class RecordReaderImpl
java.lang.Object
org.apache.orc.impl.RecordReaderImpl
- All Implemented Interfaces:
Closeable, AutoCloseable, RecordReader
-
Nested Class Summary
Modifier and Type | Class | Description
static final class
static class
static final class
-
Field Summary
Modifier and Type | Field | Description
static final OrcProto.ColumnStatistics | EMPTY_COLUMN_STATISTICS
protected final Path | path
protected final TypeDescription | schema
-
Constructor Summary
Modifier | Constructor | Description
protected | RecordReaderImpl(ReaderImpl fileReader, Reader.Options options)
-
Method Summary
Modifier and Type | Method | Description
void | close() | Release the resources associated with the given reader.
static String | encodeTranslatedSargColumn(int rootColumn, Integer indexInSourceTable)
static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue | evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter) | Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue | evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter, boolean useUTCTimestamp) | Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
 | getCompressionCodec()
int | getMaxDiskRangeChunkLimit()
float | getProgress() | Return the fraction of rows that have been read from the selected section of the file.
long | getRowNumber() | Get the row number of the first row that the next call to nextBatch will return.
static int[] | mapSargColumnsToOrcInternalColIdx(List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves, SchemaEvolution evolution) | Find the mapping from predicate leaves to columns.
static int[] | mapTranslatedSargColumns(List<OrcProto.Type> types, List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves)
boolean | nextBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) | Read the next row batch.
protected boolean[] | pickRowGroups() | Pick the row groups that we need to load from the current stripe.
OrcIndex | readRowIndex(int stripeIndex, boolean[] included, boolean[] readCols)
 | readStripeFooter(StripeInformation stripe)
void | seekToRow(long rowNumber) | Seek to a particular row number.
-
Field Details
-
EMPTY_COLUMN_STATISTICS
-
path
-
schema
-
-
Constructor Details
-
RecordReaderImpl
protected RecordReaderImpl(ReaderImpl fileReader, Reader.Options options) throws IOException
- Throws:
IOException
-
-
Method Details
-
mapSargColumnsToOrcInternalColIdx
public static int[] mapSargColumnsToOrcInternalColIdx(List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves, SchemaEvolution evolution)
Find the mapping from predicate leaves to columns.
- Parameters:
sargLeaves - the search argument that we need to map
evolution - the mapping from reader to file schema
- Returns:
an array mapping the sarg leaves to concrete column numbers in the file
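The idea of mapping predicate leaves to column numbers can be illustrated with a small standalone sketch. It does not use the ORC classes: the class and method names are hypothetical, columns are modelled as a flat array of names, and the use of -1 for a leaf whose column is not found is an assumption of the sketch.

```java
// Hypothetical, simplified sketch; this is not the ORC implementation.
final class SargColumnMappingSketch {

    /**
     * Maps each predicate leaf's column name to its position in the file's
     * column list. Unknown columns map to -1 (an assumption of this sketch).
     */
    static int[] mapLeavesToColumns(String[] leafColumnNames, String[] fileColumnNames) {
        int[] result = new int[leafColumnNames.length];
        for (int i = 0; i < leafColumnNames.length; i++) {
            result[i] = -1; // default when no file column matches the leaf
            for (int c = 0; c < fileColumnNames.length; c++) {
                if (fileColumnNames[c].equals(leafColumnNames[i])) {
                    result[i] = c;
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] fileColumns = {"id", "name", "price"};
        String[] leaves = {"price", "id", "missing"};
        // prints [2, 0, -1]
        System.out.println(java.util.Arrays.toString(mapLeavesToColumns(leaves, fileColumns)));
    }
}
```

The result has one entry per leaf, in leaf order, so a caller can look up the column for leaf i at index i.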
-
evaluatePredicate
public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter)
Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
- Parameters:
stats - the statistics for the column mentioned in the predicate
predicate - the leaf predicate we need to evaluate
bloomFilter - the Bloom filter for the column mentioned in the predicate
- Returns:
the set of truth values that may be returned for the given predicate.
-
evaluatePredicate
public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter, boolean useUTCTimestamp)
Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate. Includes an option to specify whether timestamp column statistics values should be in UTC.
- Parameters:
stats - the statistics for the column mentioned in the predicate
predicate - the leaf predicate we need to evaluate
bloomFilter - the Bloom filter for the column mentioned in the predicate
useUTCTimestamp - whether timestamp column statistics values should be in UTC
- Returns:
the set of truth values that may be returned for the given predicate.
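To make "the set of truth values that may be returned" concrete, here is a minimal standalone sketch of evaluating a predicate against min/max statistics. It is a simplified model and not the ORC code: the class, the method names, and the restriction to long columns are all assumptions, and Bloom filters and timestamps are ignored. The enum only mirrors the constant names of SearchArgument.TruthValue.

```java
// Hypothetical, simplified model of statistics-based predicate evaluation.
final class StatsPredicateSketch {

    // Same constant names as SearchArgument.TruthValue, redefined locally
    // so that the sketch has no dependencies.
    enum Truth { YES, NO, NULL, YES_NULL, NO_NULL, YES_NO, YES_NO_NULL }

    /** Possible outcomes of "column = value" for rows with the given statistics. */
    static Truth evaluateEquals(long min, long max, boolean hasNull, long value) {
        if (value < min || value > max) {
            // The value lies outside the range, so no non-null row can match.
            return hasNull ? Truth.NO_NULL : Truth.NO;
        }
        if (min == max) {
            // Every non-null row holds exactly this value.
            return hasNull ? Truth.YES_NULL : Truth.YES;
        }
        // The value is inside the range: some rows may match, others may not.
        return hasNull ? Truth.YES_NO_NULL : Truth.YES_NO;
    }

    /** Possible outcomes of "column < value" for rows with the given statistics. */
    static Truth evaluateLessThan(long min, long max, boolean hasNull, long value) {
        if (max < value) {
            return hasNull ? Truth.YES_NULL : Truth.YES;
        }
        if (min >= value) {
            return hasNull ? Truth.NO_NULL : Truth.NO;
        }
        return hasNull ? Truth.YES_NO_NULL : Truth.YES_NO;
    }

    public static void main(String[] args) {
        System.out.println(evaluateEquals(10, 20, false, 42));   // NO
        System.out.println(evaluateEquals(10, 20, true, 15));    // YES_NO_NULL
        System.out.println(evaluateLessThan(10, 20, false, 25)); // YES
    }
}
```

A result of NO or NO_NULL is what lets a reader skip the rows covered by those statistics; any result that includes YES means the rows still have to be read and checked.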
-
pickRowGroups
protected boolean[] pickRowGroups() throws IOException
Pick the row groups that we need to load from the current stripe.
- Returns:
an array with a boolean for each row group or null if all of the row groups must be read.
- Throws:
IOException
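The return convention (one boolean per row group, or null when every row group must be read) can be sketched as follows. This is an illustrative model only: the class name, the parallel min/max arrays, and the single equality predicate are assumptions and do not reflect how RecordReaderImpl obtains or evaluates its row index.

```java
// Hypothetical sketch of selecting row groups from per-row-group min/max values.
final class RowGroupPickerSketch {

    /**
     * Decides which row groups could contain a row with column == value.
     *
     * @param mins  minimum column value of each row group
     * @param maxs  maximum column value of each row group
     * @param value the value the predicate compares against
     * @return one flag per row group, or null if every row group must be read
     */
    static boolean[] pickRowGroups(long[] mins, long[] maxs, long value) {
        boolean[] include = new boolean[mins.length];
        boolean allSelected = true;
        for (int i = 0; i < mins.length; i++) {
            // A row group can only match if the value falls inside its range.
            include[i] = value >= mins[i] && value <= maxs[i];
            allSelected &= include[i];
        }
        // Follow the documented convention: null means "read everything".
        return allSelected ? null : include;
    }

    public static void main(String[] args) {
        boolean[] picked = pickRowGroups(
            new long[]{0, 100, 200}, new long[]{99, 199, 299}, 150);
        // prints [false, true, false]
        System.out.println(java.util.Arrays.toString(picked));
    }
}
```

Callers of a method with this contract need a null check before indexing into the result.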
-
nextBatch
public boolean nextBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) throws IOException
Description copied from interface: RecordReader
Read the next row batch. Callers cannot control the size of the batch that is read; check VectorizedRowBatch.size on the batch after the call to find out how many rows were read.
- Specified by:
nextBatch in interface RecordReader
- Parameters:
batch - a row batch object to read into
- Returns:
whether more rows were available to read
- Throws:
IOException
-
close
public void close() throws IOException
Description copied from interface: RecordReader
Release the resources associated with the given reader.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in interface RecordReader
- Throws:
IOException
-
getRowNumber
public long getRowNumber()
Description copied from interface: RecordReader
Get the row number of the first row that the next call to nextBatch will return.
- Specified by:
getRowNumber in interface RecordReader
- Returns:
the row number from 0 to the number of rows in the file
-
getProgress
public float getProgress()
Return the fraction of rows that have been read from the selected section of the file.
- Specified by:
getProgress in interface RecordReader
- Returns:
fraction between 0.0 and 1.0 of rows consumed
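A fraction of this kind can be computed as in the sketch below. The class name, the parameters, and the handling of an empty selection are assumptions made for illustration; the sketch does not describe how RecordReaderImpl tracks its position.

```java
// Hypothetical sketch of a progress fraction in the range 0.0 to 1.0.
final class ProgressSketch {

    /**
     * @param rowsConsumed rows already read from the selected section
     * @param totalRows    total rows in the selected section
     * @return fraction of rows consumed, clamped to [0.0, 1.0]
     */
    static float progress(long rowsConsumed, long totalRows) {
        if (totalRows <= 0) {
            // Assumption of this sketch: an empty selection counts as fully read.
            return 1.0f;
        }
        float fraction = (float) rowsConsumed / totalRows;
        return Math.max(0.0f, Math.min(1.0f, fraction));
    }

    public static void main(String[] args) {
        System.out.println(progress(250, 1000)); // 0.25
    }
}
```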
-
readRowIndex
public OrcIndex readRowIndex(int stripeIndex, boolean[] included, boolean[] readCols) throws IOException
- Throws:
IOException
-
seekToRow
public void seekToRow(long rowNumber) throws IOException
Description copied from interface: RecordReader
Seek to a particular row number.
- Specified by:
seekToRow in interface RecordReader
- Throws:
IOException
-
encodeTranslatedSargColumn
public static String encodeTranslatedSargColumn(int rootColumn, Integer indexInSourceTable)
-
mapTranslatedSargColumns
public static int[] mapTranslatedSargColumns(List<OrcProto.Type> types, List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves)
-
getCompressionCodec
-
getMaxDiskRangeChunkLimit
public int getMaxDiskRangeChunkLimit()
-