Class RecordReaderImpl

java.lang.Object
org.apache.orc.impl.RecordReaderImpl
All Implemented Interfaces:
Closeable, AutoCloseable, RecordReader

public class RecordReaderImpl extends Object implements RecordReader
  • Field Details

  • Constructor Details

  • Method Details

    • mapSargColumnsToOrcInternalColIdx

      public static int[] mapSargColumnsToOrcInternalColIdx(List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves, SchemaEvolution evolution)
      Find the mapping from predicate leaves to columns.
      Parameters:
      sargLeaves - the search argument that we need to map
      evolution - the mapping from reader to file schema
      Returns:
      an array mapping the sarg leaves to concrete column numbers in the file
    • readStripeFooter

      public OrcProto.StripeFooter readStripeFooter(StripeInformation stripe) throws IOException
      Throws:
      IOException
    • evaluatePredicate

      public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter)
      Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate.
      Parameters:
      stats - the statistics for the column mentioned in the predicate
      predicate - the leaf predicate we need to evaluate
      Returns:
      the set of truth values that may be returned for the given predicate.
    • evaluatePredicate

      public static org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue evaluatePredicate(ColumnStatistics stats, org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf predicate, BloomFilter bloomFilter, boolean useUTCTimestamp)
      Evaluate a predicate with respect to the statistics from the column that is referenced in the predicate. Includes option to specify if timestamp column stats values should be in UTC.
      Parameters:
      stats - the statistics for the column mentioned in the predicate
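      predicate - the leaf predicate we need to evaluate
      bloomFilter - the Bloom filter for the column referenced in the predicate
      useUTCTimestamp - whether timestamp column statistics values should be in UTC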
      Returns:
      the set of truth values that may be returned for the given predicate.
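
      To illustrate how column statistics can yield a set of possible truth values, here is a minimal sketch that evaluates two kinds of leaf predicate against a column's minimum and maximum. It is a simplified model and not ORC's implementation: the class MinMaxPredicateSketch, the reduced Truth enum, and both methods are hypothetical names, and nulls, Bloom filters, and timestamp handling are left out.

```java
// Simplified model of statistics-based predicate evaluation.
// Every name here is hypothetical; this is not ORC's implementation.
final class MinMaxPredicateSketch {
    // Reduced set of truth values. The real TruthValue enum also tracks nulls.
    enum Truth { YES, NO, YES_NO }

    // Evaluate "column = literal" against the column's min/max statistics.
    static Truth evaluateEquals(long min, long max, long literal) {
        if (literal < min || literal > max) {
            return Truth.NO;      // the literal lies outside the value range
        }
        if (min == max) {
            return Truth.YES;     // every value equals min, which equals the literal
        }
        return Truth.YES_NO;      // some rows may match and some may not
    }

    // Evaluate "column < literal" against the same statistics.
    static Truth evaluateLessThan(long min, long max, long literal) {
        if (max < literal) {
            return Truth.YES;     // even the largest value is below the literal
        }
        if (min >= literal) {
            return Truth.NO;      // even the smallest value is not below the literal
        }
        return Truth.YES_NO;
    }

    public static void main(String[] args) {
        System.out.println(evaluateEquals(10, 20, 25));
        System.out.println(evaluateEquals(10, 20, 15));
        System.out.println(evaluateLessThan(10, 20, 30));
    }
}
```

      A result of NO lets a reader skip the data covered by the statistics, while YES_NO means the data still has to be read and filtered row by row.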
    • pickRowGroups

      protected boolean[] pickRowGroups() throws IOException
      Pick the row groups that we need to load from the current stripe.
      Returns:
      an array with a boolean for each row group or null if all of the row groups must be read.
      Throws:
      IOException
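
      The return contract above (one boolean per row group, or null when every row group must be read) can be sketched as follows. This is an illustrative model only: RowGroupPickerSketch and its parameters are hypothetical, and it assumes a single equality predicate checked against per-row-group min/max values rather than the search argument machinery the real method uses.

```java
// Hypothetical model of row group selection; not ORC's implementation.
final class RowGroupPickerSketch {
    // mins[i] and maxs[i] are the assumed statistics of row group i.
    // Returns a flag per row group, or null if all row groups must be read.
    static boolean[] pickRowGroups(long[] mins, long[] maxs, long literal) {
        boolean[] selected = new boolean[mins.length];
        boolean allSelected = true;
        for (int i = 0; i < mins.length; i++) {
            // A row group can only contain the literal if it lies within [min, max].
            selected[i] = literal >= mins[i] && literal <= maxs[i];
            allSelected &= selected[i];
        }
        // Mirror the documented contract: null means "read everything".
        return allSelected ? null : selected;
    }

    public static void main(String[] args) {
        boolean[] picked = pickRowGroups(
            new long[] {0, 100, 200}, new long[] {99, 199, 299}, 150);
        System.out.println(java.util.Arrays.toString(picked));
    }
}
```

      Callers of such a method have to treat null as "no filtering" before indexing into the array.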
    • nextBatch

      public boolean nextBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) throws IOException
      Description copied from interface: RecordReader
      Read the next row batch. Callers cannot control the size of the batch that is read. Callers need to look at VectorizedRowBatch.size of the returned object to know how many rows were read.
      Specified by:
      nextBatch in interface RecordReader
      Parameters:
      batch - a row batch object to read into
      Returns:
      true if more rows were available to read
      Throws:
      IOException
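
      The calling pattern this contract implies is a loop that reuses one batch object and consults its size after every call. The sketch below models that pattern with stand-in types so that it runs without ORC on the classpath: Batch, Reader, and sumAll are hypothetical and are not the ORC or Hive classes, although the loop shape is the same one a caller would use with VectorizedRowBatch.

```java
// Hypothetical stand-ins that model the nextBatch contract; not the ORC classes.
final class BatchLoopSketch {
    // Stand-in for VectorizedRowBatch: a reusable buffer plus a size field.
    static final class Batch {
        final long[] values;
        int size;
        Batch(int capacity) { values = new long[capacity]; }
    }

    // Stand-in reader over an in-memory array of rows.
    static final class Reader {
        private final long[] rows;
        private int next;
        Reader(long[] rows) { this.rows = rows; }

        // Fills the batch with as many rows as fit and records the count in size.
        boolean nextBatch(Batch batch) {
            int n = Math.min(batch.values.length, rows.length - next);
            System.arraycopy(rows, next, batch.values, 0, n);
            batch.size = n;
            next += n;
            return n > 0;   // false once no rows are left
        }
    }

    // Sums all rows, reading only the first batch.size entries of each batch.
    static long sumAll(long[] rows, int capacity) {
        Reader reader = new Reader(rows);
        Batch batch = new Batch(capacity);
        long sum = 0;
        while (reader.nextBatch(batch)) {
            for (int r = 0; r < batch.size; r++) {
                sum += batch.values[r];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumAll(new long[] {1, 2, 3, 4, 5}, 2));
    }
}
```

      The last batch is usually only partly filled, which is why the inner loop is bounded by size and not by the buffer's capacity.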
    • close

      public void close() throws IOException
      Description copied from interface: RecordReader
      Release the resources associated with the given reader.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in interface RecordReader
      Throws:
      IOException
    • getRowNumber

      public long getRowNumber()
      Description copied from interface: RecordReader
      Get the row number of the row that will be returned by the following call to next().
      Specified by:
      getRowNumber in interface RecordReader
      Returns:
      the row number from 0 to the number of rows in the file
    • getProgress

      public float getProgress()
      Return the fraction of rows that have been read from the selected section of the file.
      Specified by:
      getProgress in interface RecordReader
      Returns:
      fraction between 0.0 and 1.0 of rows consumed
    • readRowIndex

      public OrcIndex readRowIndex(int stripeIndex, boolean[] included, boolean[] readCols) throws IOException
      Throws:
      IOException
    • seekToRow

      public void seekToRow(long rowNumber) throws IOException
      Description copied from interface: RecordReader
      Seek to a particular row number.
      Specified by:
      seekToRow in interface RecordReader
      Throws:
      IOException
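
      One way to think about seeking by row number is as a lookup from a file-wide row number to a stripe and an offset within that stripe. The sketch below shows only that arithmetic. It is an assumption about how such a lookup could work, not a description of this class's internals: SeekSketch, locate, and the stripeRowCounts parameter are hypothetical, and rejecting a row number at or past the end of the file is a choice made for this sketch.

```java
// Hypothetical model of mapping a row number to a stripe; not ORC's implementation.
final class SeekSketch {
    // stripeRowCounts[i] is the assumed number of rows in stripe i.
    // Returns {stripeIndex, rowWithinStripe} for the given file-wide row number.
    static long[] locate(long[] stripeRowCounts, long rowNumber) {
        if (rowNumber < 0) {
            throw new IllegalArgumentException("negative row number: " + rowNumber);
        }
        long firstRowOfStripe = 0;
        for (int i = 0; i < stripeRowCounts.length; i++) {
            if (rowNumber < firstRowOfStripe + stripeRowCounts[i]) {
                return new long[] {i, rowNumber - firstRowOfStripe};
            }
            firstRowOfStripe += stripeRowCounts[i];
        }
        // In this sketch, a row number past the last row is treated as an error.
        throw new IllegalArgumentException("row " + rowNumber + " is past the end of the file");
    }

    public static void main(String[] args) {
        long[] position = locate(new long[] {100, 50, 75}, 120);
        System.out.println("stripe " + position[0] + ", row " + position[1]);
    }
}
```

      With stripes of 100, 50, and 75 rows, row 120 falls in the second stripe (index 1) at offset 20, because the first stripe covers rows 0 through 99.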
    • encodeTranslatedSargColumn

      public static String encodeTranslatedSargColumn(int rootColumn, Integer indexInSourceTable)
    • mapTranslatedSargColumns

      public static int[] mapTranslatedSargColumns(List<OrcProto.Type> types, List<org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf> sargLeaves)
    • getCompressionCodec

      public CompressionCodec getCompressionCodec()
    • getMaxDiskRangeChunkLimit

      public int getMaxDiskRangeChunkLimit()