java.lang.Object

org.apache.orc.impl.reader.StripePlanner

public class StripePlanner extends Object

This class parses the stripe information and applies the necessary filtering and selection.

It supports:

- column projection
- row group selection
- encryption
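
Column projection is expressed through the columnInclude argument of parseStripe, described below as an array with true for each column to read. The following is a minimal sketch of building such a mask. The names `ColumnMaskSketch` and `columnMask` are hypothetical helpers, not part of ORC, and the sketch assumes the array is indexed by column id.

```java
import java.util.Arrays;

final class ColumnMaskSketch {
    // Hypothetical helper, not part of ORC: one slot per column id,
    // set to true for every column that should be read.
    static boolean[] columnMask(int columnCount, int... selectedIds) {
        boolean[] include = new boolean[columnCount];
        for (int id : selectedIds) {
            if (id < 0 || id >= columnCount) {
                throw new IllegalArgumentException("column id out of range: " + id);
            }
            include[id] = true;
        }
        return include;
    }

    public static void main(String[] args) {
        // Select columns 0, 2 and 3 out of 5.
        boolean[] include = columnMask(5, 0, 2, 3);
        System.out.println(Arrays.toString(include)); // [true, false, true, true, false]
    }
}
```

An array built this way could then be passed as the columnInclude argument of parseStripe, assuming its length matches the number of columns in the file schema.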

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | StripePlanner.StreamInformation | |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| StripePlanner(StripePlanner old) | |
| StripePlanner(TypeDescription schema, ReaderEncryption encryption, DataReader dataReader, OrcFile.WriterVersion version, boolean ignoreNonUtf8BloomFilter, long maxBufferSize) | |
| StripePlanner(TypeDescription schema, ReaderEncryption encryption, DataReader dataReader, OrcFile.WriterVersion version, boolean ignoreNonUtf8BloomFilter, long maxBufferSize, Set<Integer> filterColIds) | Create a stripe parser. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | clearStreams() | Release all of the buffers for the current stripe. |
| OrcProto.ColumnEncoding | getEncoding(int column) | |
| InStream | getStream(StreamName name) | Get the stream for the given name. |
| String | getWriterTimezone() | |
| StripePlanner | parseStripe(StripeInformation stripe, boolean[] columnInclude) | Parse a new stripe. |
| BufferChunkList | readData(OrcIndex index, boolean[] rowGroupInclude, boolean forceDirect, TypeReader.ReadPhase readPhase) | Read the stripe data from the file. |
| BufferChunkList | readFollowData(OrcIndex index, boolean[] rowGroupInclude, int rgIdx, boolean forceDirect) | |
| OrcIndex | readRowIndex(boolean[] sargColumns, OrcIndex output) | Read and parse the indexes for the current stripe. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- StripePlanner
  
  public StripePlanner(TypeDescription schema, ReaderEncryption encryption, DataReader dataReader, OrcFile.WriterVersion version, boolean ignoreNonUtf8BloomFilter, long maxBufferSize, Set<Integer> filterColIds)
  
  Create a stripe parser.
  
  Parameters:
  
  schema - the file schema
  
  encryption - the encryption information
  
  dataReader - the underlying data reader
  
  version - the file writer version
  
  ignoreNonUtf8BloomFilter - ignore old non-utf8 bloom filters
  
  maxBufferSize - the largest single buffer to use
  
  filterColIds - the column ids that identify the filter columns
- StripePlanner
  
  public StripePlanner(TypeDescription schema, ReaderEncryption encryption, DataReader dataReader, OrcFile.WriterVersion version, boolean ignoreNonUtf8BloomFilter, long maxBufferSize)
- StripePlanner
  
  public StripePlanner(StripePlanner old)
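
The filterColIds parameter names the filter columns, and the readData description states that with readPhase = LEADERS only the data required for FILTER columns is read. The sketch below illustrates that split on a plain column mask. It is a simplified model under stated assumptions, not the ORC implementation: the `Phase` enum is a hypothetical stand-in for TypeReader.ReadPhase, `FOLLOWERS` is assumed here to mean the selected non-filter columns, and any handling of parent columns is ignored.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;

final class ReadPhaseSketch {
    // Hypothetical stand-in for the read phases; not the ORC enum.
    enum Phase { ALL, LEADERS, FOLLOWERS }

    // Narrows a column mask to the columns a given phase would read.
    // Assumption: both the mask and filterColIds use the same column ids.
    static boolean[] columnsFor(Phase phase, boolean[] columnInclude, Set<Integer> filterColIds) {
        boolean[] result = new boolean[columnInclude.length];
        for (int id = 0; id < columnInclude.length; id++) {
            boolean isFilter = filterColIds.contains(id);
            switch (phase) {
                case ALL:
                    result[id] = columnInclude[id];
                    break;
                case LEADERS:
                    result[id] = columnInclude[id] && isFilter;
                    break;
                case FOLLOWERS:
                    result[id] = columnInclude[id] && !isFilter;
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] include = {true, true, false, true};
        Set<Integer> filterColIds = Collections.singleton(1);
        System.out.println(Arrays.toString(columnsFor(Phase.LEADERS, include, filterColIds)));
        System.out.println(Arrays.toString(columnsFor(Phase.FOLLOWERS, include, filterColIds)));
    }
}
```

Reading the filter columns first lets a caller evaluate its filter before paying for the remaining columns, which appears to be the purpose of passing filterColIds to the constructor.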
Method Details
- parseStripe
  
  public StripePlanner parseStripe(StripeInformation stripe, boolean[] columnInclude) throws IOException
  
  Parse a new stripe. Resets the current stripe state.
  
  Parameters:
  
  stripe - the new stripe
  
  columnInclude - an array with true for each column to read
  
  Returns:
  
  this for method chaining
  
  Throws:
  
  IOException
- readData
  
  public BufferChunkList readData(OrcIndex index, boolean[] rowGroupInclude, boolean forceDirect, TypeReader.ReadPhase readPhase) throws IOException
  
  Read the stripe data from the file.
  
  Parameters:
  
  index - null for no row filters or the index for filtering
  
  rowGroupInclude - null for all of the rows or an array with a boolean for each row group in the current stripe
  
  forceDirect - should direct buffers be created?
  
  readPhase - determines which columns are read; for example, if readPhase = LEADERS, only the data required for FILTER columns is read
  
  Returns:
  
  the buffers that were read
  
  Throws:
  
  IOException
- readFollowData
  
  public BufferChunkList readFollowData(OrcIndex index, boolean[] rowGroupInclude, int rgIdx, boolean forceDirect) throws IOException
  
  Throws:
  
  IOException
- getWriterTimezone
  
  public String getWriterTimezone()
- getStream
  
  public InStream getStream(StreamName name) throws IOException
  
  Get the stream for the given name. It is assumed that the name does not have the encryption set, because the TreeReaders don't know whether they are reading encrypted data. Assumes that readData has already been called on this stripe.
  
  Parameters:
  
  name - the column/kind of the stream
  
  Returns:
  
  a new stream with the options set correctly
  
  Throws:
  
  IOException
- clearStreams
  
  public void clearStreams()
  
  Release all of the buffers for the current stripe.
- getEncoding
  
  public OrcProto.ColumnEncoding getEncoding(int column)
- readRowIndex
  
  public OrcIndex readRowIndex(boolean[] sargColumns, OrcIndex output) throws IOException
  
  Read and parse the indexes for the current stripe.
  
  Parameters:
  
  sargColumns - the columns we can use bloom filters for
  
  output - an OrcIndex to reuse
  
  Returns:
  
  the indexes for the required columns
  
  Throws:
  
  IOException
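
The rowGroupInclude argument of readData and readFollowData is documented as null for all of the rows, or an array with a boolean for each row group in the current stripe. The sketch below builds such an array from per-row-group minimum and maximum values. The names `RowGroupMaskSketch` and `rowGroupMask`, and the use of plain `long` min/max arrays as input, are assumptions made for illustration; they are not part of ORC, and how a real reader derives its selection is not covered here.

```java
import java.util.Arrays;

final class RowGroupMaskSketch {
    // Hypothetical helper, not part of ORC. Keeps a row group when its
    // [min, max] range overlaps the requested [lo, hi] range.
    // Returns null when every row group is kept, following the documented
    // "null for all of the rows" convention.
    static boolean[] rowGroupMask(long[] mins, long[] maxs, long lo, long hi) {
        if (mins.length != maxs.length) {
            throw new IllegalArgumentException("mins and maxs must have the same length");
        }
        boolean[] include = new boolean[mins.length];
        boolean all = true;
        for (int rg = 0; rg < mins.length; rg++) {
            include[rg] = maxs[rg] >= lo && mins[rg] <= hi;
            all &= include[rg];
        }
        return all ? null : include;
    }

    public static void main(String[] args) {
        long[] mins = {0, 100, 200};
        long[] maxs = {99, 199, 299};
        // Only row groups whose values can fall in [150, 250] are kept.
        System.out.println(Arrays.toString(rowGroupMask(mins, maxs, 150, 250)));
    }
}
```

Returning null instead of an all-true array keeps the "read everything" case distinguishable without scanning the mask.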
