Class TreeWriterBase

java.lang.Object
org.apache.orc.impl.writer.TreeWriterBase
All Implemented Interfaces:
TreeWriter
Direct Known Subclasses:
BinaryTreeWriter, BooleanTreeWriter, ByteTreeWriter, DateTreeWriter, Decimal64TreeWriter, DecimalTreeWriter, DoubleTreeWriter, FloatTreeWriter, IntegerTreeWriter, ListTreeWriter, MapTreeWriter, StringBaseTreeWriter, StructTreeWriter, TimestampTreeWriter, UnionTreeWriter

public abstract class TreeWriterBase extends Object implements TreeWriter
The parent class of all column writers. Because this class is abstract, each column is written by an instance of one of its subclasses. The compound types (struct, list, map, and union) have child tree writers that write their child types.
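The parent/child arrangement described above can be pictured with a minimal sketch. The classes below (SketchWriter, SketchPrimitiveWriter, SketchStructWriter) are hypothetical stand-ins invented for illustration; they are not the ORC classes and share none of their fields or constructors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a column-writer tree. NOT the ORC API.
abstract class SketchWriter {
    final int columnId;
    final List<SketchWriter> children = new ArrayList<>();

    SketchWriter(int columnId) {
        this.columnId = columnId;
    }

    // Counts this writer plus every writer below it in the tree.
    int countWriters() {
        int total = 1;
        for (SketchWriter child : children) {
            total += child.countWriters();
        }
        return total;
    }
}

// A primitive column (for example an int or a string) has no children.
final class SketchPrimitiveWriter extends SketchWriter {
    SketchPrimitiveWriter(int columnId) {
        super(columnId);
    }
}

// A compound column owns one child writer per field.
final class SketchStructWriter extends SketchWriter {
    SketchStructWriter(int columnId, SketchWriter... fields) {
        super(columnId);
        children.addAll(Arrays.asList(fields));
    }
}
```

For a schema shaped like struct<a:int,b:struct<c:string>>, this model would hold four writers: the root struct, a, the nested struct b, and c.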
  • Method Details

    • getRowIndex

      protected OrcProto.RowIndex.Builder getRowIndex()
    • getStripeStatistics

      protected ColumnStatisticsImpl getStripeStatistics()
    • getRowIndexEntry

      protected OrcProto.RowIndexEntry.Builder getRowIndexEntry()
    • writeRootBatch

      public void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) throws IOException
      Handle the top-level object write. This default method is used for all root types except struct; a struct root is the typical case and is handled separately. VectorizedRowBatch assumes the top-level object is a struct, so for all other types the values are taken from the batch's first column.
      Specified by:
      writeRootBatch in interface TreeWriter
      Parameters:
      batch - the batch to write from
      offset - the row to start on
      length - the number of rows to write
      Throws:
      IOException
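The difference between a struct root and any other root can be sketched as follows. This is an illustration under stated assumptions, not the ORC implementation: RootSketchBatch, RootSketchWriter, and RootSketchStructWriter are hypothetical names, columns are modeled as plain long arrays, and the struct override is an assumption about how a struct root spreads its fields across the batch columns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row batch: one long[] per top-level column.
final class RootSketchBatch {
    final long[][] cols;

    RootSketchBatch(long[][] cols) {
        this.cols = cols;
    }
}

// Hypothetical writer that just remembers the values it was asked to write.
class RootSketchWriter {
    final List<Long> written = new ArrayList<>();

    // Writes `length` values from `vector`, starting at index `offset`.
    void writeBatch(long[] vector, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            written.add(vector[i]);
        }
    }

    // Default behavior: the root is not a struct, so its values
    // are read from the first column of the batch.
    void writeRootBatch(RootSketchBatch batch, int offset, int length) {
        writeBatch(batch.cols[0], offset, length);
    }
}

// Assumed struct behavior: field i is read from batch column i.
final class RootSketchStructWriter extends RootSketchWriter {
    final RootSketchWriter[] fields;

    RootSketchStructWriter(RootSketchWriter... fields) {
        this.fields = fields;
    }

    @Override
    void writeRootBatch(RootSketchBatch batch, int offset, int length) {
        for (int f = 0; f < fields.length; f++) {
            fields[f].writeBatch(batch.cols[f], offset, length);
        }
    }
}
```

In this model, a non-struct root ignores every column but the first, while a struct root hands the same offset and length to each of its fields.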
    • writeBatch

      public void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) throws IOException
      Write length values from the given vector, starting at offset.
      Specified by:
      writeBatch in interface TreeWriter
      Parameters:
      vector - the vector to write from
      offset - the first value from the vector to write
      length - the number of values from the vector to write
      Throws:
      IOException
    • prepareStripe

      public void prepareStripe(int stripeId)
      Description copied from interface: TreeWriter
      Set up for the next stripe.
      Specified by:
      prepareStripe in interface TreeWriter
      Parameters:
      stripeId - the next stripe id
    • flushStreams

      public void flushStreams() throws IOException
      Description copied from interface: TreeWriter
      Flush the TreeWriter stream.
      Specified by:
      flushStreams in interface TreeWriter
      Throws:
      IOException
    • writeStripe

      public void writeStripe(int requiredIndexEntries) throws IOException
      Description copied from interface: TreeWriter
      Write the stripe out to the file.
      Specified by:
      writeStripe in interface TreeWriter
      Parameters:
      requiredIndexEntries - the number of index entries that are required. This is used to check that the row index is well formed.
      Throws:
      IOException
    • createRowIndexEntry

      public void createRowIndexEntry() throws IOException
      Create a row index entry with the previous location and the current index statistics. Also merges the index statistics into the file statistics before they are cleared. Finally, it records the start of the next index and ensures all of the children columns also create an entry.
      Specified by:
      createRowIndexEntry in interface TreeWriter
      Throws:
      IOException
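The sequence described for createRowIndexEntry can be sketched as below. This is a simplified model, not the ORC implementation: IndexSketchWriter and all of its fields are hypothetical, statistics are reduced to a single value count, and the stream position is modeled as a running total of values written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the row-index bookkeeping. NOT the ORC API.
final class IndexSketchWriter {
    final List<IndexSketchWriter> children = new ArrayList<>();
    // Each entry is {startPosition, valueCount} for one index group.
    final List<long[]> rowIndex = new ArrayList<>();

    long indexCount;      // statistics for the current index group
    long fileCount;       // statistics accumulated so far
    long entryStart;      // position where the current group began
    long currentPosition; // stand-in for the current stream position

    // Pretend to write `values` values into the column.
    void write(int values) {
        indexCount += values;
        currentPosition += values;
    }

    void createRowIndexEntry() {
        // 1. Emit an entry with the previous location and current index statistics.
        rowIndex.add(new long[] {entryStart, indexCount});
        // 2. Merge the index statistics into the accumulated statistics, then clear them.
        fileCount += indexCount;
        indexCount = 0;
        // 3. Record where the next index group starts.
        entryStart = currentPosition;
        // 4. Make sure every child column creates an entry as well.
        for (IndexSketchWriter child : children) {
            child.createRowIndexEntry();
        }
    }
}
```

Because the call recurses into the children, every column in this model ends up with the same number of index entries, which is the kind of property a well-formedness check on the row index could rely on.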
    • addStripeStatistics

      public void addStripeStatistics(StripeStatistics[] stats) throws IOException
      Description copied from interface: TreeWriter
      During a stripe append, we need to handle the stripe statistics.
      Specified by:
      addStripeStatistics in interface TreeWriter
      Parameters:
      stats - the statistics for the new stripe across the encryption variants
      Throws:
      IOException
    • estimateMemory

      public long estimateMemory()
      Estimate how much memory the writer is consuming, excluding the streams.
      Specified by:
      estimateMemory in interface TreeWriter
      Returns:
      the number of bytes.
    • writeFileStatistics

      public void writeFileStatistics() throws IOException
      Description copied from interface: TreeWriter
      Write the FileStatistics for each column in each encryption variant.
      Specified by:
      writeFileStatistics in interface TreeWriter
      Throws:
      IOException
    • getCurrentStatistics

      public void getCurrentStatistics(ColumnStatistics[] output)
      Description copied from interface: TreeWriter
      Get the current file statistics for each column. If a column is encrypted, the encrypted variant statistics are used.
      Specified by:
      getCurrentStatistics in interface TreeWriter
      Parameters:
      output - an array that is filled in with the results
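One way to picture the caller-supplied output array is the sketch below. It assumes that each writer stores its statistics at the slot matching its column id and then asks its children to do the same; the page does not state how the array is indexed, so treat that as an assumption. StatsSketchWriter is a hypothetical class, and statistics are modeled as plain strings.

```java
// Hypothetical model of filling a caller-supplied statistics array. NOT the ORC API.
final class StatsSketchWriter {
    final int columnId;
    final String stats; // stand-in for a column statistics object
    final StatsSketchWriter[] children;

    StatsSketchWriter(int columnId, String stats, StatsSketchWriter... children) {
        this.columnId = columnId;
        this.stats = stats;
        this.children = children;
    }

    // Assumption: the array is indexed by column id, one slot per column.
    void getCurrentStatistics(String[] output) {
        output[columnId] = stats;
        for (StatsSketchWriter child : children) {
            child.getCurrentStatistics(output);
        }
    }
}
```

Under this assumption the caller sizes the array to the number of columns in the schema, calls the method once on the root writer, and reads the results back by column id.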