Interface TreeWriter

All Known Implementing Classes:
BinaryTreeWriter, BooleanTreeWriter, ByteTreeWriter, CharTreeWriter, DateTreeWriter, Decimal64TreeWriter, DecimalTreeWriter, DoubleTreeWriter, EncryptionTreeWriter, FloatTreeWriter, IntegerTreeWriter, ListTreeWriter, MapTreeWriter, StringBaseTreeWriter, StringTreeWriter, StructTreeWriter, TimestampTreeWriter, TreeWriterBase, UnionTreeWriter, VarcharTreeWriter

public interface TreeWriter
The common interface for the type-specific writers. It defines the generic API that every one of them must implement.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Interface
    Description
    static class 
     
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    addStripeStatistics(StripeStatistics[] stripeStatistics)
    During a stripe append, we need to handle the stripe statistics.
    void
    createRowIndexEntry()
    Create a row index entry at the current point in the stripe.
    long
    estimateMemory()
    Estimate the memory currently used to buffer the stripe.
    void
    flushStreams()
    Flush the TreeWriter stream.
    void
    getCurrentStatistics(ColumnStatistics[] output)
    Get the current file statistics for each column.
    long
    getRawDataSize()
    Estimate the memory used if the file was read into Hive's Writable types.
    void
    prepareStripe(int stripeId)
    Set up for the next stripe.
    void
    writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length)
    Write a ColumnVector to the file.
    void
    writeFileStatistics()
    Write the FileStatistics for each column in each encryption variant.
    void
    writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length)
    Write a VectorizedRowBatch to the file.
    void
    writeStripe(int requiredIndexEntries)
    Write the stripe out to the file.
  • Method Details

    • estimateMemory

      long estimateMemory()
      Estimate the memory currently used to buffer the stripe.
      Returns:
      the number of bytes
    • getRawDataSize

      long getRawDataSize()
      Estimate the memory used if the file was read into Hive's Writable types. This is used as an estimate for the query optimizer.
      Returns:
      the number of bytes
    • prepareStripe

      void prepareStripe(int stripeId)
      Set up for the next stripe.
      Parameters:
      stripeId - the next stripe id
    • writeRootBatch

      void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) throws IOException
      Write a VectorizedRowBatch to the file. This is called by the WriterImplV2 at the top level.
      Parameters:
      batch - the list of all of the columns
      offset - the first row from the batch to write
      length - the number of rows to write
      Throws:
      IOException
    • writeBatch

      void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) throws IOException
      Write a ColumnVector to the file. This is called recursively by writeRootBatch.
      Parameters:
      vector - the data to write
      offset - the offset of the first value to write
      length - the number of values to write
      Throws:
      IOException
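The relationship between writeRootBatch and writeBatch can be sketched with simplified stand-in types. Nothing below is ORC code: `SimpleColumn`, `SimpleBatch`, `SketchWriter`, `LongSketchWriter`, and `RootSketchWriter` are hypothetical names, whereas the real interface operates on Hive's `ColumnVector` and `VectorizedRowBatch` and throws `IOException`. The sketch only illustrates, under those assumptions, how a root call might pass the same offset and length down to one writer per column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a column of long values (not Hive's ColumnVector).
class SimpleColumn {
    final long[] values;

    SimpleColumn(long... values) {
        this.values = values;
    }
}

// Hypothetical stand-in for a row batch: one SimpleColumn per field.
class SimpleBatch {
    final SimpleColumn[] cols;

    SimpleBatch(SimpleColumn... cols) {
        this.cols = cols;
    }
}

// Reduced version of the write half of the interface, for illustration only.
interface SketchWriter {
    void writeBatch(SimpleColumn vector, int offset, int length);
}

// Leaf writer: buffers the values in [offset, offset + length).
class LongSketchWriter implements SketchWriter {
    final List<Long> buffered = new ArrayList<>();

    @Override
    public void writeBatch(SimpleColumn vector, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            buffered.add(vector.values[i]);
        }
    }
}

// Root writer: forwards the same row range to each child column writer.
class RootSketchWriter {
    final LongSketchWriter[] children;

    RootSketchWriter(int columnCount) {
        children = new LongSketchWriter[columnCount];
        for (int c = 0; c < columnCount; c++) {
            children[c] = new LongSketchWriter();
        }
    }

    void writeRootBatch(SimpleBatch batch, int offset, int length) {
        for (int c = 0; c < children.length; c++) {
            children[c].writeBatch(batch.cols[c], offset, length);
        }
    }
}

class WriteBatchSketch {
    // Writes rows 1 and 2 of a two-column, four-row batch.
    static RootSketchWriter demo() {
        SimpleBatch batch = new SimpleBatch(
            new SimpleColumn(10, 11, 12, 13),
            new SimpleColumn(20, 21, 22, 23));
        RootSketchWriter root = new RootSketchWriter(2);
        root.writeRootBatch(batch, 1, 2);
        return root;
    }

    public static void main(String[] args) {
        RootSketchWriter root = demo();
        System.out.println(root.children[0].buffered); // [11, 12]
        System.out.println(root.children[1].buffered); // [21, 22]
    }
}
```

Here offset selects the first row and length the number of rows, so each child receives the same window of its own column.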
    • createRowIndexEntry

      void createRowIndexEntry() throws IOException
      Create a row index entry at the current point in the stripe.
      Throws:
      IOException
    • flushStreams

      void flushStreams() throws IOException
      Flush the TreeWriter stream.
      Throws:
      IOException
    • writeStripe

      void writeStripe(int requiredIndexEntries) throws IOException
      Write the stripe out to the file.
      Parameters:
      requiredIndexEntries - the number of index entries that are required; used to check that the row index is well formed
      Throws:
      IOException
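One plausible reading of the requiredIndexEntries check is that a writer counts the entries it records through createRowIndexEntry and compares that count with the caller's value when the stripe is written. The sketch below assumes exactly that. `IndexCheckSketch` is a hypothetical class, and the choice of `IllegalArgumentException` and the message text are assumptions for illustration, not documented behaviour of any implementing class.

```java
// Hypothetical sketch of the index-entry check; not the ORC implementation.
class IndexCheckSketch {
    private int indexEntries = 0;    // row index entries recorded in the current stripe
    private int stripesWritten = 0;  // stripes accepted so far

    // Record one row index entry at the current position.
    void createRowIndexEntry() {
        indexEntries++;
    }

    // Accept the stripe only if the recorded entry count matches the caller's.
    void writeStripe(int requiredIndexEntries) {
        if (indexEntries != requiredIndexEntries) {
            throw new IllegalArgumentException(
                "found " + indexEntries + " index entries, expected " + requiredIndexEntries);
        }
        stripesWritten++;
        indexEntries = 0;  // the next stripe starts with an empty index
    }

    int stripesWritten() {
        return stripesWritten;
    }

    // Returns true if a mismatched count is rejected.
    static boolean rejectsMismatch() {
        IndexCheckSketch writer = new IndexCheckSketch();
        writer.createRowIndexEntry();
        try {
            writer.writeStripe(2);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        IndexCheckSketch writer = new IndexCheckSketch();
        for (int i = 0; i < 3; i++) {
            writer.createRowIndexEntry();
        }
        writer.writeStripe(3);
        System.out.println(writer.stripesWritten()); // 1
        System.out.println(rejectsMismatch());       // true
    }
}
```

Resetting the counter after each stripe keeps the check local to one stripe, which matches the per-stripe wording of the parameter description.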
    • addStripeStatistics

      void addStripeStatistics(StripeStatistics[] stripeStatistics) throws IOException
      During a stripe append, we need to handle the stripe statistics.
      Parameters:
      stripeStatistics - the statistics for the new stripe across the encryption variants
      Throws:
      IOException
    • writeFileStatistics

      void writeFileStatistics() throws IOException
      Write the FileStatistics for each column in each encryption variant.
      Throws:
      IOException
    • getCurrentStatistics

      void getCurrentStatistics(ColumnStatistics[] output)
      Get the current file statistics for each column. If a column is encrypted, the encrypted variant statistics are used.
      Parameters:
      output - an array that is filled in with the results
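The "output array that is filled in" pattern can be illustrated with a small tree of writers that each store their statistics in a slot of a caller-supplied array. This is a sketch under stated assumptions: `CountStats` and `StatsNode` are hypothetical stand-ins (not ORC's `ColumnStatistics` or any real writer), and indexing the array by a per-column id is an assumption that the description above does not confirm.

```java
// Hypothetical stand-in for per-column statistics (not ORC's ColumnStatistics).
class CountStats {
    final long valueCount;

    CountStats(long valueCount) {
        this.valueCount = valueCount;
    }
}

// Hypothetical writer node that owns one column and any number of children.
class StatsNode {
    final int columnId;      // assumed: index of this column's slot in the output array
    final long valueCount;   // the only statistic tracked in this sketch
    final StatsNode[] children;

    StatsNode(int columnId, long valueCount, StatsNode... children) {
        this.columnId = columnId;
        this.valueCount = valueCount;
        this.children = children;
    }

    // Fill this node's slot, then let each child fill its own.
    void getCurrentStatistics(CountStats[] output) {
        output[columnId] = new CountStats(valueCount);
        for (StatsNode child : children) {
            child.getCurrentStatistics(output);
        }
    }
}

class CurrentStatisticsSketch {
    // Builds a root with two child columns and collects their statistics.
    static CountStats[] demo() {
        StatsNode root = new StatsNode(0, 4,
            new StatsNode(1, 4),
            new StatsNode(2, 3));
        CountStats[] output = new CountStats[3];
        root.getCurrentStatistics(output);
        return output;
    }

    public static void main(String[] args) {
        for (CountStats stats : demo()) {
            System.out.println(stats.valueCount);
        }
    }
}
```

Passing the array in, rather than returning one per writer, lets a single allocation collect results from the whole tree in one traversal.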