java.lang.Object

org.apache.orc.impl.writer.TreeWriterBase

org.apache.orc.impl.writer.StructTreeWriter

All Implemented Interfaces: TreeWriter

public class StructTreeWriter extends TreeWriterBase

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.orc.impl.writer.TreeWriter
TreeWriter.Factory
Field Summary

Fields inherited from class org.apache.orc.impl.writer.TreeWriterBase
bloomFilter, bloomFilterEntry, bloomFilterUtf8, context, createBloomFilter, encryption, fileStatistics, id, indexStatistics, isPresent, rowIndexPosition, schema, stripeColStatistics
Constructor Summary

Constructors

- StructTreeWriter(TypeDescription schema, WriterEncryptionVariant encryption, WriterContext context)
Method Summary

- void addStripeStatistics(StripeStatistics[] stats)
  During a stripe append, we need to handle the stripe statistics.
- void createRowIndexEntry()
  Create a row index entry with the previous location and the current index statistics.
- long estimateMemory()
  Estimate how much memory the writer is consuming excluding the streams.
- void flushStreams()
  Flush the TreeWriter stream.
- void getCurrentStatistics(ColumnStatistics[] output)
  Get the current file statistics for each column.
- long getRawDataSize()
  Estimate the memory used if the file was read into Hive's Writable types.
- void prepareStripe(int stripeId)
  Set up for the next stripe.
- void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length)
  Write the values from the given vector from offset for length elements.
- void writeFileStatistics()
  Write the FileStatistics for each column in each encryption variant.
- void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length)
  Handle the top level object write.
- void writeStripe(int requiredIndexEntries)
  Write the stripe out to the file.

Methods inherited from class org.apache.orc.impl.writer.TreeWriterBase
getRowIndex, getRowIndexEntry, getStripeStatistics

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- StructTreeWriter
  
  public StructTreeWriter(TypeDescription schema, WriterEncryptionVariant encryption, WriterContext context) throws IOException
  
  Throws:
  
  IOException
Method Details
- writeRootBatch
  
  public void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) throws IOException
  
  Description copied from class: TreeWriterBase
  
  Handle the top-level object write. The default implementation in TreeWriterBase is used for all types except structs, which are the typical case. VectorizedRowBatch assumes the top-level object is a struct, so the default implementation uses the first column for all other types.
  
  Specified by:
  
  writeRootBatch in interface TreeWriter
  
  Overrides:
  
  writeRootBatch in class TreeWriterBase
  
  Parameters:
  
  batch - the batch to write from
  
  offset - the row to start on
  
  length - the number of rows to write
  
  Throws:
  
  IOException
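
  To illustrate the struct versus non-struct distinction described above, here is a minimal sketch that models a batch as an array of columns. `ToyBatch`, `ToyColumnWriter`, and `RootWriteSketch` are hypothetical stand-ins rather than ORC classes, and the logic is a simplified model under those assumptions, not the library's implementation.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  /** Hypothetical stand-in for a row batch: one long[] per top-level column. */
  final class ToyBatch {
    final long[][] cols;

    ToyBatch(long[][] cols) {
      this.cols = cols;
    }
  }

  /** Hypothetical child writer that simply records the values it is given. */
  final class ToyColumnWriter {
    final List<Long> written = new ArrayList<>();

    void writeBatch(long[] vector, int offset, int length) {
      for (int i = offset; i < offset + length; i++) {
        written.add(vector[i]);
      }
    }
  }

  final class RootWriteSketch {
    /** Struct root: column i of the batch goes to child writer i. */
    static void writeStructRoot(ToyColumnWriter[] children, ToyBatch batch,
                                int offset, int length) {
      for (int i = 0; i < children.length; i++) {
        children[i].writeBatch(batch.cols[i], offset, length);
      }
    }

    /** Non-struct root: only the first column of the batch is used. */
    static void writeNonStructRoot(ToyColumnWriter writer, ToyBatch batch,
                                   int offset, int length) {
      writer.writeBatch(batch.cols[0], offset, length);
    }

    public static void main(String[] args) {
      ToyBatch batch = new ToyBatch(new long[][] {{1, 2, 3, 4}, {10, 20, 30, 40}});
      ToyColumnWriter[] children = {new ToyColumnWriter(), new ToyColumnWriter()};
      // Write rows 1 and 2 of every column through the struct path.
      writeStructRoot(children, batch, 1, 2);
      System.out.println(children[0].written + " " + children[1].written);
    }
  }
  ```

  In this model, `offset` and `length` select the same row window in every column, so the children stay aligned row for row.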
- writeBatch
  
  public void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) throws IOException
  
  Description copied from class: TreeWriterBase
  
  Write the values from the given vector from offset for length elements.
  
  Specified by:
  
  writeBatch in interface TreeWriter
  
  Overrides:
  
  writeBatch in class TreeWriterBase
  
  Parameters:
  
  vector - the vector to write from
  
  offset - the first value from the vector to write
  
  length - the number of values from the vector to write
  
  Throws:
  
  IOException
- createRowIndexEntry
  
  public void createRowIndexEntry() throws IOException
  
  Description copied from class: TreeWriterBase
  
  Create a row index entry with the previous location and the current index statistics. This also merges the index statistics into the file statistics before they are cleared. Finally, it records the start of the next index entry and ensures that all child columns also create an entry.
  
  Specified by:
  
  createRowIndexEntry in interface TreeWriter
  
  Overrides:
  
  createRowIndexEntry in class TreeWriterBase
  
  Throws:
  
  IOException
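
  The sequence of steps in the description above can be sketched with a toy model. `ToyStats` and `ToyIndexedWriter` are hypothetical names, the statistics are reduced to a count and a sum, and positions are plain counters. This illustrates the ordering of the steps only and is not ORC's actual code.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  /** Hypothetical, minimal statistics holder: a value count and a running sum. */
  final class ToyStats {
    long count;
    long sum;

    void update(long value) { count++; sum += value; }
    void merge(ToyStats other) { count += other.count; sum += other.sum; }
    void reset() { count = 0; sum = 0; }

    ToyStats copy() {
      ToyStats c = new ToyStats();
      c.merge(this);
      return c;
    }
  }

  /** Hypothetical writer node that follows the steps listed in the description. */
  class ToyIndexedWriter {
    final ToyStats indexStats = new ToyStats();  // statistics since the last entry
    final ToyStats fileStats = new ToyStats();   // statistics for the file so far
    final List<ToyStats> rowIndex = new ArrayList<>();
    final List<Long> entryPositions = new ArrayList<>();
    final List<ToyIndexedWriter> children = new ArrayList<>();
    long currentPosition;  // where the writer currently is in its output
    long nextEntryStart;   // start position recorded for the upcoming entry

    void write(long value) {
      indexStats.update(value);
      currentPosition++;
    }

    void createRowIndexEntry() {
      // 1. Emit an entry with the previous location and the current index statistics.
      entryPositions.add(nextEntryStart);
      rowIndex.add(indexStats.copy());
      // 2. Merge the index statistics into the file statistics, then clear them.
      fileStats.merge(indexStats);
      indexStats.reset();
      // 3. Record where the next index entry starts.
      nextEntryStart = currentPosition;
      // 4. Make sure every child column creates an entry as well.
      for (ToyIndexedWriter child : children) {
        child.createRowIndexEntry();
      }
    }

    public static void main(String[] args) {
      ToyIndexedWriter parent = new ToyIndexedWriter();
      ToyIndexedWriter child = new ToyIndexedWriter();
      parent.children.add(child);
      for (long v = 1; v <= 3; v++) { parent.write(v); child.write(v); }
      parent.createRowIndexEntry();
      System.out.println(parent.rowIndex.size() + " " + child.rowIndex.size());
    }
  }
  ```

  Merging before clearing matters in this model: if the index statistics were reset first, the rows covered by the entry would never reach the file-level totals.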
- writeStripe
  
  public void writeStripe(int requiredIndexEntries) throws IOException
  
  Description copied from interface: TreeWriter
  
  Write the stripe out to the file.
  
  Specified by:
  
  writeStripe in interface TreeWriter
  
  Overrides:
  
  writeStripe in class TreeWriterBase
  
  Parameters:
  
  requiredIndexEntries - the number of index entries that are required. This is used to check that the row index is well formed.
  
  Throws:
  
  IOException
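
  As an illustration of the well-formedness check mentioned for `requiredIndexEntries`, the sketch below compares the number of entries in a row index with the required count and fails fast on a mismatch. `RowIndexCheckSketch` and both of its methods are hypothetical. In particular, `requiredEntries` is only an assumption about how a caller might derive the expected count from a row count and an index stride.

  ```java
  import java.util.Arrays;
  import java.util.List;

  final class RowIndexCheckSketch {
    /**
     * Assumed derivation of the expected entry count: one entry per started
     * group of `stride` rows, or none when indexing is disabled (stride 0).
     */
    static int requiredEntries(long rowsInStripe, int stride) {
      if (stride == 0) {
        return 0;
      }
      return (int) ((rowsInStripe + stride - 1) / stride);
    }

    /** Throws when the index does not hold exactly the required number of entries. */
    static void checkWellFormed(List<?> rowIndexEntries, int requiredIndexEntries) {
      if (rowIndexEntries.size() != requiredIndexEntries) {
        throw new IllegalArgumentException("Wrong number of index entries, found: "
            + rowIndexEntries.size() + " expected: " + requiredIndexEntries);
      }
    }

    public static void main(String[] args) {
      int required = requiredEntries(25_000, 10_000);
      // Three placeholder entries stand in for real row index entries.
      checkWellFormed(Arrays.asList("e0", "e1", "e2"), required);
      System.out.println("row index has the required " + required + " entries");
    }
  }
  ```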
- addStripeStatistics
  
  public void addStripeStatistics(StripeStatistics[] stats) throws IOException
  
  Description copied from interface: TreeWriter
  
  During a stripe append, we need to handle the stripe statistics.
  
  Specified by:
  
  addStripeStatistics in interface TreeWriter
  
  Overrides:
  
  addStripeStatistics in class TreeWriterBase
  
  Parameters:
  
  stats - the statistics for the new stripe across the encryption variants
  
  Throws:
  
  IOException
- estimateMemory
  
  public long estimateMemory()
  
  Description copied from class: TreeWriterBase
  
  Estimate how much memory the writer is consuming excluding the streams.
  
  Specified by:
  
  estimateMemory in interface TreeWriter
  
  Overrides:
  
  estimateMemory in class TreeWriterBase
  
  Returns:
  
  the number of bytes.
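
  For a writer that owns child writers, one plausible way to produce such an estimate is to add each child's estimate to the node's own footprint. The sketch below assumes exactly that. `ToyMemoryNode` and its `ownBytes` figure are hypothetical and do not reflect how ORC measures memory.

  ```java
  import java.util.ArrayList;
  import java.util.List;

  /** Hypothetical writer node with a fixed own footprint and optional children. */
  class ToyMemoryNode {
    final long ownBytes;  // bytes this node holds outside of its streams (made-up figure)
    final List<ToyMemoryNode> children = new ArrayList<>();

    ToyMemoryNode(long ownBytes) {
      this.ownBytes = ownBytes;
    }

    /** Own footprint plus the recursive estimate of every child. */
    long estimateMemory() {
      long result = ownBytes;
      for (ToyMemoryNode child : children) {
        result += child.estimateMemory();
      }
      return result;
    }

    public static void main(String[] args) {
      ToyMemoryNode root = new ToyMemoryNode(100);
      root.children.add(new ToyMemoryNode(40));
      root.children.add(new ToyMemoryNode(60));
      System.out.println(root.estimateMemory());
    }
  }
  ```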
- getRawDataSize
  
  public long getRawDataSize()
  
  Description copied from interface: TreeWriter
  
  Estimate the memory used if the file was read into Hive's Writable types. This is used as an estimate for the query optimizer.
  
  Returns:
  
  the number of bytes
- writeFileStatistics
  
  public void writeFileStatistics() throws IOException
  
  Description copied from interface: TreeWriter
  
  Write the FileStatistics for each column in each encryption variant.
  
  Specified by:
  
  writeFileStatistics in interface TreeWriter
  
  Overrides:
  
  writeFileStatistics in class TreeWriterBase
  
  Throws:
  
  IOException
- flushStreams
  
  public void flushStreams() throws IOException
  
  Description copied from interface: TreeWriter
  
  Flush the TreeWriter stream.
  
  Specified by:
  
  flushStreams in interface TreeWriter
  
  Overrides:
  
  flushStreams in class TreeWriterBase
  
  Throws:
  
  IOException
- getCurrentStatistics
  
  public void getCurrentStatistics(ColumnStatistics[] output)
  
  Description copied from interface: TreeWriter
  
  Get the current file statistics for each column. If a column is encrypted, the encrypted variant statistics are used.
  
  Specified by:
  
  getCurrentStatistics in interface TreeWriter
  
  Overrides:
  
  getCurrentStatistics in class TreeWriterBase
  
  Parameters:
  
  output - an array that is filled in with the results
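
  The "array that is filled in" contract can be sketched as follows, assuming each writer stores its statistics at the slot given by its column id and then lets its children do the same. `ToyStatsNode` is hypothetical, a `String` stands in for a real statistics object, and the encrypted-variant handling mentioned above is left out.

  ```java
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;

  /** Hypothetical writer node identified by a column id. */
  class ToyStatsNode {
    final int id;        // column id, assumed to index into the output array
    final String stats;  // placeholder for a real statistics object
    final List<ToyStatsNode> children = new ArrayList<>();

    ToyStatsNode(int id, String stats) {
      this.id = id;
      this.stats = stats;
    }

    /** Fill this node's slot, then let every child fill its own slot. */
    void getCurrentStatistics(String[] output) {
      output[id] = stats;
      for (ToyStatsNode child : children) {
        child.getCurrentStatistics(output);
      }
    }

    public static void main(String[] args) {
      ToyStatsNode root = new ToyStatsNode(0, "struct stats");
      root.children.add(new ToyStatsNode(1, "int stats"));
      root.children.add(new ToyStatsNode(2, "string stats"));
      // The caller allocates one slot per column before asking for the statistics.
      String[] output = new String[3];
      root.getCurrentStatistics(output);
      System.out.println(Arrays.toString(output));
    }
  }
  ```

  The caller owns the array in this sketch, so it must be sized to the total number of columns before the call.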
- prepareStripe
  
  public void prepareStripe(int stripeId)
  
  Description copied from interface: TreeWriter
  
  Set up for the next stripe.
  
  Specified by:
  
  prepareStripe in interface TreeWriter
  
  Overrides:
  
  prepareStripe in class TreeWriterBase
  
  Parameters:
  
  stripeId - the next stripe id
