Package org.apache.orc.impl.writer
Class StructTreeWriter
java.lang.Object
org.apache.orc.impl.writer.TreeWriterBase
org.apache.orc.impl.writer.StructTreeWriter
- All Implemented Interfaces:
TreeWriter
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.orc.impl.writer.TreeWriter
TreeWriter.Factory
-
Field Summary
Fields inherited from class org.apache.orc.impl.writer.TreeWriterBase
bloomFilter, bloomFilterEntry, bloomFilterUtf8, context, createBloomFilter, encryption, fileStatistics, id, indexStatistics, isPresent, rowIndexPosition, schema, stripeColStatistics
-
Constructor Summary
Constructors
Constructor  Description
StructTreeWriter(TypeDescription schema, WriterEncryptionVariant encryption, WriterContext context)
-
Method Summary
Modifier and Type  Method  Description
void  addStripeStatistics(StripeStatistics[] stats)  During a stripe append, we need to handle the stripe statistics.
void  createRowIndexEntry()  Create a row index entry with the previous location and the current index statistics.
long  estimateMemory()  Estimate how much memory the writer is consuming excluding the streams.
void  flushStreams()  Flush the TreeWriter stream.
void  getCurrentStatistics(ColumnStatistics[] output)  Get the current file statistics for each column.
long  getRawDataSize()  Estimate the memory used if the file was read into Hive's Writable types.
void  prepareStripe(int stripeId)  Set up for the next stripe.
void  writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length)  Write the values from the given vector from offset for length elements.
void  writeFileStatistics()  Write the FileStatistics for each column in each encryption variant.
void  writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length)  Handle the top level object write.
void  writeStripe(int requiredIndexEntries)  Write the stripe out to the file.
Methods inherited from class org.apache.orc.impl.writer.TreeWriterBase
getRowIndex, getRowIndexEntry, getStripeStatistics
-
Constructor Details
-
StructTreeWriter
public StructTreeWriter(TypeDescription schema, WriterEncryptionVariant encryption, WriterContext context) throws IOException
Throws:
IOException
-
-
Method Details
-
writeRootBatch
public void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) throws IOException
Description copied from class: TreeWriterBase
Handle the top level object write. This default method is used for all types except structs, which are the typical case. VectorizedRowBatch assumes the top level object is a struct, so we use the first column for all other types.
Specified by:
writeRootBatch in interface TreeWriter
Overrides:
writeRootBatch in class TreeWriterBase
Parameters:
batch - the batch to write from
offset - the row to start on
length - the number of rows to write
Throws:
IOException
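As a rough illustration of the struct case described above, the sketch below forwards column i of a batch to child writer i. It uses plain long[] arrays and a hypothetical ChildWriter interface in place of the real VectorizedRowBatch, ColumnVector, and TreeWriter types, so it shows only the dispatch pattern under that assumption and is not the ORC implementation.

```java
final class RootBatchSketch {
  /** Hypothetical stand-in for a child TreeWriter. */
  interface ChildWriter {
    void writeBatch(long[] column, int offset, int length);
  }

  /** Hypothetical child that counts rows and sums the values it was given. */
  static final class SummingWriter implements ChildWriter {
    long sum;
    int rows;

    @Override
    public void writeBatch(long[] column, int offset, int length) {
      for (int i = offset; i < offset + length; i++) {
        sum += column[i];
      }
      rows += length;
    }
  }

  /** Struct-style root write: each top-level column goes to the matching child. */
  static void writeRootBatch(long[][] columns, ChildWriter[] children,
                             int offset, int length) {
    for (int i = 0; i < children.length; i++) {
      children[i].writeBatch(columns[i], offset, length);
    }
  }

  public static void main(String[] args) {
    long[][] columns = { {1, 2, 3, 4}, {10, 20, 30, 40} };
    SummingWriter a = new SummingWriter();
    SummingWriter b = new SummingWriter();
    // Write two rows starting at row 1 of each column.
    writeRootBatch(columns, new ChildWriter[] {a, b}, 1, 2);
    System.out.println(a.sum + " " + b.sum + " " + a.rows); // prints "5 50 2"
  }
}
```

The same offset and length are passed to every child, which is what keeps the children's row counts aligned.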
-
writeBatch
public void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) throws IOException
Description copied from class: TreeWriterBase
Write the values from the given vector from offset for length elements.
Specified by:
writeBatch in interface TreeWriter
Overrides:
writeBatch in class TreeWriterBase
Parameters:
vector - the vector to write from
offset - the first value from the vector to write
length - the number of values from the vector to write
Throws:
IOException
-
createRowIndexEntry
public void createRowIndexEntry() throws IOException
Description copied from class: TreeWriterBase
Create a row index entry with the previous location and the current index statistics. Also merges the index statistics into the file statistics before they are cleared. Finally, it records the start of the next index and ensures all of the children columns also create an entry.
Specified by:
createRowIndexEntry in interface TreeWriter
Overrides:
createRowIndexEntry in class TreeWriterBase
Throws:
IOException
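The order of operations in that description (record the entry, merge into the file statistics, clear, then recurse into the children) can be sketched with a toy model. Node, indexCount, and fileCount below are hypothetical names, a simple value count stands in for real column statistics, and stream positions are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

final class RowIndexSketch {
  /** Hypothetical column node; a value count stands in for column statistics. */
  static final class Node {
    final List<Node> children = new ArrayList<>();
    final List<Long> indexEntries = new ArrayList<>();
    long indexCount; // statistics for the current row group
    long fileCount;  // statistics accumulated for the whole file

    void addValues(long n) {
      indexCount += n;
    }

    void createRowIndexEntry() {
      indexEntries.add(indexCount);  // the entry carries the current index statistics
      fileCount += indexCount;       // merge into the file statistics first
      indexCount = 0;                // then clear for the next row group
      for (Node child : children) {
        child.createRowIndexEntry(); // every child column creates an entry too
      }
    }
  }

  public static void main(String[] args) {
    Node root = new Node();
    Node child = new Node();
    root.children.add(child);

    root.addValues(3);
    child.addValues(3);
    root.createRowIndexEntry();

    root.addValues(2);
    child.addValues(2);
    root.createRowIndexEntry();

    System.out.println(root.indexEntries + " " + root.fileCount
        + " " + child.indexEntries.size()); // prints "[3, 2] 5 2"
  }
}
```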
-
writeStripe
public void writeStripe(int requiredIndexEntries) throws IOException
Description copied from interface: TreeWriter
Write the stripe out to the file.
Specified by:
writeStripe in interface TreeWriter
Overrides:
writeStripe in class TreeWriterBase
Parameters:
requiredIndexEntries - the number of index entries that are required. This is used to check that the row index is well formed.
Throws:
IOException
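One plausible form of the well-formedness check mentioned for requiredIndexEntries is a simple comparison against the number of entries actually recorded. The helper name checkIndexEntries, the exception type, and the message text are assumptions made for this sketch rather than details taken from the ORC source.

```java
final class StripeCheckSketch {
  /** Hypothetical check: fail if the recorded entry count differs from the required count. */
  static void checkIndexEntries(int recordedEntries, int requiredIndexEntries) {
    if (recordedEntries != requiredIndexEntries) {
      throw new IllegalArgumentException(
          "Column has wrong number of index entries, found: " + recordedEntries
              + " expected: " + requiredIndexEntries);
    }
  }

  public static void main(String[] args) {
    checkIndexEntries(3, 3); // matching counts pass silently
    try {
      checkIndexEntries(2, 3); // a missing entry is rejected
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```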
-
addStripeStatistics
public void addStripeStatistics(StripeStatistics[] stats) throws IOException
Description copied from interface: TreeWriter
During a stripe append, we need to handle the stripe statistics.
Specified by:
addStripeStatistics in interface TreeWriter
Overrides:
addStripeStatistics in class TreeWriterBase
Parameters:
stats - the statistics for the new stripe across the encryption variants
Throws:
IOException
-
estimateMemory
public long estimateMemory()
Description copied from class: TreeWriterBase
Estimate how much memory the writer is consuming excluding the streams.
Specified by:
estimateMemory in interface TreeWriter
Overrides:
estimateMemory in class TreeWriterBase
Returns:
the number of bytes.
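For a writer with child columns, a natural way to produce such an estimate is to add each child's estimate to the writer's own bookkeeping. The sketch below assumes that recursive shape; Node and ownBytes are hypothetical names, and the byte counts are made-up inputs rather than measurements.

```java
import java.util.ArrayList;
import java.util.List;

final class MemoryEstimateSketch {
  /** Hypothetical writer node with a fixed own footprint and child writers. */
  static final class Node {
    final long ownBytes;
    final List<Node> children = new ArrayList<>();

    Node(long ownBytes) {
      this.ownBytes = ownBytes;
    }

    /** Own footprint plus the estimates of all descendants. */
    long estimateMemory() {
      long total = ownBytes;
      for (Node child : children) {
        total += child.estimateMemory();
      }
      return total;
    }
  }

  public static void main(String[] args) {
    Node root = new Node(100);
    Node left = new Node(40);
    Node right = new Node(60);
    right.children.add(new Node(5));
    root.children.add(left);
    root.children.add(right);
    System.out.println(root.estimateMemory()); // prints "205"
  }
}
```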
-
getRawDataSize
public long getRawDataSize()
Description copied from interface: TreeWriter
Estimate the memory used if the file was read into Hive's Writable types. This is used as an estimate for the query optimizer.
Returns:
the number of bytes
-
writeFileStatistics
public void writeFileStatistics() throws IOException
Description copied from interface: TreeWriter
Write the FileStatistics for each column in each encryption variant.
Specified by:
writeFileStatistics in interface TreeWriter
Overrides:
writeFileStatistics in class TreeWriterBase
Throws:
IOException
-
flushStreams
public void flushStreams() throws IOException
Description copied from interface: TreeWriter
Flush the TreeWriter stream.
Specified by:
flushStreams in interface TreeWriter
Overrides:
flushStreams in class TreeWriterBase
Throws:
IOException
-
getCurrentStatistics
public void getCurrentStatistics(ColumnStatistics[] output)
Description copied from interface: TreeWriter
Get the current file statistics for each column. If a column is encrypted, the encrypted variant statistics are used.
Specified by:
getCurrentStatistics in interface TreeWriter
Overrides:
getCurrentStatistics in class TreeWriterBase
Parameters:
output - an array that is filled in with the results
-
prepareStripe
public void prepareStripe(int stripeId)
Description copied from interface: TreeWriter
Set up for the next stripe.
Specified by:
prepareStripe in interface TreeWriter
Overrides:
prepareStripe in class TreeWriterBase
Parameters:
stripeId - the next stripe id
-