Package org.apache.orc.impl.writer
Class TreeWriterBase
java.lang.Object
org.apache.orc.impl.writer.TreeWriterBase
- All Implemented Interfaces:
- TreeWriter
- Direct Known Subclasses:
- BinaryTreeWriter, BooleanTreeWriter, ByteTreeWriter, DateTreeWriter, Decimal64TreeWriter, DecimalTreeWriter, DoubleTreeWriter, FloatTreeWriter, GeospatialTreeWriter, IntegerTreeWriter, ListTreeWriter, MapTreeWriter, StringBaseTreeWriter, StructTreeWriter, TimestampTreeWriter, UnionTreeWriter
The parent class of all of the writers for each column. Each column
 is written by an instance of this class. The compound types (struct,
 list, map, and union) have children tree writers that write the children
 types.
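The parent/child arrangement described above can be pictured with a small standalone model. This is only a sketch: `ToyWriter`, `leaf`, `compound`, and `countWriters` are hypothetical names invented for illustration and are not part of ORC, and the streams, statistics, and encodings held by the real subclasses are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a column writer; not the ORC class.
final class ToyWriter {
    final String typeName;
    final List<ToyWriter> children = new ArrayList<>();

    private ToyWriter(String typeName) {
        this.typeName = typeName;
    }

    // A primitive column has no child writers.
    static ToyWriter leaf(String typeName) {
        return new ToyWriter(typeName);
    }

    // A compound column (struct, list, map, union) owns one writer per child type.
    static ToyWriter compound(String typeName, ToyWriter... kids) {
        ToyWriter writer = new ToyWriter(typeName);
        for (ToyWriter kid : kids) {
            writer.children.add(kid);
        }
        return writer;
    }

    // Counts this writer plus every writer below it: one writer per column.
    int countWriters() {
        int total = 1;
        for (ToyWriter child : children) {
            total += child.countWriters();
        }
        return total;
    }
}
```

In this model a struct holding an int and a list of strings needs four writers: one each for the struct, the int, the list, and the list's string elements.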
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.orc.impl.writer.TreeWriter: TreeWriter.Factory
Field Summary
Fields
- protected final BloomFilter bloomFilter
- protected final OrcProto.BloomFilter.Builder bloomFilterEntry
- protected final BloomFilterUtf8 bloomFilterUtf8
- protected final WriterContext context
- protected final boolean createBloomFilter
- protected final WriterEncryptionVariant encryption
- protected final ColumnStatisticsImpl fileStatistics
- protected final int id
- protected final ColumnStatisticsImpl indexStatistics
- protected final BitFieldWriter isPresent
- protected final TypeDescription schema
- protected final ColumnStatisticsImpl stripeColStatistics
Method Summary
- void addStripeStatistics(StripeStatistics[] stats) - During a stripe append, we need to handle the stripe statistics.
- void createRowIndexEntry() - Create a row index entry with the previous location and the current index statistics.
- long estimateMemory() - Estimate how much memory the writer is consuming excluding the streams.
- void flushStreams() - Flush the TreeWriter stream.
- void getCurrentStatistics(ColumnStatistics[] output) - Get the current file statistics for each column.
- protected OrcProto.RowIndex.Builder getRowIndex()
- protected OrcProto.RowIndexEntry.Builder getRowIndexEntry()
- protected ColumnStatisticsImpl getStripeStatistics()
- void prepareStripe(int stripeId) - Set up for the next stripe.
- void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) - Write the values from the given vector from offset for length elements.
- void writeFileStatistics() - Write the FileStatistics for each column in each encryption variant.
- void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) - Handle the top level object write.
- void writeStripe(int requiredIndexEntries) - Write the stripe out to the file.
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.orc.impl.writer.TreeWriter: getRawDataSize
Field Details
- id - protected final int id
- isPresent
- schema
- encryption
- indexStatistics
- stripeColStatistics
- fileStatistics
- rowIndexPosition
- bloomFilter
- bloomFilterUtf8
- createBloomFilter - protected final boolean createBloomFilter
- bloomFilterEntry
- context
 
Method Details
- getRowIndex
- getStripeStatistics
- getRowIndexEntry
writeRootBatch
public void writeRootBatch(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch, int offset, int length) throws IOException
Handle the top level object write. This default implementation is used for every root type except struct, which is the typical case. VectorizedRowBatch assumes the top level object is a struct, so for all other types the first column of the batch is used.
- Specified by:
- writeRootBatch in interface TreeWriter
- Parameters:
- batch - the batch to write from
- offset - the row to start on
- length - the number of rows to write
- Throws:
- IOException
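
The first-column rule for non-struct roots can be illustrated with a simplified model. This is a sketch under stated assumptions, not the ORC implementation: `RootBatchSketch` is a hypothetical class, and a plain `long[][]` stands in for the columns of a VectorizedRowBatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in; the real method takes a VectorizedRowBatch.
final class RootBatchSketch {
    // Collects the values this toy writer was asked to write.
    final List<Long> written = new ArrayList<>();

    // Mirrors the writeBatch contract: write `length` values starting at `offset`.
    void writeBatch(long[] vector, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            written.add(vector[i]);
        }
    }

    // Default root handling for a non-struct root: the batch is still shaped
    // like a struct, so the single real column is found at index 0.
    void writeRootBatch(long[][] batchColumns, int offset, int length) {
        writeBatch(batchColumns[0], offset, length);
    }
}
```

Calling `writeRootBatch` on a one-column batch holding 10, 20, 30, 40 with offset 1 and length 2 hands the values 20 and 30 to the toy writer.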
 
writeBatch
public void writeBatch(org.apache.hadoop.hive.ql.exec.vector.ColumnVector vector, int offset, int length) throws IOException
Write the values from the given vector from offset for length elements.
- Specified by:
- writeBatch in interface TreeWriter
- Parameters:
- vector - the vector to write from
- offset - the first value from the vector to write
- length - the number of values from the vector to write
- Throws:
- IOException
 
prepareStripe
public void prepareStripe(int stripeId)
Description copied from interface: TreeWriter
Set up for the next stripe.
- Specified by:
- prepareStripe in interface TreeWriter
- Parameters:
- stripeId - the next stripe id
 
flushStreams
public void flushStreams() throws IOException
Description copied from interface: TreeWriter
Flush the TreeWriter stream.
- Specified by:
- flushStreams in interface TreeWriter
- Throws:
- IOException
 
writeStripe
public void writeStripe(int requiredIndexEntries) throws IOException
Description copied from interface: TreeWriter
Write the stripe out to the file.
- Specified by:
- writeStripe in interface TreeWriter
- Parameters:
- requiredIndexEntries - the number of index entries that are required. This is used to check that the row index is well formed.
- Throws:
- IOException
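
One way to picture the well-formedness check on `requiredIndexEntries` is a writer that counts the index entries it has produced and refuses to finish the stripe if the count is off. The sketch below is an assumption-laden illustration: `StripeIndexCheckSketch`, its counter, and the choice of `IllegalStateException` are hypothetical and are not taken from ORC.

```java
// Hypothetical sketch; the names and the exception type are illustrative only.
final class StripeIndexCheckSketch {
    // Entries accumulated for the current stripe, one per completed row group.
    private int indexEntries;

    void createRowIndexEntry() {
        indexEntries++;
    }

    // Refuses to finish the stripe unless the index has exactly the expected size.
    void writeStripe(int requiredIndexEntries) {
        if (indexEntries != requiredIndexEntries) {
            throw new IllegalStateException("Row index has " + indexEntries
                + " entries, expected " + requiredIndexEntries);
        }
        indexEntries = 0; // start the next stripe with an empty index
    }
}
```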
 
createRowIndexEntry
public void createRowIndexEntry() throws IOException
Create a row index entry with the previous location and the current index statistics. Also merges the index statistics into the file statistics before they are cleared. Finally, it records the start of the next index and ensures all of the children columns also create an entry.
- Specified by:
- createRowIndexEntry in interface TreeWriter
- Throws:
- IOException
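
The sequence described for createRowIndexEntry (emit an entry, merge and clear the index statistics, record the next start position, recurse into children) can be sketched with a toy model. Everything here is hypothetical: `RowIndexSketch`, `Entry`, and the use of a simple value count as the "statistics" are simplifications chosen for illustration, not ORC's data structures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the bookkeeping; a value count stands in for statistics.
final class RowIndexSketch {
    // One index entry: where the row group started and what it contained.
    static final class Entry {
        final long position;
        final long valueCount;

        Entry(long position, long valueCount) {
            this.position = position;
            this.valueCount = valueCount;
        }
    }

    final List<Entry> rowIndex = new ArrayList<>();
    final List<RowIndexSketch> children = new ArrayList<>();
    long indexCount;       // statistics for the current row group
    long fileCount;        // statistics accumulated for the whole file
    long previousPosition; // where the current row group started
    long currentPosition;  // advances as values are written

    void write(int values) {
        indexCount += values;
        currentPosition += values;
    }

    void createRowIndexEntry() {
        // 1. Entry built from the previous location and the current index statistics.
        rowIndex.add(new Entry(previousPosition, indexCount));
        // 2. Merge the index statistics into the file statistics, then clear them.
        fileCount += indexCount;
        indexCount = 0;
        // 3. Record where the next index entry starts.
        previousPosition = currentPosition;
        // 4. Every child column creates its own entry as well.
        for (RowIndexSketch child : children) {
            child.createRowIndexEntry();
        }
    }
}
```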
 
addStripeStatistics
public void addStripeStatistics(StripeStatistics[] stats) throws IOException
Description copied from interface: TreeWriter
During a stripe append, we need to handle the stripe statistics.
- Specified by:
- addStripeStatistics in interface TreeWriter
- Parameters:
- stats - the statistics for the new stripe across the encryption variants
- Throws:
- IOException
 
estimateMemory
public long estimateMemory()
Estimate how much memory the writer is consuming excluding the streams.
- Specified by:
- estimateMemory in interface TreeWriter
- Returns:
- the number of bytes.
 
writeFileStatistics
public void writeFileStatistics() throws IOException
Description copied from interface: TreeWriter
Write the FileStatistics for each column in each encryption variant.
- Specified by:
- writeFileStatistics in interface TreeWriter
- Throws:
- IOException
 
getCurrentStatistics
public void getCurrentStatistics(ColumnStatistics[] output)
Description copied from interface: TreeWriter
Get the current file statistics for each column. If a column is encrypted, the encrypted variant statistics are used.
- Specified by:
- getCurrentStatistics in interface TreeWriter
- Parameters:
- output - an array that is filled in with the results
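
The "filled in" output array can be pictured as each writer storing its statistics in its own slot and then asking its children to do the same. The sketch below assumes the slot is selected by the writer's column `id`; that indexing scheme, the class `CurrentStatsSketch`, and the use of a `long` value count in place of a ColumnStatistics object are all hypothetical simplifications.

```java
// Hypothetical sketch: each toy writer stores its statistics at its own column id.
final class CurrentStatsSketch {
    final int id;
    final long valueCount; // stand-in for a ColumnStatistics object
    final CurrentStatsSketch[] children;

    CurrentStatsSketch(int id, long valueCount, CurrentStatsSketch... children) {
        this.id = id;
        this.valueCount = valueCount;
        this.children = children;
    }

    // Fills this writer's slot, then lets every child fill its own slot.
    void getCurrentStatistics(long[] output) {
        output[id] = valueCount;
        for (CurrentStatsSketch child : children) {
            child.getCurrentStatistics(output);
        }
    }
}
```

The caller is assumed to size the array to the total number of columns, so that a single call on the root writer populates every slot.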
 
 