Class InStream

java.lang.Object
java.io.InputStream
org.apache.orc.impl.InStream
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
InStream.CompressedStream, InStream.UncompressedStream

public abstract class InStream extends InputStream
  • Field Details

    • PROTOBUF_MESSAGE_MAX_LIMIT

      public static final int PROTOBUF_MESSAGE_MAX_LIMIT
    • name

      protected final Object name
    • offset

      protected final long offset
    • length

      protected long length
    • bytes

      protected org.apache.hadoop.hive.common.io.DiskRangeList bytes
    • position

      protected long position
  • Constructor Details

    • InStream

      public InStream(Object name, long offset, long length)
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • close

      public abstract void close()
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Overrides:
      close in class InputStream
    • setCurrent

      protected abstract void setCurrent(org.apache.hadoop.hive.common.io.DiskRangeList newRange, boolean isJump)
      Set the current range.
      Parameters:
      newRange - the block that is current
      isJump - if this was a seek instead of a natural read
    • reset

      protected void reset(org.apache.hadoop.hive.common.io.DiskRangeList input)
      Reset the input to a new set of data.
      Parameters:
      input - the input data
    • reset

      protected void reset(org.apache.hadoop.hive.common.io.DiskRangeList input, long length)
      Reset the input to a new set of data with a different length. In some cases, after an UncompressedStream is reset, its actual length is longer than its initial length.
      Prior to ORC-516, the InStream.UncompressedStream class had a 'length' field that could be modified in the reset() method. The setBuffers() method of SettableUncompressedStream relied on this, passing 'diskRangeInfo.getTotalLength()' as the length to reset(). SettableUncompressedStream has been removed from the ORC code base, but Apache Hive still requires it and has managed its own copy since upgrading its Apache ORC version to 1.6.7.
      ORC-516 was the root cause of the regression reported in HIVE-27128 (EOFException when reading the DATA stream). This wrapper method makes it possible to resolve HIVE-27128.
      Parameters:
      input - the input data
      length - new length of the stream
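      The effect described above can be illustrated with a small standalone model. This is only a sketch and not ORC's implementation: the class ResettableStreamSketch, its byte[] backing store (standing in for a DiskRangeList), and the helper countReadable are hypothetical names introduced for illustration. It shows how a stream that keeps its original length after a reset reports end of stream too early, and how passing the new length avoids that.

```java
import java.io.InputStream;

/** Simplified, hypothetical model of a resettable stream; not ORC's implementation. */
final class ResettableStreamSketch extends InputStream {
  private byte[] data;     // stands in for the DiskRangeList contents
  private long length;     // logical length of the stream in bytes
  private long position;   // bytes consumed since the last reset

  ResettableStreamSketch(byte[] data, long length) {
    this.data = data;
    this.length = length;
  }

  /** Swap in new data but keep the previously known length. */
  void reset(byte[] input) {
    this.data = input;
    this.position = 0;
  }

  /** Swap in new data and update the logical length as well. */
  void reset(byte[] input, long newLength) {
    this.length = newLength;
    reset(input);
  }

  @Override
  public int read() {
    if (position >= length || position >= data.length) {
      return -1; // end of stream
    }
    return data[(int) position++] & 0xff;
  }

  /** Count how many bytes can be read before the stream reports its end. */
  static int countReadable(ResettableStreamSketch stream) {
    int count = 0;
    while (stream.read() != -1) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    ResettableStreamSketch stream = new ResettableStreamSketch(new byte[2], 2);
    stream.reset(new byte[5]);
    System.out.println(countReadable(stream)); // stale length hides the last 3 bytes
    stream.reset(new byte[5], 5);
    System.out.println(countReadable(stream)); // all 5 bytes are readable
  }
}
```

      In the model, the one-argument reset leaves the reader unable to reach bytes beyond the original length, which is analogous to the premature end-of-data condition reported in HIVE-27128.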
    • changeIv

      public abstract void changeIv(Consumer<byte[]> modifier)
    • seek

      public abstract void seek(PositionProvider index) throws IOException
      Throws:
      IOException
    • options

      public static InStream.StreamOptions options()
    • create

      public static InStream create(Object name, org.apache.hadoop.hive.common.io.DiskRangeList input, long offset, long length, InStream.StreamOptions options)
      Create an input stream from a list of disk ranges with data.
      Parameters:
      name - the name of the stream
      input - the list of ranges of bytes for the stream; from disk or cache
      offset - the first byte offset of the stream
      length - the length in bytes of the stream
      options - the options to read with
      Returns:
      an input stream
    • create

      public static InStream create(Object name, org.apache.hadoop.hive.common.io.DiskRangeList input, long offset, long length)
      Create an input stream from a list of disk ranges with data.
      Parameters:
      name - the name of the stream
      input - the list of ranges of bytes for the stream; from disk or cache
      offset - the first byte offset of the stream
      length - the length in bytes of the stream
      Returns:
      an input stream
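      To make the roles of the input, offset, and length parameters concrete, the following standalone sketch models a stream that reads across a list of byte ranges. It does not use the ORC classes: RangeListStreamSketch and its nested Range type are hypothetical stand-ins for InStream and DiskRangeList, and the real implementation may differ in how it walks ranges and handles gaps.

```java
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

/** Simplified, hypothetical model of a stream over byte ranges; not ORC's implementation. */
final class RangeListStreamSketch extends InputStream {

  /** Stand-in for one disk range: some bytes plus their absolute file offset. */
  static final class Range {
    final long offset;
    final byte[] data;

    Range(long offset, byte[] data) {
      this.offset = offset;
      this.data = data;
    }

    boolean contains(long absolute) {
      return absolute >= offset && absolute < offset + data.length;
    }
  }

  private final List<Range> ranges;
  private final long offset;  // first byte offset of the stream
  private final long length;  // length of the stream in bytes
  private long position;      // bytes consumed so far, relative to offset

  RangeListStreamSketch(List<Range> ranges, long offset, long length) {
    this.ranges = ranges;
    this.offset = offset;
    this.length = length;
  }

  @Override
  public int read() {
    if (position >= length) {
      return -1; // the stream ends after 'length' bytes
    }
    long absolute = offset + position;
    for (Range range : ranges) {
      if (range.contains(absolute)) {
        position++;
        return range.data[(int) (absolute - range.offset)] & 0xff;
      }
    }
    throw new IllegalStateException("No range covers offset " + absolute);
  }

  public static void main(String[] args) {
    List<Range> ranges = Arrays.asList(
        new Range(100, new byte[] {1, 2, 3}),
        new Range(103, new byte[] {4, 5}));
    // The stream starts at absolute offset 101 and spans 3 bytes across both ranges.
    RangeListStreamSketch in = new RangeListStreamSketch(ranges, 101, 3);
    for (int b = in.read(); b != -1; b = in.read()) {
      System.out.println(b);
    }
  }
}
```

      In this model the ranges may cover more bytes than the stream itself; offset and length select the window that belongs to the stream.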
    • createCodedInputStream

      public static com.google.protobuf.CodedInputStream createCodedInputStream(InStream inStream)
      Creates a coded input stream (used for protobuf message parsing) with a higher message size limit.
      Parameters:
      inStream - the stream to wrap.
      Returns:
      coded input stream