Package org.apache.orc.impl
Class SerializationUtils
java.lang.Object
org.apache.orc.impl.SerializationUtils
-
Method Summary
Modifier and Type | Method | Description
long | bytesToLongBE(InStream input, int n) | Read n bytes in big endian order and convert to long
static String | bytesVectorToString(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector vector, int elementNum) | Convert a bytes vector element into a String.
static long | convertBetweenTimezones(TimeZone writer, TimeZone reader, long millis) | Find the relative offset when moving between timezones at a particular point in time.
static double | convertFromUtc(TimeZone local, double time) | Convert a UTC time to a local timezone
static long | convertFromUtc(TimeZone local, long time) |
static long | convertToUtc(TimeZone local, long time) |
static int | decodeBitWidth(int n) | Decodes the ordinal fixed bit value to actual fixed bit width value
int | encodeBitWidth(int n) | Finds the closest available fixed bit width match and returns its encoded value (ordinal).
int | findClosestNumBits(long value) | Count the number of bits required to encode the given value
int | getClosestAlignedFixedBits(int n) |
int | getClosestFixedBits(int n) | For a given fixed bit width, return the closest available fixed bit width
static StreamOptions | getCustomizedCodec(StreamOptions base, OrcFile.CompressionStrategy strategy, OrcProto.Stream.Kind kind) | Get the stream options with the compression tuned for the particular kind of stream.
boolean | isSafeSubtract(long left, long right) |
static Date | parseDateFromString(String string) | Parse a date from a string.
int | percentileBits(long[] data, int offset, int length, double p) | Compute the bits required to represent pth percentile value
static BigInteger | readBigInteger(InputStream input) | Read the signed arbitrary sized BigInteger in vint format
double | readDouble |
float | readFloat(InputStream in) |
void | readInts(long[] buffer, int offset, int len, int bitSize, InStream input) | Read bitpacked integers from input stream
long | readLongLE |
static long | readVslong |
static long | readVulong |
void | skipDouble(InputStream in, int numOfDoubles) |
void | skipFloat(InputStream in, int numOfFloats) |
static void | writeBigInteger(OutputStream output, BigInteger value) | Write the arbitrarily sized signed BigInteger in vint format.
void | writeDouble(OutputStream output, double value) |
void | writeFloat(OutputStream output, float value) |
void | writeInts(long[] input, int offset, int len, int bitSize, OutputStream output) | Bitpack and write the input values to underlying output stream
void | writeVslong(OutputStream output, long value) |
void | writeVulong(OutputStream output, long value) |
long | zigzagDecode(long val) | zigzag decode the given value
long | zigzagEncode(long val) | zigzag encode the given value
-
Constructor Details
-
SerializationUtils
public SerializationUtils()
-
-
Method Details
-
writeVulong
- Throws:
IOException
-
writeVslong
- Throws:
IOException
-
readVulong
- Throws:
IOException
-
readVslong
- Throws:
IOException
-
readFloat
- Throws:
IOException
-
skipFloat
- Throws:
IOException
-
writeFloat
- Throws:
IOException
-
readDouble
- Throws:
IOException
-
readLongLE
- Throws:
IOException
-
skipDouble
- Throws:
IOException
-
writeDouble
- Throws:
IOException
-
writeBigInteger
Write the arbitrarily sized signed BigInteger in vint format. Signed integers are encoded using the low bit as the sign bit using zigzag encoding. Each byte uses the low 7 bits for data and the high bit for stop/continue. Bytes are stored LSB first.
- Parameters:
output - the stream to write to
value - the value to output
- Throws:
IOException
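The encoding described for writeBigInteger can be illustrated with a minimal sketch. The class BigIntegerVintSketch and its write/read helpers are hypothetical names, written against plain java.io streams. This is not the library's implementation, only one way to realize zigzag encoding followed by 7-bit groups stored LSB first with the high bit as a continuation flag.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class BigIntegerVintSketch {

    // Zigzag-encode the value, then emit 7 data bits per byte, least significant group first.
    static void write(OutputStream output, BigInteger value) throws IOException {
        // Zigzag: shift left by one; for negative inputs flip all bits so the low bit marks the sign.
        BigInteger z = value.shiftLeft(1);
        if (value.signum() < 0) {
            z = z.not();
        }
        while (true) {
            int low = z.intValue() & 0x7f;   // low 7 bits of the remaining value
            z = z.shiftRight(7);
            if (z.signum() == 0) {
                output.write(low);           // high bit clear: last byte
                return;
            }
            output.write(low | 0x80);        // high bit set: more bytes follow
        }
    }

    // Read 7-bit groups until a byte with the high bit clear, then undo the zigzag step.
    static BigInteger read(InputStream input) throws IOException {
        BigInteger z = BigInteger.ZERO;
        int shift = 0;
        int b;
        do {
            b = input.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a vint");
            }
            z = z.or(BigInteger.valueOf(b & 0x7f).shiftLeft(shift));
            shift += 7;
        } while ((b & 0x80) != 0);
        BigInteger half = z.shiftRight(1);
        return z.testBit(0) ? half.not() : half;
    }
}
```

Under these assumptions the value 300 zigzag-encodes to 600 and is written as the two bytes 0xD8 0x04.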
-
readBigInteger
Read the signed, arbitrarily sized BigInteger in vint format.
- Parameters:
input - the stream to read from
- Returns:
the read BigInteger
- Throws:
IOException
-
findClosestNumBits
public int findClosestNumBits(long value)
Count the number of bits required to encode the given value.
- Parameters:
value - the value to encode
- Returns:
bits required to store value
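As a rough illustration of counting the bits needed for a value, the sketch below uses Long.numberOfLeadingZeros. The class NumBitsSketch and the method bitsRequired are hypothetical. The library method may additionally round the result to a supported fixed bit width, which is an assumption not modeled here.

```java
// Hypothetical illustration class, not part of org.apache.orc.impl.
final class NumBitsSketch {

    // Number of bits from the highest set bit down to bit 0.
    // Returns 0 for the value 0 and 64 for any negative value (sign bit set).
    static int bitsRequired(long value) {
        return 64 - Long.numberOfLeadingZeros(value);
    }
}
```

With this definition, 255 needs 8 bits and 256 needs 9.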
-
zigzagEncode
public long zigzagEncode(long val)
Zigzag encode the given value.
- Parameters:
val - the value to encode
- Returns:
zigzag encoded value
-
zigzagDecode
public long zigzagDecode(long val)
Zigzag decode the given value.
- Parameters:
val - the value to decode
- Returns:
zigzag decoded value
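Zigzag encoding maps signed values to unsigned ones so that small magnitudes, positive or negative, become small numbers. The sketch below shows the usual 64-bit formulation. The class ZigZagSketch is a hypothetical name, and the code is offered as an illustration of the technique rather than the library source.

```java
// Hypothetical illustration class, not part of org.apache.orc.impl.
final class ZigZagSketch {

    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static long encode(long val) {
        // The arithmetic shift yields all ones for negatives, all zeros otherwise.
        return (val << 1) ^ (val >> 63);
    }

    // Inverse of encode: the low bit carries the sign.
    static long decode(long val) {
        return (val >>> 1) ^ -(val & 1);
    }
}
```

The mapping is a bijection on long, so decode(encode(x)) returns x for every input, including Long.MIN_VALUE.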
-
percentileBits
public int percentileBits(long[] data, int offset, int length, double p)
Compute the number of bits required to represent the pth percentile value.
- Parameters:
data - array
p - percentile value (>=0.0 to <=1.0)
- Returns:
pth percentile bits
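One simple way to picture this computation is to sort the slice, pick the value at the pth percentile, and count its bits. The sketch below does exactly that. The class PercentileBitsSketch is hypothetical, the sort-based approach is an assumption chosen for clarity (the library may use a different method), and the inputs are assumed to be non-negative.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class PercentileBitsSketch {

    // Bits needed for the value at the pth percentile of data[offset .. offset + length).
    static int percentileBits(long[] data, int offset, int length, double p) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0.0, 1.0]");
        }
        if (length <= 0) {
            return 0;
        }
        long[] sorted = Arrays.copyOfRange(data, offset, offset + length);
        Arrays.sort(sorted);
        // Nearest-rank percentile index, clamped so that p = 0.0 selects the minimum.
        int index = Math.max(0, (int) Math.ceil(p * length) - 1);
        return 64 - Long.numberOfLeadingZeros(sorted[index]);
    }
}
```

A percentile below 1.0 lets a caller size a bit width for the bulk of the values while treating a few large outliers separately.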
-
bytesToLongBE
Read n bytes in big endian order and convert to long.
- Returns:
long value
- Throws:
IOException
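Reading n bytes in big endian order means the first byte read becomes the most significant. A minimal sketch follows, assuming a plain java.io.InputStream as a stand-in for InStream. The class BigEndianSketch is a hypothetical name.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class BigEndianSketch {

    // Accumulate n bytes (n between 0 and 8), most significant byte first.
    static long bytesToLongBE(InputStream input, int n) throws IOException {
        long result = 0;
        for (int i = 0; i < n; i++) {
            int b = input.read();           // 0..255, or -1 at end of stream
            if (b < 0) {
                throw new EOFException("expected " + n + " bytes");
            }
            result = (result << 8) | b;
        }
        return result;
    }
}
```

For example, the bytes 0x01 0x02 read with n = 2 give 258.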
-
getClosestFixedBits
public int getClosestFixedBits(int n)
For a given fixed bit width, return the closest available fixed bit width.
- Parameters:
n - the requested fixed bit width
- Returns:
closest valid fixed bit width
-
getClosestAlignedFixedBits
public int getClosestAlignedFixedBits(int n)
-
encodeBitWidth
public int encodeBitWidth(int n)
Finds the closest available fixed bit width match and returns its encoded value (ordinal).
- Parameters:
n - fixed bit width to encode
- Returns:
encoded fixed bit width
-
decodeBitWidth
public static int decodeBitWidth(int n)
Decodes the ordinal fixed bit value to the actual fixed bit width value.
- Parameters:
n - encoded fixed bit width
- Returns:
decoded fixed bit width
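The relationship between a bit width, its closest available width, and its ordinal can be sketched with a lookup table. The table below assumes the set of widths used by ORC run-length encoding version 2 (1 through 24, then 26, 28, 30, 32, 40, 48, 56, 64). That set, the class BitWidthSketch, and its method names are assumptions for illustration and are not taken from this page.

```java
// Hypothetical illustration class, not part of org.apache.orc.impl.
final class BitWidthSketch {

    // Assumed table of available widths; the array index is the encoded ordinal.
    static final int[] WIDTHS = {
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 26, 28, 30, 32, 40, 48, 56, 64
    };

    // Smallest available width that can hold n bits.
    static int closest(int n) {
        for (int width : WIDTHS) {
            if (width >= n) {
                return width;
            }
        }
        return 64;
    }

    // Ordinal of the closest available width.
    static int encode(int n) {
        int target = closest(n);
        for (int ordinal = 0; ordinal < WIDTHS.length; ordinal++) {
            if (WIDTHS[ordinal] == target) {
                return ordinal;
            }
        }
        throw new IllegalStateException("unreachable: closest() returns a table entry");
    }

    // Width for a given ordinal.
    static int decode(int ordinal) {
        return WIDTHS[ordinal];
    }
}
```

Under this assumed table, a requested width of 25 rounds up to 26, which encodes to ordinal 24 and decodes back to 26. The 32 entries fit in a 5-bit field.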
-
writeInts
public void writeInts(long[] input, int offset, int len, int bitSize, OutputStream output) throws IOException
Bitpack and write the input values to the underlying output stream.
- Parameters:
input - values to write
offset - offset
len - length
bitSize - bit width
output - output stream
- Throws:
IOException
-
readInts
public void readInts(long[] buffer, int offset, int len, int bitSize, InStream input) throws IOException
Read bitpacked integers from the input stream.
- Parameters:
buffer - input buffer
offset - offset
len - length
bitSize - bit width
input - input stream
- Throws:
IOException
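Bit packing stores each value in exactly bitSize bits with no padding between values. The sketch below packs and unpacks one bit at a time, most significant bit first, and pads the final byte with zeros. The class BitPackSketch is hypothetical, java.io.InputStream stands in for InStream, and the bit order is an assumption. A production implementation would process whole bytes for speed.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class BitPackSketch {

    // Write the low bitSize bits of each value, most significant bit first.
    static void writeInts(long[] input, int offset, int len, int bitSize,
                          OutputStream output) throws IOException {
        int current = 0;   // byte being assembled
        int filled = 0;    // number of bits already placed in current
        for (int i = offset; i < offset + len; i++) {
            for (int bit = bitSize - 1; bit >= 0; bit--) {
                current = (current << 1) | (int) ((input[i] >>> bit) & 1L);
                if (++filled == 8) {
                    output.write(current);
                    current = 0;
                    filled = 0;
                }
            }
        }
        if (filled > 0) {
            output.write(current << (8 - filled));   // zero-pad the last byte
        }
    }

    // Read len values of bitSize bits each into buffer, starting at offset.
    static void readInts(long[] buffer, int offset, int len, int bitSize,
                         InputStream input) throws IOException {
        int current = 0;
        int bitsLeft = 0;  // unread bits remaining in current
        for (int i = offset; i < offset + len; i++) {
            long value = 0;
            for (int bit = 0; bit < bitSize; bit++) {
                if (bitsLeft == 0) {
                    current = input.read();
                    if (current < 0) {
                        throw new EOFException("stream ended inside packed data");
                    }
                    bitsLeft = 8;
                }
                bitsLeft--;
                value = (value << 1) | ((current >>> bitsLeft) & 1);
            }
            buffer[i] = value;
        }
    }
}
```

With a bit width of 2, the values 1, 2, 3 pack into the single byte 0x6C (bits 01 10 11 followed by two padding zeros).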
-
isSafeSubtract
public boolean isSafeSubtract(long left, long right)
-
convertFromUtc
Convert a UTC time to a local timezone.
- Parameters:
local - the local timezone
time - the number of seconds since 1970
- Returns:
the converted timestamp
-
convertFromUtc
-
convertToUtc
-
getCustomizedCodec
public static StreamOptions getCustomizedCodec(StreamOptions base, OrcFile.CompressionStrategy strategy, OrcProto.Stream.Kind kind)
Get the stream options with the compression tuned for the particular kind of stream.
- Parameters:
base - the original options
strategy - the compression strategy
kind - the stream kind
- Returns:
the tuned options or the original if it is the same
-
convertBetweenTimezones
Find the relative offset when moving between timezones at a particular point in time. This is a function of ORC v0 and v1 writing timestamps relative to the local timezone. Therefore, when we read, we need to convert from the writer's timezone to the reader's timezone.
- Parameters:
writer - the timezone we are moving from
reader - the timezone we are moving to
millis - the point in time
- Returns:
the change in milliseconds
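To a first approximation, the relative offset between two timezones at an instant is the difference of their UTC offsets at that instant. The sketch below computes only that difference. The class TimezoneShiftSketch, the method name shift, and the sign convention (writer offset minus reader offset) are assumptions, and any extra handling the library may apply around daylight saving transitions is not modeled.

```java
import java.util.TimeZone;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class TimezoneShiftSketch {

    // Difference between the writer's and the reader's UTC offsets at the given instant.
    static long shift(TimeZone writer, TimeZone reader, long millis) {
        long writerOffset = writer.getOffset(millis);
        long readerOffset = reader.getOffset(millis);
        return writerOffset - readerOffset;
    }
}
```

For a writer at GMT+02:00 and a reader in UTC, this yields 7,200,000 milliseconds at any instant. When both timezones are the same, the result is 0.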
-
bytesVectorToString
public static String bytesVectorToString(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector vector, int elementNum)
Convert a bytes vector element into a String.
- Parameters:
vector - the vector to use
elementNum - the element number to stringify
- Returns:
a string or null if the value was null
-
parseDateFromString
Parse a date from a string.
- Parameters:
string - the date to parse (YYYY-MM-DD)
- Returns:
the Date parsed, or null if there was a parse error.
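The "return null on a parse error" contract can be sketched with java.time. The class DateParseSketch and the method parseOrNull are hypothetical, and the sketch returns a java.time.LocalDate rather than the Date type of the library method. That substitution is an assumption made to keep the example self-contained.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical illustration class, not part of org.apache.orc.impl.
final class DateParseSketch {

    // Parse a YYYY-MM-DD string; return null instead of throwing on bad input.
    static LocalDate parseOrNull(String string) {
        if (string == null) {
            return null;
        }
        try {
            return LocalDate.parse(string.trim());   // ISO-8601 calendar date
        } catch (DateTimeParseException e) {
            return null;
        }
    }
}
```

Returning null pushes error handling to the caller, so callers of this sketch should check the result before use.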
-