Package org.apache.orc.impl
Class SerializationUtils
java.lang.Object
org.apache.orc.impl.SerializationUtils
-
Constructor Summary
-
Method Summary
Modifier and Type / Method / Description
long bytesToLongBE(InStream input, int n)
    Read n bytes in big endian order and convert to long.
static String bytesVectorToString(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector vector, int elementNum)
    Convert a bytes vector element into a String.
static long convertBetweenTimezones(TimeZone writer, TimeZone reader, long millis)
    Find the relative offset when moving between timezones at a particular point in time.
static double convertFromUtc(TimeZone local, double time)
    Convert a UTC time to a local timezone.
static long convertFromUtc(TimeZone local, long time)
static long convertToUtc(TimeZone local, long time)
static int decodeBitWidth(int n)
    Decodes the ordinal fixed bit value to the actual fixed bit width value.
int encodeBitWidth(int n)
    Finds the closest available fixed bit width match and returns its encoded value (ordinal).
int findClosestNumBits(long value)
    Count the number of bits required to encode the given value.
int getClosestAlignedFixedBits(int n)
int getClosestFixedBits(int n)
    For a given fixed bit width, return the closest available fixed bit width.
static StreamOptions getCustomizedCodec(StreamOptions base, OrcFile.CompressionStrategy strategy, OrcProto.Stream.Kind kind)
    Get the stream options with the compression tuned for the particular kind of stream.
boolean isSafeSubtract(long left, long right)
static Date parseDateFromString(String string)
    Parse a date from a string.
int percentileBits(long[] data, int offset, int length, double p)
    Compute the bits required to represent the pth percentile value.
static BigInteger readBigInteger(InputStream input)
    Read the signed, arbitrarily sized BigInteger in vint format.
double readDouble
float readFloat(InputStream in)
void readInts(long[] buffer, int offset, int len, int bitSize, InStream input)
    Read bitpacked integers from the input stream.
long readLongLE
static long readVslong
static long readVulong
void skipDouble(InputStream in, int numOfDoubles)
void skipFloat(InputStream in, int numOfFloats)
static void writeBigInteger(OutputStream output, BigInteger value)
    Write the arbitrarily sized signed BigInteger in vint format.
void writeDouble(OutputStream output, double value)
void writeFloat(OutputStream output, float value)
void writeInts(long[] input, int offset, int len, int bitSize, OutputStream output)
    Bitpack and write the input values to the underlying output stream.
void writeVslong(OutputStream output, long value)
void writeVulong(OutputStream output, long value)
long zigzagDecode(long val)
    Zigzag decode the given value.
long zigzagEncode(long val)
    Zigzag encode the given value.
-
Constructor Details
-
SerializationUtils
public SerializationUtils()
-
-
Method Details
-
writeVulong
- Throws:
IOException
-
writeVslong
- Throws:
IOException
-
readVulong
- Throws:
IOException
-
readVslong
- Throws:
IOException
-
readFloat
- Throws:
IOException
-
skipFloat
- Throws:
IOException
-
writeFloat
- Throws:
IOException
-
readDouble
- Throws:
IOException
-
readLongLE
- Throws:
IOException
-
skipDouble
- Throws:
IOException
-
writeDouble
- Throws:
IOException
-
writeBigInteger
Write the arbitrarily sized signed BigInteger in vint format. Signed integers are encoded using the low bit as the sign bit using zigzag encoding. Each byte uses the low 7 bits for data and the high bit for stop/continue. Bytes are stored LSB first.
- Parameters:
output - the stream to write to
value - the value to output
- Throws:
IOException
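The byte layout described for writeBigInteger can be sketched as follows. This is an illustrative reimplementation based only on the description above, not the library's code; the class BigIntegerVintSketch and its methods are hypothetical names, and treating a clear high bit as "stop" is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.math.BigInteger;

// Hypothetical sketch class; not part of org.apache.orc.
final class BigIntegerVintSketch {

    static void write(OutputStream out, BigInteger value) throws IOException {
        // Zigzag: v >= 0 maps to 2v, v < 0 maps to -2v - 1, so the low bit carries the sign.
        BigInteger v = value.shiftLeft(1);
        if (v.signum() < 0) {
            v = v.negate().subtract(BigInteger.ONE);
        }
        // Emit 7 data bits per byte, least significant group first.
        // Assumption: the high bit is set on every byte except the last one.
        while (true) {
            int low = v.intValue() & 0x7f;
            v = v.shiftRight(7);
            if (v.signum() == 0) {
                out.write(low);
                return;
            }
            out.write(low | 0x80);
        }
    }

    static BigInteger read(InputStream in) throws IOException {
        BigInteger result = BigInteger.ZERO;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new EOFException("truncated vint");
            }
            result = result.or(BigInteger.valueOf(b & 0x7f).shiftLeft(shift));
            shift += 7;
        } while ((b & 0x80) != 0);
        // Undo zigzag: low bit is the sign, remaining bits are the magnitude.
        BigInteger magnitude = result.shiftRight(1);
        return result.testBit(0) ? magnitude.add(BigInteger.ONE).negate() : magnitude;
    }

    // Convenience wrappers over in-memory buffers.
    static byte[] encode(BigInteger value) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            write(out, value);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static BigInteger decode(byte[] bytes) {
        try {
            return read(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions the value 64 zigzags to 128 and is written as the two bytes 0x80 0x01.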
-
readBigInteger
Read the signed, arbitrarily sized BigInteger in vint format.
- Parameters:
input - the stream to read from
- Returns:
the BigInteger that was read
- Throws:
IOException
-
findClosestNumBits
public int findClosestNumBits(long value)
Count the number of bits required to encode the given value.
- Parameters:
value - the value to encode
- Returns:
bits required to store value
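A minimal sketch of counting the bits needed for a value is shown here. It treats the long as unsigned and does no rounding; the method name suggests the real implementation may additionally snap the result to a supported width, which this sketch does not attempt. The class name NumBitsSketch is hypothetical.

```java
// Hypothetical sketch class; not part of org.apache.orc.
final class NumBitsSketch {

    // Bits needed to hold the value when it is interpreted as unsigned.
    // Negative inputs therefore need all 64 bits; callers would normally
    // zigzag-encode signed values first.
    static int numBits(long value) {
        return 64 - Long.numberOfLeadingZeros(value);
    }
}
```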
-
zigzagEncode
public long zigzagEncode(long val)
Zigzag encode the given value.
- Parameters:
val - the value to encode
- Returns:
zigzag encoded value
-
zigzagDecode
public long zigzagDecode(long val)
Zigzag decode the given value.
- Parameters:
val - the value to decode
- Returns:
zigzag decoded value
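Zigzag encoding maps signed values to unsigned ones so that small magnitudes stay small: 0, -1, 1, -2 become 0, 1, 2, 3. The following is the standard 64-bit formulation, given as a sketch; whether the library uses exactly these expressions is not stated in this page, and ZigZagSketch is a hypothetical name.

```java
// Hypothetical sketch class; not part of org.apache.orc.
final class ZigZagSketch {

    // Shift the magnitude left by one and fold the sign into the low bit.
    static long encode(long val) {
        return (val << 1) ^ (val >> 63);
    }

    // Inverse: drop the sign bit, then flip all bits if the sign bit was set.
    static long decode(long val) {
        return (val >>> 1) ^ -(val & 1);
    }
}
```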
-
percentileBits
public int percentileBits(long[] data, int offset, int length, double p)
Compute the bits required to represent the pth percentile value.
- Parameters:
data - array
p - percentile value (>=0.0 to <=1.0)
- Returns:
pth percentile bits
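One way to compute a percentile bit width is to build a histogram of per-value bit widths and walk it from the widest bucket down until more than the allowed share of outliers has been seen. The sketch below does that with raw widths 0 to 64; it is an assumption about the approach, not the library's code, and it assumes the inputs are non-negative (for example already zigzag-encoded). PercentileBitsSketch is a hypothetical name.

```java
// Hypothetical sketch class; not part of org.apache.orc.
final class PercentileBitsSketch {

    static int percentileBits(long[] data, int offset, int length, double p) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be within [0.0, 1.0]");
        }
        // hist[w] counts the values whose unsigned width is exactly w bits.
        int[] hist = new int[65];
        for (int i = offset; i < offset + length; i++) {
            hist[64 - Long.numberOfLeadingZeros(data[i])]++;
        }
        // Number of the widest values allowed to exceed the returned width.
        int allowedOutliers = (int) (length * (1.0 - p));
        for (int bits = 64; bits >= 0; bits--) {
            allowedOutliers -= hist[bits];
            if (allowedOutliers < 0) {
                return bits;
            }
        }
        return 0;
    }
}
```

With p = 1.0 the result is simply the width of the largest value; lower values of p let a few large outliers be ignored, which is useful when those outliers can be stored separately.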
-
bytesToLongBE
Read n bytes in big endian order and convert to long.
- Returns:
long value
- Throws:
IOException
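Reading n bytes in big endian order means the first byte read becomes the most significant. A minimal sketch over a plain java.io.InputStream (rather than the library's InStream type) could look like this; BigEndianSketch and fromBytes are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch class; not part of org.apache.orc.
final class BigEndianSketch {

    static long bytesToLongBE(InputStream input, int n) throws IOException {
        long out = 0;
        for (int i = 0; i < n; i++) {
            int b = input.read();           // 0..255, or -1 at end of stream
            if (b < 0) {
                throw new EOFException("expected " + n + " bytes");
            }
            out = (out << 8) | b;           // earlier bytes end up more significant
        }
        return out;
    }

    // Convenience wrapper over an in-memory buffer.
    static long fromBytes(byte[] bytes, int n) {
        try {
            return bytesToLongBE(new ByteArrayInputStream(bytes), n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```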
-
getClosestFixedBits
public int getClosestFixedBits(int n)
For a given fixed bit width, return the closest available fixed bit width.
- Parameters:
n - fixed bit width
- Returns:
closest valid fixed bit
-
getClosestAlignedFixedBits
public int getClosestAlignedFixedBits(int n)
-
encodeBitWidth
public int encodeBitWidth(int n)
Finds the closest available fixed bit width match and returns its encoded value (ordinal).
- Parameters:
n - fixed bit width to encode
- Returns:
encoded fixed bit width
-
decodeBitWidth
public static int decodeBitWidth(int n)
Decodes the ordinal fixed bit value to the actual fixed bit width value.
- Parameters:
n - encoded fixed bit width
- Returns:
decoded fixed bit width
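The relationship between getClosestFixedBits, encodeBitWidth and decodeBitWidth can be sketched with a lookup table of supported widths indexed by ordinal. The table below (1 through 24, then 26, 28, 30, 32, 40, 48, 56, 64) and the choice to round up to the next supported width are assumptions taken from the ORC run-length encoding format as commonly documented, not from this page; FixedBitWidthSketch is a hypothetical name.

```java
// Hypothetical sketch class; not part of org.apache.orc.
final class FixedBitWidthSketch {

    // Assumed supported widths; the array index is the encoded ordinal.
    private static final int[] WIDTHS = new int[32];

    static {
        for (int i = 0; i < 24; i++) {
            WIDTHS[i] = i + 1;
        }
        int[] tail = {26, 28, 30, 32, 40, 48, 56, 64};
        System.arraycopy(tail, 0, WIDTHS, 24, tail.length);
    }

    // Smallest supported width that can hold n bits (assumed rounding rule).
    static int closestFixedBits(int n) {
        return WIDTHS[encode(n)];
    }

    // Ordinal of the smallest supported width >= n, capped at the last entry.
    static int encode(int n) {
        for (int i = 0; i < WIDTHS.length; i++) {
            if (WIDTHS[i] >= n) {
                return i;
            }
        }
        return WIDTHS.length - 1;
    }

    // Width stored at the given ordinal.
    static int decode(int ordinal) {
        return WIDTHS[ordinal];
    }
}
```

Storing the ordinal instead of the width itself lets any supported width fit in 5 bits.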
-
writeInts
public void writeInts(long[] input, int offset, int len, int bitSize, OutputStream output) throws IOException
Bitpack and write the input values to the underlying output stream.
- Parameters:
input - values to write
offset - offset
len - length
bitSize - bit width
output - output stream
- Throws:
IOException
-
readInts
public void readInts(long[] buffer, int offset, int len, int bitSize, InStream input) throws IOException
Read bitpacked integers from the input stream.
- Parameters:
buffer - input buffer
offset - offset
len - length
bitSize - bit width
input - input stream
- Throws:
IOException
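Bitpacking stores each value in exactly bitSize bits, back to back, so values can straddle byte boundaries. The sketch below packs and unpacks with the most significant bit first and pads the final byte with zeros; that bit order is an assumption, and the in-memory byte arrays stand in for the streams used by writeInts and readInts. BitPackSketch is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch class; not part of org.apache.orc.
final class BitPackSketch {

    static byte[] pack(long[] input, int offset, int len, int bitSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int current = 0;   // byte being assembled
        int bitsLeft = 8;  // free bits remaining in current
        for (int i = offset; i < offset + len; i++) {
            long value = input[i];
            int bitsToWrite = bitSize;
            while (bitsToWrite > bitsLeft) {
                // Fill the rest of the current byte with the top unwritten bits.
                current |= (int) ((value >>> (bitsToWrite - bitsLeft)) & ((1 << bitsLeft) - 1));
                bitsToWrite -= bitsLeft;
                out.write(current);
                current = 0;
                bitsLeft = 8;
            }
            bitsLeft -= bitsToWrite;
            current |= (int) ((value & ((1L << bitsToWrite) - 1)) << bitsLeft);
            if (bitsLeft == 0) {
                out.write(current);
                current = 0;
                bitsLeft = 8;
            }
        }
        if (bitsLeft != 8) {
            out.write(current); // flush the partially filled last byte
        }
        return out.toByteArray();
    }

    static long[] unpack(byte[] packed, int len, int bitSize) {
        long[] result = new long[len];
        int pos = 0;       // index of the next byte to load
        int current = 0;   // byte being consumed
        int bitsLeft = 0;  // unread bits remaining in current
        for (int i = 0; i < len; i++) {
            long value = 0;
            int bitsToRead = bitSize;
            while (bitsToRead > bitsLeft) {
                // Take everything left in the current byte, then load the next one.
                value = (value << bitsLeft) | (current & ((1 << bitsLeft) - 1));
                bitsToRead -= bitsLeft;
                current = packed[pos++] & 0xff;
                bitsLeft = 8;
            }
            bitsLeft -= bitsToRead;
            value = (value << bitsToRead) | ((current >>> bitsLeft) & ((1 << bitsToRead) - 1));
            result[i] = value;
        }
        return result;
    }
}
```

For example, packing the values 1, 2, 3 at a width of 2 bits yields the bit string 01 10 11 followed by two padding zeros, which is the single byte 0x6C.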
-
isSafeSubtract
public boolean isSafeSubtract(long left, long right)
-
convertFromUtc
Convert a UTC time to a local timezone.
- Parameters:
local - the local timezone
time - the number of seconds since 1970
- Returns:
the converted timestamp
-
convertFromUtc
-
convertToUtc
-
getCustomizedCodec
public static StreamOptions getCustomizedCodec(StreamOptions base, OrcFile.CompressionStrategy strategy, OrcProto.Stream.Kind kind)
Get the stream options with the compression tuned for the particular kind of stream.
- Parameters:
base - the original options
strategy - the compression strategy
kind - the stream kind
- Returns:
the tuned options or the original if it is the same
-
convertBetweenTimezones
Find the relative offset when moving between timezones at a particular point in time. This is needed because ORC v0 and v1 write timestamps relative to the local timezone. Therefore, when we read, we need to convert from the writer's timezone to the reader's timezone.
- Parameters:
writer - the timezone we are moving from
reader - the timezone we are moving to
millis - the point in time
- Returns:
the change in milliseconds
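A hedged sketch of computing such a relative offset with java.util.TimeZone follows. The sign convention (writer offset minus reader offset) and the second lookup to handle a daylight-saving boundary are assumptions about how this could be done, not a statement of the library's behavior; TimezoneShiftSketch and offsetBetween are hypothetical names.

```java
import java.util.TimeZone;

// Hypothetical sketch class; not part of org.apache.orc.
final class TimezoneShiftSketch {

    static long offsetBetween(TimeZone writer, TimeZone reader, long millis) {
        long writerOffset = writer.getOffset(millis);
        long readerOffset = reader.getOffset(millis);
        // Shifting by the offset difference may cross a daylight-saving
        // boundary in the reader's zone, so look the reader offset up again
        // at the shifted instant.
        long adjusted = millis + writerOffset - readerOffset;
        return writerOffset - reader.getOffset(adjusted);
    }
}
```

With fixed-offset zones the result is just the difference of the two offsets, for example -7200000 milliseconds when moving from UTC to GMT+02:00.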
-
bytesVectorToString
public static String bytesVectorToString(org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector vector, int elementNum)
Convert a bytes vector element into a String.
- Parameters:
vector - the vector to use
elementNum - the element number to stringify
- Returns:
a string or null if the value was null
-
parseDateFromString
Parse a date from a string.
- Parameters:
string - the date to parse (YYYY-MM-DD)
- Returns:
the Date parsed, or null if there was a parse error.
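The "return null on a parse error" contract can be illustrated with java.time. This sketch returns a java.time.LocalDate rather than the Date type named above, purely to stay self-contained; DateParseSketch and parseOrNull are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical sketch class; not part of org.apache.orc.
final class DateParseSketch {

    // Parses a YYYY-MM-DD string; returns null instead of throwing on bad input.
    static LocalDate parseOrNull(String s) {
        try {
            return LocalDate.parse(s); // uses the ISO local date format
        } catch (DateTimeParseException e) {
            return null;
        }
    }
}
```

Callers then need a null check instead of a try/catch, which matches the return description above.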
-