Class TypeDescription

java.lang.Object
org.apache.orc.TypeDescription
All Implemented Interfaces:
Serializable, Cloneable, Comparable<TypeDescription>

public class TypeDescription extends Object implements Comparable<TypeDescription>, Serializable, Cloneable
This is the description of the types in an ORC file.
  • Method Details

    • compareTo

      public int compareTo(TypeDescription other)
      Specified by:
      compareTo in interface Comparable<TypeDescription>
    • createBoolean

      public static TypeDescription createBoolean()
    • createByte

      public static TypeDescription createByte()
    • createShort

      public static TypeDescription createShort()
    • createInt

      public static TypeDescription createInt()
    • createLong

      public static TypeDescription createLong()
    • createFloat

      public static TypeDescription createFloat()
    • createDouble

      public static TypeDescription createDouble()
    • createString

      public static TypeDescription createString()
    • createDate

      public static TypeDescription createDate()
    • createTimestamp

      public static TypeDescription createTimestamp()
    • createTimestampInstant

      public static TypeDescription createTimestampInstant()
    • createBinary

      public static TypeDescription createBinary()
    • createDecimal

      public static TypeDescription createDecimal()
    • fromString

      public static TypeDescription fromString(String typeName)
      Parse a TypeDescription from a Hive type name. This is the inverse of TypeDescription.toString().
      Parameters:
      typeName - the name of the type
      Returns:
      a new TypeDescription or null if typeName was null
      Throws:
      IllegalArgumentException - if the string is badly formed
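      A minimal sketch of the round trip between fromString and toString, assuming orc-core is on the classpath. The class name FromStringExample and the schema text are illustrative only.

```java
import org.apache.orc.TypeDescription;

final class FromStringExample {
    // Parse a Hive-style type string into a schema tree.
    static TypeDescription parse(String typeName) {
        return TypeDescription.fromString(typeName);
    }

    public static void main(String[] args) {
        String text = "struct<id:bigint,name:string,price:decimal(10,2)>";
        TypeDescription schema = parse(text);
        // toString() renders the schema back into the same notation.
        System.out.println(schema);

        // A badly formed string (here, a missing '>') is rejected.
        try {
            parse("struct<id:bigint");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```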
    • withPrecision

      public TypeDescription withPrecision(int precision)
      For decimal types, set the precision.
      Parameters:
      precision - the new precision
      Returns:
      this
    • withScale

      public TypeDescription withScale(int scale)
      For decimal types, set the scale.
      Parameters:
      scale - the new scale
      Returns:
      this
    • setAttribute

      public TypeDescription setAttribute(@NotNull String key, String value)
      Set an attribute on this type.
      Parameters:
      key - the attribute name
      value - the attribute value or null to clear the value
      Returns:
      this for method chaining
    • removeAttribute

      public TypeDescription removeAttribute(@NotNull String key)
      Remove attribute on this type, if it is set.
      Parameters:
      key - the attribute name
      Returns:
      this for method chaining
    • createVarchar

      public static TypeDescription createVarchar()
    • createChar

      public static TypeDescription createChar()
    • withMaxLength

      public TypeDescription withMaxLength(int maxLength)
      Set the maximum length for char and varchar types.
      Parameters:
      maxLength - the maximum length
      Returns:
      this
    • createList

      public static TypeDescription createList(TypeDescription childType)
    • createMap

      public static TypeDescription createMap(TypeDescription keyType, TypeDescription valueType)
    • createUnion

      public static TypeDescription createUnion()
    • createStruct

      public static TypeDescription createStruct()
    • addUnionChild

      public TypeDescription addUnionChild(TypeDescription child)
      Add a child to a union type.
      Parameters:
      child - a new child type to add
      Returns:
      the union type.
    • addField

      public TypeDescription addField(String field, TypeDescription fieldType)
      Add a field to a struct type as it is built.
      Parameters:
      field - the field name
      fieldType - the type of the field
      Returns:
      the struct type
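      The factory and builder methods above can be chained to assemble a schema programmatically. The following is a sketch under the assumption that orc-core is on the classpath. The field names and the class name SchemaBuilderExample are made up for illustration. The decimal scale is set before the precision so that the scale never exceeds the precision part-way through the chain.

```java
import org.apache.orc.TypeDescription;

final class SchemaBuilderExample {
    // Build a struct with primitive, parameterized, and compound fields.
    static TypeDescription build() {
        return TypeDescription.createStruct()
            .addField("id", TypeDescription.createLong())
            .addField("name", TypeDescription.createVarchar().withMaxLength(50))
            .addField("price", TypeDescription.createDecimal().withScale(2).withPrecision(10))
            .addField("tags", TypeDescription.createList(TypeDescription.createString()))
            .addField("scores", TypeDescription.createMap(
                TypeDescription.createString(), TypeDescription.createDouble()))
            .addField("value", TypeDescription.createUnion()
                .addUnionChild(TypeDescription.createInt())
                .addUnionChild(TypeDescription.createString()));
    }

    public static void main(String[] args) {
        // Prints the schema in the same notation that fromString accepts.
        System.out.println(build());
    }
}
```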
    • getId

      public int getId()
      Get the id for this type. The first call assigns ids to every type in the tree, so it should not be called before the type is completely built.
      Returns:
      the sequential id
    • clone

      public TypeDescription clone()
      Overrides:
      clone in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object other)
      Overrides:
      equals in class Object
    • equals

      public boolean equals(Object other, boolean checkAttributes)
      Determines whether the two objects are equal. This method can either compare or ignore the type attributes, as desired.
      Parameters:
      other - the reference object with which to compare.
      checkAttributes - should the type attributes be considered?
      Returns:
      true if this object is the same as the other argument; false otherwise.
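      To illustrate the checkAttributes flag, the sketch below compares two structurally identical schemas where only one carries an attribute. It assumes orc-core is on the classpath. The attribute key "owner" and the class name AttributeEqualityExample are hypothetical.

```java
import org.apache.orc.TypeDescription;

final class AttributeEqualityExample {
    static TypeDescription plain() {
        return TypeDescription.fromString("struct<x:int>");
    }

    // Same structure as plain(), plus one attribute on the root type.
    static TypeDescription tagged() {
        return TypeDescription.fromString("struct<x:int>").setAttribute("owner", "etl");
    }

    public static void main(String[] args) {
        // Ignoring attributes, only the type structure is compared.
        System.out.println(plain().equals(tagged(), false));
        // Considering attributes, the extra "owner" attribute makes them differ.
        System.out.println(plain().equals(tagged(), true));
    }
}
```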
    • getMaximumId

      public int getMaximumId()
      Get the maximum id assigned to this type or its children. The first call assigns ids to every type in the tree, so it should not be called before the type is completely built.
      Returns:
      the maximum id assigned under this type
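      The relationship between getId and getMaximumId can be seen on a small nested schema. This sketch assumes orc-core is on the classpath and relies on the observed behavior that ids are handed out in pre-order, starting at 0 for the root. The schema and the class name TypeIdExample are illustrative.

```java
import org.apache.orc.TypeDescription;

final class TypeIdExample {
    static TypeDescription schema() {
        // Pre-order numbering: root=0, a=1, b=2, c=3, d=4, e=5.
        return TypeDescription.fromString(
            "struct<a:int,b:struct<c:string,d:double>,e:bigint>");
    }

    public static void main(String[] args) {
        TypeDescription root = schema();          // fully built before ids are requested
        TypeDescription b = root.getChildren().get(1);
        System.out.println(root.getId());         // id of the root
        System.out.println(b.getId());            // id of the nested struct
        System.out.println(b.getMaximumId());     // largest id inside b's subtree
        System.out.println(root.getMaximumId());  // largest id in the whole schema
    }
}
```

      Each type therefore covers the contiguous id range from getId() to getMaximumId().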
    • createRowBatch

      public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createRowBatch(TypeDescription.RowBatchVersion version, int size)
    • createRowBatchV2

      public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createRowBatchV2()
      Create a VectorizedRowBatch that uses Decimal64ColumnVector for short (p ≤ 18) decimals.
      Returns:
      a new VectorizedRowBatch
    • createRowBatch

      public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createRowBatch(int maxSize)
      Create a VectorizedRowBatch with the original ColumnVector types.
      Parameters:
      maxSize - the maximum size of the batch
      Returns:
      a new VectorizedRowBatch
    • createRowBatch

      public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createRowBatch()
      Create a VectorizedRowBatch with the original ColumnVector types.
      Returns:
      a new VectorizedRowBatch
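      A sketch of how a batch created from a schema is typically filled, assuming orc-core and hive-storage-api are on the classpath. The column layout (one int, one string) and the class name RowBatchExample are illustrative. The casts rely on int columns being backed by LongColumnVector and string columns by BytesColumnVector.

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.TypeDescription;

final class RowBatchExample {
    // Create a batch sized for `rows` rows and fill every slot.
    static VectorizedRowBatch fill(int rows) {
        TypeDescription schema = TypeDescription.fromString("struct<x:int,y:string>");
        VectorizedRowBatch batch = schema.createRowBatch(rows);
        LongColumnVector x = (LongColumnVector) batch.cols[0];
        BytesColumnVector y = (BytesColumnVector) batch.cols[1];
        for (int i = 0; i < rows; i++) {
            int row = batch.size++;               // claim the next row slot
            x.vector[row] = i;
            y.setVal(row, ("row-" + i).getBytes(StandardCharsets.UTF_8));
        }
        return batch;
    }

    public static void main(String[] args) {
        VectorizedRowBatch batch = fill(3);
        System.out.println(batch.size + " rows in " + batch.numCols + " columns");
    }
}
```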
    • getCategory

      public TypeDescription.Category getCategory()
      Get the kind of this type.
      Returns:
      the category for this type.
    • getMaxLength

      public int getMaxLength()
      Get the maximum length of the type. Only used for char and varchar types.
      Returns:
      the maximum length of the string type
    • getPrecision

      public int getPrecision()
      Get the precision of the decimal type.
      Returns:
      the number of digits for the precision.
    • getScale

      public int getScale()
      Get the scale of the decimal type.
      Returns:
      the number of digits for the scale.
    • getFieldNames

      public List<String> getFieldNames()
      For struct types, get the list of field names.
      Returns:
      the list of field names.
    • getAttributeNames

      public List<String> getAttributeNames()
      Get the list of attribute names defined on this type.
      Returns:
      a list of sorted attribute names
    • getAttributeValue

      public String getAttributeValue(String attributeName)
      Get the value of a given attribute.
      Parameters:
      attributeName - the name of the attribute
      Returns:
      the value of the attribute or null if it isn't set
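      The attribute methods (setAttribute, removeAttribute, getAttributeNames, getAttributeValue) can be exercised together as in the sketch below. It assumes orc-core is on the classpath. The attribute keys "pii" and "owner" and the class name AttributeExample are hypothetical.

```java
import java.util.List;
import org.apache.orc.TypeDescription;

final class AttributeExample {
    // Return the "email" field of a one-field schema with two attributes set.
    static TypeDescription annotatedField() {
        TypeDescription schema = TypeDescription.fromString("struct<email:string>");
        TypeDescription email = schema.getChildren().get(0);
        // setAttribute returns this, so calls can be chained.
        email.setAttribute("pii", "true").setAttribute("owner", "crm");
        return email;
    }

    public static void main(String[] args) {
        TypeDescription email = annotatedField();
        List<String> names = email.getAttributeNames();   // sorted attribute names
        System.out.println(names);
        System.out.println(email.getAttributeValue("pii"));
        email.removeAttribute("pii");
        // An attribute that is not set is reported as null.
        System.out.println(email.getAttributeValue("pii"));
    }
}
```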
    • getParent

      public TypeDescription getParent()
      Get the parent of the current type.
      Returns:
      the parent type, or null if this type is the root
    • getChildren

      public List<TypeDescription> getChildren()
      Get the subtypes of this type.
      Returns:
      the list of children types
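      The navigation getters (getCategory, getFieldNames, getChildren) are enough to walk a schema tree. The sketch below assumes orc-core is on the classpath. The class name SchemaWalkExample and the output layout are illustrative, and the null check on getChildren is a defensive assumption for leaf types.

```java
import java.util.List;
import org.apache.orc.TypeDescription;

final class SchemaWalkExample {
    // Append one line per type, indented by depth; struct fields are labelled by name.
    static void describe(String label, TypeDescription type, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(label).append(type.getCategory().name()).append('\n');
        List<TypeDescription> children = type.getChildren();
        if (children == null) {
            return; // leaf type: nothing below it
        }
        boolean isStruct = type.getCategory() == TypeDescription.Category.STRUCT;
        // Field names are only meaningful for struct types.
        List<String> names = isStruct ? type.getFieldNames() : null;
        for (int i = 0; i < children.size(); i++) {
            String childLabel = isStruct ? names.get(i) + ": " : "";
            describe(childLabel, children.get(i), depth + 1, out);
        }
    }

    static String describe(TypeDescription schema) {
        StringBuilder out = new StringBuilder();
        describe("", schema, 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(
            TypeDescription.fromString("struct<id:bigint,tags:array<string>>")));
    }
}
```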
    • addChild

      public void addChild(TypeDescription child)
      Add a child to a type.
      Parameters:
      child - the child to add
    • printToBuffer

      public void printToBuffer(StringBuilder buffer)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • toJson

      public String toJson()
    • findSubtype

      public TypeDescription findSubtype(int goal)
      Locate a subtype by its id.
      Parameters:
      goal - the column id to look for
      Returns:
      the subtype
    • findSubtype

      public TypeDescription findSubtype(String columnName)
      Find a subtype of this schema by name. If the name is a simple integer, it will be used as a column number. Otherwise, this routine will recursively search for the name.
      • Struct fields are selected by name.
      • List children are selected by "_elem".
      • Map children are selected by "_key" or "_value".
      • Union children are selected by number starting at 0.
      Names are separated by '.'.
      Parameters:
      columnName - the name to search for
      Returns:
      the subtype
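      The naming rules above can be tried on a schema that mixes structs, a list, and a map. This is a sketch assuming orc-core is on the classpath. The schema and the class name FindSubtypeExample are illustrative, and the numeric lookup relies on ids being assigned in pre-order from 0 at the root.

```java
import org.apache.orc.TypeDescription;

final class FindSubtypeExample {
    static TypeDescription schema() {
        // Pre-order ids: root=0, a=1, b=2, list element=3, c=4, d=5, m=6, key=7, value=8.
        return TypeDescription.fromString(
            "struct<a:struct<b:array<struct<c:int,d:int>>>,m:map<string,double>>");
    }

    public static void main(String[] args) {
        TypeDescription root = schema();
        // Struct fields by name, list element via "_elem".
        TypeDescription c = root.findSubtype("a.b._elem.c");
        // Map children via "_key" and "_value".
        TypeDescription value = root.findSubtype("m._value");
        // A plain integer is treated as a column id.
        TypeDescription byNumber = root.findSubtype("4");

        System.out.println(c.getCategory());
        System.out.println(value.getCategory());
        System.out.println(byNumber.getId() == c.getId());
        // getFullFieldName produces the dotted path back to the type.
        System.out.println(c.getFullFieldName());
    }
}
```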
    • findSubtype

      public TypeDescription findSubtype(String columnName, boolean isSchemaEvolutionCaseAware)
    • findSubtypes

      public List<TypeDescription> findSubtypes(String columnNameList)
      Find a list of subtypes from a string, including the empty list. Each column name is separated by ','.
      Parameters:
      columnNameList - the list of column names
      Returns:
      the list of subtypes that correspond to the column names
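      A short sketch of resolving several columns at once, assuming orc-core is on the classpath. The schema and the class name FindSubtypesExample are illustrative.

```java
import java.util.List;
import org.apache.orc.TypeDescription;

final class FindSubtypesExample {
    // Resolve a comma-separated list of column names against a fixed schema.
    static List<TypeDescription> select(String columnNameList) {
        TypeDescription schema = TypeDescription.fromString(
            "struct<a:int,b:struct<c:string,d:double>>");
        return schema.findSubtypes(columnNameList);
    }

    public static void main(String[] args) {
        // Each entry may itself be a dotted path into nested types.
        for (TypeDescription type : select("a,b.d")) {
            System.out.println(type.getCategory());
        }
        // An empty string selects nothing.
        System.out.println(select("").size());
    }
}
```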
    • annotateEncryption

      public void annotateEncryption(String encryption, String masks)
      Annotate a schema with the encryption keys and masks.
      Parameters:
      encryption - the encryption keys and the fields
      masks - the encryption masks and the fields
    • getFullFieldName

      public String getFullFieldName()
      Get the full field name for the given type. For example, given "struct<a:list<struct<b:int,c:int>>>", calling this method on c returns "a._elem.c".
      Returns:
      A string that is the inverse of findSubtype