News

ORC 2.3.1 Released

release

16 Jul 2026 dongjoon

The ORC team is excited to announce the release of ORC v2.3.1.

Released: 16 July 2026
Source code: orc-2.3.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.3.1
Maven Central: ORC 2.3.1
SHA 256: 6ef34f92cd81afbd…
Fixed issues: ORC-2.3.1
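
To check a downloaded source tarball against the SHA 256 value published for a release, the digest can be recomputed locally. The following is a minimal Python sketch and not part of the ORC release tooling. The helper names `sha256_of` and `matches_published` are hypothetical, and the full digest has to be taken from the release page, since only a truncated prefix is shown in this listing.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large tarball is not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the local digest with the full published digest.

    Surrounding whitespace and letter case in `published_hex` are ignored.
    """
    return sha256_of(path) == published_hex.strip().lower()
```

A call such as `matches_published("orc-2.3.1.tar.gz", full_digest)` returns `True` only when the whole digest matches. The GPG signature listed alongside the checksum should be verified as a separate step.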

The bug fixes:

ORC-2123 [C++] Fix heap-use-after-free in ORC SearchArgument rewriteLeaves
ORC-2131 Set default of orc.stripe.size.check.ratio and orc.dictionary.max.size.bytes to 0
ORC-2151 [C++] Move the timestamp compensation before timezone conversion to prevent incorrect DST offset lookup
ORC-2160 [C++] Expose prefetch range planning via Reader::preBufferRange and refactor preBuffer to reuse it
ORC-2161 [C++] UnionColumnReader should reject out-of-range union tags
ORC-2164 [C++] Add external-buffer support in DataBuffer lifecycle management
ORC-2165 [C++] Fix bounds check for LZO stop command trailer
ORC-2167 [C++] Fix integer overflow in PostScript footer length validation
ORC-2177 Fix array conversion with empty first batch
ORC-2187 Set protobuf message size limit in C++ reader
ORC-2188 Reject invalid compression block size in PostScript
ORC-2190 [C++] Reject compressed chunk length exceeding block size in C++ reader
ORC-2191 Reject overflowing PostScript tail lengths in ReaderImpl
ORC-2192 [C++] Reject invalid string length in StringDirectColumnReader
ORC-2193 Parse encrypted FileStatistics via createCodedInputStream
ORC-2196 [C++] Validate type tree when creating reader from serialized file tail
ORC-2197 Reject negative decoded dictionary entry length in Java string dictionary reader
ORC-2198 Validate stream offset+length against stripe data boundary in StripePlanner
ORC-2199 Validate union tag against the number of children in UnionTreeReader
ORC-2200 Validate stripe offset/index/data/footer lengths in the Java reader
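
Several of the fixes above, such as ORC-2198, validate offsets and lengths read from file metadata before they are used. The sketch below illustrates the general overflow-safe form of such a range check. It is not the code from the ORC readers, and `stream_fits` and its parameters are hypothetical names. Python integers do not overflow, so the subtraction form matters mainly when the idea is carried over to a fixed-width type such as Java `long` or C++ `uint64_t`.

```python
def stream_fits(offset, length, data_length):
    """Return True if [offset, offset + length) lies inside [0, data_length]."""
    # Negative values can only come from corrupt or hostile metadata.
    if offset < 0 or length < 0 or data_length < 0:
        return False
    # The start must not lie past the end of the data region.
    if offset > data_length:
        return False
    # Compare against the remaining space instead of computing offset + length,
    # which could wrap around in a fixed-width integer type.
    return length <= data_length - offset
```

For example, `stream_fits(4, 6, 10)` accepts a stream that ends exactly at the boundary, while `stream_fits(4, 7, 10)` rejects one that runs a single byte past it.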

The tasks:

ORC-2117 Add Java 26-ea to GitHub Action build job
ORC-2121 Add ubi10 to docker tests and GitHub Actions job
ORC-2132 Use Java 26 instead of 26-ea
ORC-2133 Remove Super-Linter from GitHub Actions jobs
ORC-2134 Remove microsoft/setup-msbuild from GitHub Actions jobs
ORC-2181 Add ubuntu-26.04(-arm)? to GitHub Actions build job
ORC-2183 Use Visual Studio 18 2026 generator in Windows CIs

The build and dependency changes:

ORC-2120 Upgrade lz4-java to 0.10.4
ORC-2122 Upgrade spark.jackson.version to 2.21.1 in bench module
ORC-2124 Upgrade Maven to 3.9.13
ORC-2135 Upgrade mockito to 5.23.0
ORC-2137 Upgrade kryo-shaded to 4.0.3
ORC-2148 Upgrade lz4-java to 1.11.0
ORC-2150 Upgrade ZLIB to 1.3.2
ORC-2180 Upgrade threeten-extra to 1.9.0
ORC-2189 Add URL_HASH to all third-party dependencies in ThirdpartyToolchain.cmake
ORC-2195 Upgrade lz4-java to 1.11.1

The improvements (tools):

ORC-2149 Support merging multiple ORC files with the same schema into multiple ORC files

Documentation:

ORC-2128 Add since Javadoc tags to all public orc-core class/interface/enum

ORC 1.9.9 Released

release

08 Jul 2026 dongjoon

The ORC team is excited to announce the release of ORC v1.9.9.

Released: 8 July 2026
Source code: orc-1.9.9.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.9
Maven Central: ORC 1.9.9
SHA 256: 0fe8c341543d72b6…
Fixed issues: ORC-1.9.9

The bug fixes:

ORC-2177: Fix array conversion with empty first batch
Fix Integer Overflow in Seekable(Array|File)InputStream::Skip

The tasks:

ORC-2189 Add URL_HASH to all third-party dependencies in ThirdpartyToolchain.cmake

ORC 2.3.0 Released

release

02 Mar 2026 dongjoon

The ORC team is excited to announce the release of ORC v2.3.0.

Released: 2 March 2026
Source code: orc-2.3.0.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.3.0
Maven Central: ORC 2.3.0
SHA 256: 6c9e2f6663ac9ef3…
Fixed issues: ORC-2.3.0

New features:

ORC-2119: Support Java 25
ORC-2075: Support new Lz4Codec based on lz4-java
ORC-2083: Support XerialSnappyCodec
ORC-1986: Trigger flush stripe for large input rows
ORC-2000: [C++] Add support to prefetch small stripes
ORC-2002: [C++] Improve stripe prefetch
ORC-1969: [C++] Support async I/O prefetch of next stripe
ORC-2013: [C++] Bump CMake minimum requirement to 3.25 to leverage FetchContent

The improvements:

ORC-1994: [C++] Improve CMake by extracting OrcSanitizers.cmake
ORC-2008: [C++] Simplify CMake flags and compile options
ORC-2009: Remove unused code for CMake 3.6 and older
ORC-2014: Rename variables and configurations for periodic stripe size and dictionary size checks
ORC-2021: Fallback to UTC when /etc/localtime does not exist
ORC-2022: [C++] Add support to use dictionary for IN expression
ORC-2029: Support Float fast read by memcpy in DoubleColumnReader
ORC-2031: Document orc.dictionary.max.size.bytes and orc.stripe.size.check.ratio
ORC-2036: Optimize SortedStringDictionary performance
ORC-2038: Improve error message in TypeDescription.withPrecision()
ORC-2048: Use Java InputStream.skipNBytes instead of IOUtils.skipFully
ORC-2049: Move MAX_ARRAY_SIZE to RecordReaderUtils from IOUtils
ORC-2052: Remove unused IOUtils class
ORC-2053: Use Java Set.of instead of Collections.emptySet
ORC-2055: Use Java ArrayList constructors instead of Lists.newArrayList
ORC-2077: Introduce NullOptions class for CompressionCodec
ORC-2081: Support ORC LZ4 in bench module
ORC-2082: Support Parquet LZ4 in bench module
ORC-2085: Set strategy.max-parallel to 20 for all GitHub Action jobs
ORC-2089: Disable Maven Parallel PUT
ORC-2090: Add a new label rule for MESON build
ORC-2091: Use HTTPS instead of HTTP
ORC-2111: Ensure Annotation Processing in Java compilation for Java 23+

The bug fixes:

ORC-1921: Upgrade Hadoop to 3.4.2
ORC-1966: ZSTD compress/decompress needs to handle errors properly
ORC-1967: C++ compilation issue with VS2022
ORC-1972: Upgrade ORC Format to 1.1.1
ORC-1973: Use int64_t instead of google::protobuf::int64 for Protobuf v22+
ORC-1974: Use google::protobuf::TextFormat instead of DebugString for Protobuf v30+
ORC-1977: Add Deprecated annotations for all deprecated APIs
ORC-2007: Upgrade gson to 2.13.2
ORC-2010: Use IANA Identifier America/Los_Angeles instead of US/Pacific in Java
ORC-2011: Fix Timezone to support legacy US TimeZone identifiers
ORC-2024: Upgrade zstd-jni to 1.5.7-5
ORC-2027: Undefined behavior in DoubleColumnReader::readFloat()
ORC-2028: Fix panic caused by evictEntriesBefore deleting buffers still used in unfinished coroutines
ORC-2032: Upgrade zstd-jni to 1.5.7-6
ORC-2042: Upgrade maven version to 3.9.12
ORC-2051: Fix Meson build to use ORC Format 1.1.1
ORC-2054: Fix Meson build version string to 2.3.0-SNAPSHOT
ORC-2069: Fix convert tool failing to read CSV
ORC-2078: Fix TestConverter to respect test.tmp.dir
ORC-2087: Upgrade zstd-jni to 1.5.7-7
ORC-2103: Update CMake requirements to 3.25+ consistently
ORC-2105: Fix orc-format.wrap to use ORC Format 1.1.1

The test changes:

ORC-2112: Use Java 25 for Ubuntu 26.04 docker test
ORC-1968: Upgrade commons-cli to 1.10.0
ORC-1982: Upgrade brotli4j to 1.20.0
ORC-1992: Bump opencsv to 5.12.0
ORC-2040: Upgrade commons-cli to 1.11.0
ORC-2068: Upgrade Hadoop to 3.4.3
ORC-2084: Upgrade mockito to 5.21.0
ORC-1924: Add Windows 2025 GitHub Action job
ORC-1964: [CI] Fix CI ubsan-test with GNU
ORC-1965: Ban org.apache.commons.lang package
ORC-1970: [CI] Update cpp-linter-action to f91c446a32ae3eb9f98fef8c9ed4c7cb613a4f8a
ORC-1979: Upgrade commons-csv to 1.14.1
ORC-1980: Upgrade junit to 5.13.4
ORC-1983: Upgrade gtest to 1.17.0
ORC-1984: Add debian13 to docker tests, docs, and GitHub Action
ORC-1987: Upgrade Spark to 4.0.1 in bench module
ORC-1988: Upgrade Parquet to 1.16.0 in bench module
ORC-1989: Upgrade Hive to 4.1.0 in bench module
ORC-1990: Upgrade bcpkix-jdk18on to 1.81
ORC-1991: Upgrade snappy-java to 1.1.10.8 in bench module
ORC-1993: Upgrade spotless-maven-plugin to 2.46.1
ORC-1995: Add MacOS 26 to GitHub Action CI and docs
ORC-1996: Remove MacOS 13 from GitHub Action CI and docs
ORC-1997: Add a daily build-and-test GitHub Action Job for main branch
ORC-1998: Use Java 25 instead of 25-ea
ORC-1999: Upgrade Checkstyle to 11.0.1
ORC-2003: Upgrade guava to 33.5.0-jre
ORC-2004: Upgrade bouncycastle to 1.82
ORC-2005: Upgrade spotbugs-maven-plugin to 4.9.6
ORC-2006: Upgrade maven-shade-plugin to 3.6.1
ORC-2012: Remove US timezone workaround from Debian 13 Docker image
ORC-2015: Remove Debian 11 Support
ORC-2016: Upgrade CMake to 3.26.0 in amazonlinux:2023
ORC-2017: Upgrade checkstyle to 11.1.0
ORC-2018: Upgrade spotless-maven-plugin to 3.0.0
ORC-2019: Upgrade commons-lang3 to 3.19.0
ORC-2020: Upgrade junit to 6.0.0
ORC-2026: Upgrade maven-enforcer-plugin to 3.6.2
ORC-2034: Upgrade Checkstyle to 12.1.0
ORC-2037: Upgrade Spark to 4.1.0 and Scala to 2.13.17
ORC-2039: Upgrade junit to 6.0.1
ORC-2045: Upgrade checkstyle to 12.3.0
ORC-2050: Add MacOS 26 to meson/macos-cpp-check and use mainly in build
ORC-2056: Remove MacOS 14 from GitHub Action CI and docs
ORC-2058: Upgrade commons-lang3 to 3.20.0
ORC-2059: Upgrade spotless-maven-plugin to 3.1.0
ORC-2061: Upgrade byte-buddy to 1.18.4
ORC-2062: Upgrade objenesis to 3.5
ORC-2063: Upgrade Spark to 4.1.1
ORC-2064: Update oraclelinux9 to use dnf instead of yum
ORC-2065: Bump parquet to 1.17.0
ORC-2066: Upgrade spotbugs-maven-plugin to 4.9.8.2
ORC-2067: Upgrade junit to 6.0.2
ORC-2070: Add oraclelinux10 to docker tests and GitHub Action
ORC-2071: Upgrade spotless-maven-plugin to 3.2.1
ORC-2072: Remove OracleLinux 8 Support
ORC-2073: Fix JSONArgsRecommended warnings of Dockerfile
ORC-2074: Reduce GitHub Action concurrency
ORC-2076: Use license-check to check java directory
ORC-2079: Add lz4 codec pool test coverage
ORC-2086: Upgrade Spark to 4.2.0-preview2 and Netty to 4.2.10.Final
ORC-2088: Upgrade maven-dependency-plugin to 3.10.0
ORC-2092: Add ubuntu26 to docker tests and GitHub Action
ORC-2097: Make actions/* GitHub Actions jobs up-to-date
ORC-2098: Exclude .mvn/maven.config for apache-rat-plugin
ORC-2099: Remove Ubuntu 24.04 Support
ORC-2101: Enable GitHub Action CI in branch-2.3
ORC-2104: Update amazonlinux with 2023.10.20260202.2 and use dnf
ORC-2110: Enable Java 25 to build and verify all tests

The tasks:

ORC-1891: Upgrade to Apache parent pom 34 along with maven plugins
ORC-1951: Setting version to 2.3.0-SNAPSHOT
ORC-1975: Improve merge_orc_pr.py to accept PR numbers as a CLI argument
ORC-1978: Upgrade maven-enforcer-plugin to 3.6.1
ORC-1981: Upgrade build-helper-maven-plugin to 3.6.1
ORC-1985: Upgrade actions/checkout to v5
ORC-2001: Add method descriptions to all public Java interfaces
ORC-2023: Upgrade maven-dependency-plugin to 3.9.0
ORC-2025: Upgrade extra-enforcer-rules to 1.11.0
ORC-2043: Upgrade maven-jar-plugin to 3.5.0
ORC-2044: Upgrade maven-assembly-plugin to 3.8.0
ORC-2047: Add .vscode to .gitignore
ORC-2057: Add Pandas page at Using in Python section
ORC-2060: Upgrade bouncycastle to 1.83
ORC-2080: Add create_orc_jira.py script
ORC-2093: Remove labeler GitHub Action job
ORC-2096: Remove doc dependency from build GitHub Actions job

ORC 2.2.2 Released

release

12 Jan 2026 dongjoon

The ORC team is excited to announce the release of ORC v2.2.2.

Released: 12 January 2026
Source code: orc-2.2.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.2.2
Maven Central: ORC 2.2.2
SHA 256: 4214bfc8e0131630…
Fixed issues: ORC-2.2.2

The bug fixes:

ORC-2027: [C++] Fix undefined behavior in DoubleColumnReader::readFloat()
ORC-2032: Upgrade zstd-jni to 1.5.7-6
ORC-2040: Upgrade commons-cli to 1.11.0
ORC-2051: Fix Meson build to use ORC Format 1.1.1

The test changes:

ORC-2037: Upgrade Spark to 4.1.0 and Scala to 2.13.17
ORC-2041: Update cpp-linter-action hash to match ASF infra
ORC-2050: Add MacOS 26 to meson/macos-cpp-check and use mainly in build GitHub Action job
ORC-2058: Upgrade commons-lang3 to 3.20.0

The build and dependency changes:

ORC-2042 Upgrade maven.version to 3.9.12

The documentation changes:

ORC-2057 Add Pandas page at Using in Python section

ORC 2.1.4 Released

release

09 Jan 2026 william

The ORC team is excited to announce the release of ORC v2.1.4.

Released: 9 January 2026
Source code: orc-2.1.4.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-2.1.4
Maven Central: ORC 2.1.4
SHA 256: c3481370582dea92…
Fixed issues: ORC-2.1.4

The bug fixes:

ORC-1892: Upgrade snappy to 1.2.2
ORC-1893: Upgrade zstd to 1.5.7
ORC-1952: [C++] Fix the issue where the value of headerThirdByte exceeds the valid byte range
ORC-1973: Use int64_t instead of google::protobuf::int64 for Protobuf v22+
ORC-1974: Use google::protobuf::TextFormat instead of DebugString for Protobuf v30+
ORC-2010: Use IANA Identifier America/Los_Angeles instead of US/Pacific in Java
ORC-2027: Undefined behavior in DoubleColumnReader::readFloat()

The test changes:

ORC-1970: [CI] Update cpp-linter-action to f91c446a32ae3eb9f98fef8c9ed4c7cb613a4f8a
ORC-1996: Remove MacOS 13 from GitHub Action CI and docs

ORC 2.0.7 Released

release

08 Jan 2026 william

The ORC team is excited to announce the release of ORC v2.0.7.

Released: 8 January 2026
Source code: orc-2.0.7.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-2.0.7
Maven Central: ORC 2.0.7
SHA 256: fc331c4f46b5d65c…
Fixed issues: ORC-2.0.7

The bug fixes:

ORC-1825: Bump Snappy to 1.2.1
ORC-1827: Bump ZLIB to 1.3.1
ORC-1828: Bump LZ4 to 1.10.0
ORC-1892: Upgrade snappy to 1.2.2
ORC-1893: Upgrade zstd to 1.5.7
ORC-1973: Use int64_t instead of google::protobuf::int64 for Protobuf v22+
ORC-1974: Use google::protobuf::TextFormat instead of DebugString for Protobuf v30+
ORC-2010: Use IANA Identifier America/Los_Angeles instead of US/Pacific in Java

The test changes:

ORC-1996: Remove MacOS 13 from GitHub Action CI and docs

Tasks:

ORC-1896: Add CMAKE_POLICY_VERSION_MINIMUM=3.12 to ThirdpartyToolchain.cmake

ORC 1.9.8 Released

release

08 Jan 2026 william

The ORC team is excited to announce the release of ORC v1.9.8.

Released: 8 January 2026
Source code: orc-1.9.8.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.9.8
Maven Central: ORC 1.9.8
SHA 256: f672af2ed1aef5a2…
Fixed issues: ORC-1.9.8

The bug fixes:

ORC-2035: Upgrade protobuf-java to 3.25.5

The test changes:

ORC-1996 Remove MacOS 13 from GitHub Action CI and docs
ORC-2046 Fix macos-14-arm64 CI failure

ORC 2.2.1 Released

release

01 Oct 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.2.1.

Released: 1 October 2025
Source code: orc-2.2.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.2.1
Maven Central: ORC 2.2.1
SHA 256: f3d21c5adc097059…
Fixed issues: ORC-2.2.1

The bug fixes:

ORC-1966: [C++] Fix ZSTD compress/decompress to propagate errors
ORC-1967: [C++] Fix Windows build
ORC-1968: Upgrade commons-cli to 1.10.0
ORC-1972: Upgrade ORC Format to 1.1.1
ORC-1973: [C++] Use int64_t instead of google::protobuf::int64 for Protobuf v22+
ORC-1974: [C++] Use google::protobuf::TextFormat instead of DebugString for Protobuf v30+
ORC-1977: Add Deprecated annotations for all deprecated APIs
ORC-1979: Upgrade commons-csv to 1.14.1
ORC-2007: Upgrade gson to 2.13.2
ORC-2010: Use IANA Identifier America/Los_Angeles instead of US/Pacific in Java
ORC-2011: [C++] Fix Timezone to support legacy US TimeZone identifiers
ORC-2012: Remove US timezone workaround from Debian 13 Docker image

The test changes:

ORC-1964 [CI] Fix CI ubsan-test with GNU
ORC-1970 [CI] Fix cpp-linter-action to use hash tag
ORC-1984 Add debian13 to docker tests, docs, and GitHub Action
ORC-1995 Add MacOS 26 to GitHub Action CI and docs
ORC-1996 Remove MacOS 13 from GitHub Action CI and docs

The build and dependency changes:

ORC-1921 Upgrade Hadoop to 3.4.2
ORC-1978 Upgrade maven-enforcer-plugin to 3.6.1
ORC-1980 Upgrade junit to 5.13.4
ORC-1983 [C++] Upgrade gtest to 1.17.0
ORC-1987 Upgrade Spark to 4.0.1 in bench module
ORC-1988 Upgrade Parquet to 1.16.0 in bench module
ORC-1989 Upgrade Hive to 4.1.0 in bench module
ORC-1998 Use Java 25 instead of 25-ea
ORC-1999 Upgrade Checkstyle to 11.0.1
ORC-2003 Upgrade guava to 33.5.0-jre

ORC 2.2.0 Released

release

29 Jul 2025 william

The ORC team is excited to announce the release of ORC v2.2.0.

Released: 29 July 2025
Source code: orc-2.2.0.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-2.2.0
Maven Central: ORC 2.2.0
SHA 256: 9a9cff7efa81c419…
Fixed issues: ORC-2.2.0

The new features:

ORC-1903: Support Geometry and Geography types
ORC-1920: [C++] Support Geometry and Geography types
ORC-1884: [C++] Add the maybe() function to the SearchArgumentBuilder
ORC-1906: Support Meson build

The improvements:

ORC-1838: Bump opencsv to 5.10
ORC-1841: [C++] Add UBSAN to CI
ORC-1848: Add parameter descriptions to the PrintData tool
ORC-1858: [C++] Add an API that gets only stripe-level statistics without reading the row group index
ORC-1880: [C++] Add invalid argument check for NOT Operator in ExpressionTree
ORC-1894: Add CMAKE_POLICY_VERSION_MINIMUM=3.12 to PROTOBUF_CMAKE_ARGS
ORC-1905: Upgrade Maven to 3.9.10
ORC-1931: Suppress Hadoop logs lower than ERROR level in orc-tools
ORC-1932: Use setIfUnset for fs.defaultFS and fs.file.impl.disable.cache
ORC-1933: Change org.jetbrains:annotations dependency to the provided scope
ORC-1936: Get example and build dir for tools test from Build System instead of gtest
ORC-1937: Use the default buildtype in Meson config
ORC-1938: Update tools module to set fs.file.impl.disable.cache only for Java 22+
ORC-1946: [C++] Fix the issue discovered by UBSAN.
ORC-1950: [C++] Replace std::unordered_map with google dense_hash_map in SortedStringDictionary and remove reorder to improve write performance of dict-encoding columns
ORC-1961: Support orc.compression.zstd.strategy

The bug fixes:

ORC-1833: [C++] Fix CMake script to be used inside another project
ORC-1835: [C++] Fix cpp-linter-action to build first
ORC-1836: Upgrade zstd-jni to 1.5.6-9
ORC-1846: [C++] Fix imported libraries in the Conan build
ORC-1851: Upgrade zstd-jni to 1.5.6-10
ORC-1853: Rename class TesScanData to TestScanData
ORC-1854: Remove ubuntu20 from os-list.txt
ORC-1863: Upgrade slf4j to 2.0.17
ORC-1865: Upgrade zstd-jni to 1.5.7-2
ORC-1866: Avoid zlib decompression infinite loop
ORC-1876: Upgrade to ORC Format 1.1
ORC-1879: Fix Heap Buffer Overflow in LZO Decompression
ORC-1881: [C++] The decimal scale and precision become zero in ColumnVectorBatch when converting between decimal types.
ORC-1892: Upgrade snappy to 1.2.2
ORC-1893: Upgrade zstd to 1.5.7
ORC-1898: When column is all null, NULL_SAFE_EQUALS pushdown doesn’t get evaluated correctly
ORC-1929: Fix the Javadoc of ZstdCodec.compress
ORC-1934: Upgrade protobuf-java to 3.25.8
ORC-1939: TimestampFrom…TreeReader should set isUTC flag in TimestampColumnVector
ORC-1940: Meson configuration should add thread dependency to orc lib
ORC-1942: Fix PhysicalFsWriter to change tempOptions directly
ORC-1948: Fix GeospatialTreeWriter#writeBatch updating ColumnStatistics with incorrect values
ORC-1952: [C++] Fix the issue where the value of headerThirdByte exceeds the valid byte range
ORC-1954: Fix CI asan-test
ORC-1957: Upgrade zstd-jni to 1.5.7-4

The test changes:

ORC-1839: Upgrade spotless-maven-plugin to 2.44.1
ORC-1842: Upgrade commons-csv to 1.13.0
ORC-1844: Upgrade spotless-maven-plugin to 2.44.2
ORC-1847: Upgrade Hive to 4.0.1 in bench module
ORC-1849: Upgrade byte-buddy to 1.17.0
ORC-1850: Upgrade maven-surefire-plugin to 3.5.2
ORC-1855: Add Amazon Linux 2023 and Corretto to docker tests and CI
ORC-1856: Bump spotbugs-maven-plugin to 4.9.1.0
ORC-1857: Bump checkstyle to 10.21.2
ORC-1859: Upgrade junit to 5.12.0
ORC-1860: Upgrade spotless-maven-plugin to 2.44.3
ORC-1861: Upgrade junit to 5.12.1
ORC-1862: Upgrade spotbugs-maven-plugin to 4.9.3.0
ORC-1864: Upgrade checkstyle to 10.21.4
ORC-1867: Upgrade commons-csv to 1.14.0 in bench module
ORC-1868: Upgrade parquet to 1.15.1 in bench module
ORC-1871: Include iomanip at Test(DictionaryEncoding|ConvertColumnReader)
ORC-1872: Upgrade extra-enforcer-rules to 1.10.0
ORC-1875: Support ubuntu-24.04-arm in GitHub Action CIs
ORC-1882: Upgrade spotless-maven-plugin to 2.44.4
ORC-1883: Upgrade checkstyle to 10.23.0
ORC-1886: Upgrade junit to 5.12.2
ORC-1887: Upgrade checkstyle to 10.23.1
ORC-1889: Upgrade parquet to 1.15.2
ORC-1899: Upgrade Spark to 4.0.0 and Scala to 2.13.16
ORC-1900: Upgrade Jackson to 2.18.2 in bench module
ORC-1901: Remove threeten-extra exclusion in enforceBytecodeVersion rule
ORC-1904: Upgrade checkstyle to 10.25.0
ORC-1907: Upgrade byte-buddy to 1.17.5
ORC-1908: Add --enable-native-access=ALL-UNNAMED to Surefire argLine
ORC-1909: Remove unused test resource log4j.properties files
ORC-1910: Add -XX:+EnableDynamicAgentLoading to Surefire argLine
ORC-1911: Update CIs to use actions/checkout@v4 consistently
ORC-1913: Fix TestColumnStatistics to set testFilePath with absolute path
ORC-1915: Remove Fedora 35 Support
ORC-1916: Add Java 25-ea build CI
ORC-1917: Add TestConf interface to centralize test configurations
ORC-1918: Add Java 25-ea test coverage for shims and core modules
ORC-1923: Remove Windows 2019 GitHub Action job
ORC-1925: Add oraclelinux8 to docker tests and GitHub Action
ORC-1926: Use TestConf interface in tools module
ORC-1927: Add Java 25-ea test coverage for tools modules
ORC-1930: Improve GenerateVariants to accept ORC configs via system properties
ORC-1935: Upgrade checkstyle to 10.25.1
ORC-1941: Upgrade checkstyle to 10.26.1
ORC-1943: Add com.google.protobuf.use_unsafe_pre22_gencode to Surefire testing
ORC-1944: Upgrade spotbugs to 4.9.3
ORC-1947: Upgrade maven-enforcer-plugin to 3.6.0
ORC-1953: Upgrade commons-lang3 to 3.18.0
ORC-1955: Make commons-lang3 as a test dependency explicitly
ORC-1956: Enable GitHub Action CI in branch-2.2
ORC-1959: Add test String statistics with Presto writer

Tasks

ORC-1837: Remove commons-csv from parent pom.xml
ORC-1852: Add --enable-native-access=ALL-UNNAMED to suppress Maven warnings
ORC-1877: Upgrade gson to 2.13.0
ORC-1902: Use super-linter for README.md files
ORC-1914: Ensure Annotation Processing in core module compilation
ORC-1919: Update .asf.yaml with new README.md link
ORC-1928: Upgrade junit to 5.13.1
ORC-1945: Update Python documentation with PyArrow 20.0.0 and Dask 2025.5.1
ORC-1958: Upgrade Maven to 3.9.11
ORC-1962: Fix publish_snapshot.yml in branch-2.2 to publish

ORC 2.1.3 Released

release

09 Jul 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.1.3.

Released: 9 July 2025
Source code: orc-2.1.3.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.1.3
Maven Central: ORC 2.1.3
SHA 256: 75f3a876eb520ec8…
Fixed issues: ORC-2.1.3

The bug fixes:

ORC-1898: When column is all null, NULL_SAFE_EQUALS pushdown doesn’t get evaluated correctly
ORC-1929: Fix the Javadoc of ZstdCodec.compress
ORC-1942: Fix PhysicalFsWriter to change tempOptions directly

The improvement changes:

ORC-1931: Suppress Hadoop logs lower than ERROR level in orc-tools

The test changes:

ORC-1899 Upgrade Spark to 4.0.0 and Scala to 2.13.16
ORC-1900 Upgrade Jackson to 2.18.2 in bench module
ORC-1907 Upgrade byte-buddy to 1.17.5
ORC-1908 Add --enable-native-access=ALL-UNNAMED to Surefire argLine
ORC-1909 Remove unused test resource log4j.properties files
ORC-1910 Add -XX:+EnableDynamicAgentLoading to Surefire argLine
ORC-1911 Update CIs to use actions/checkout@v4 consistently
ORC-1915 Remove Fedora 35 Support
ORC-1917 Add TestConf interface to centralize test configurations
ORC-1923 Remove Windows 2019 GitHub Action job
ORC-1943 Add com.google.protobuf.use_unsafe_pre22_gencode to Surefire testing
ORC-1944 Upgrade spotbugs to 4.9.3
ORC-1945 Update Python documentation with PyArrow 20.0.0 and Dask 2025.5.1

The build and dependency changes:

ORC-1896 Add CMAKE_POLICY_VERSION_MINIMUM=3.12 to ThirdpartyToolchain.cmake
ORC-1901 Remove threeten-extra exclusion in enforceBytecodeVersion rule
ORC-1914 Ensure Annotation Processing in core module compilation
ORC-1934 Upgrade protobuf-java to 3.25.8

ORC 2.0.6 Released

release

07 Jul 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.0.6.

Released: 7 July 2025
Source code: orc-2.0.6.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.6
Maven Central: ORC 2.0.6
SHA 256: 81167d31d7ec51de…
Fixed issues: ORC-2.0.6

The bug fixes:

ORC-1898: When column is all null, NULL_SAFE_EQUALS pushdown doesn’t get evaluated correctly
ORC-1929: Fix the Javadoc of ZstdCodec.compress
ORC-1942: Fix PhysicalFsWriter to change tempOptions directly

The test changes:

ORC-1728 Bump maven-shade-plugin to 3.6.0
ORC-1872 Upgrade extra-enforcer-rules to 1.10.0
ORC-1889 Upgrade parquet to 1.15.2
ORC-1899 Upgrade Spark to 4.0.0 and Scala to 2.13.16
ORC-1900 Upgrade Jackson to 2.18.2 in bench module
ORC-1901 Remove threeten-extra exclusion in enforceBytecodeVersion rule
ORC-1909 Remove unused test resource log4j.properties files
ORC-1923 Remove Windows 2019 GitHub Action job

ORC 1.9.7 Released

release

04 Jul 2025 dongjoon

The ORC team is excited to announce the release of ORC v1.9.7.

Released: 4 July 2025
Source code: orc-1.9.7.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.7
Maven Central: ORC 1.9.7
SHA 256: 3b3b18f472f8edf3…
Fixed issues: ORC-1.9.7

The bug fixes:

ORC-1898: When column is all null, NULL_SAFE_EQUALS pushdown doesn’t get evaluated correctly

The test changes:

ORC-1909 Remove unused test resource log4j.properties files
ORC-1923 Remove Windows 2019 GitHub Action job

ORC 1.8.10 Released

release

26 Jun 2025 dongjoon

The ORC team is excited to announce the release of ORC v1.8.10.

Released: 26 June 2025
Source code: orc-1.8.10.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.10
Maven Central: ORC 1.8.10
SHA 256: c204243c55d34d83…
Fixed issues: ORC-1.8.10

The bug fixes:

ORC-1898: When column is all null, NULL_SAFE_EQUALS pushdown doesn’t get evaluated correctly

The test changes:

ORC-1909 Remove unused test resource log4j.properties files
ORC-1923 Remove Windows 2019 GitHub Action job

ORC 1.9.6 Released

release

06 May 2025 wgtmac

The ORC team is excited to announce the release of ORC v1.9.6.

Released: 6 May 2025
Source code: orc-1.9.6.tar.gz
GPG Signature signed by Gang Wu (8A461DF4)
Git tag: rel/release-1.9.6
Maven Central: ORC 1.9.6
SHA 256: 4442944f53b6b4d4…
Fixed issues: ORC-1.9.6

The bug fixes:

ORC-1866 Avoid zlib decompression infinite loop
ORC-1879 Fix Heap Buffer Overflow in LZO Decompression
ORC-1885 Update all ubuntu-20.04 to ubuntu-22.04 in CI

The test changes:

ORC-1745 Remove Ubuntu 20.04 Support
ORC-1776 Remove MacOS 12 from GitHub Action CI and docs
ORC-1818 Upgrade Spark to 3.5.4 in bench module
ORC-1869 Upgrade Spark to 3.5.5 in bench module for Apache ORC 1.9.x

The tasks:

ORC-1709 Upgrade GitHub Action setup-java to v4 and use built-in cache feature

ORC 1.8.9 Released

release

06 May 2025 wgtmac

The ORC team is excited to announce the release of ORC v1.8.9.

Released: 6 May 2025
Source code: orc-1.8.9.tar.gz
GPG Signature signed by Gang Wu (8A461DF4)
Git tag: rel/release-1.8.9
Maven Central: ORC 1.8.9
SHA 256: 66343dc6832beda9…
Fixed issues: ORC-1.8.9

The bug fixes:

ORC-1866 Avoid zlib decompression infinite loop
ORC-1879 Fix Heap Buffer Overflow in LZO Decompression

The test changes:

ORC-1745 Remove Ubuntu 20.04 Support
ORC-1776 Remove MacOS 12 from GitHub Action CI and docs
ORC-1870 Remove Java 18 test pipeline from branch-1.8

The tasks:

ORC-1411 Remove Ubuntu18.04 from docker-based tests
ORC-1709 Upgrade GitHub Action setup-java to v4 and use built-in cache feature

ORC 2.1.2 Released

release

06 May 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.1.2.

Released: 6 May 2025
Source code: orc-2.1.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.1.2
Maven Central: ORC 2.1.2
SHA 256: 55451e65dea6ed42…
Fixed issues: ORC-2.1.2

The bug fixes:

ORC-1866 Avoid zlib decompression infinite loop
ORC-1879 [C++] Fix Heap Buffer Overflow in LZO Decompression
ORC-1881 [C++] Populate dstBatch’s scale and precision in DecimalConvertColumnReader

The test changes:

ORC-1871 [C++] Include iomanip at TestDictionaryEncoding and TestConvertColumnReader
ORC-1872 Upgrade extra-enforcer-rules to 1.10.0
ORC-1875 Support ubuntu-24.04-arm in GitHub Action CIs

The build and dependency changes:

ORC-1876 Upgrade to ORC Format 1.1

ORC 2.0.5 Released

release

06 May 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.0.5.

Released: 6 May 2025
Source code: orc-2.0.5.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.5
Maven Central: ORC 2.0.5
SHA 256: 35dc3ad801f632f4…
Fixed issues: ORC-2.0.5

The bug fixes:

ORC-1866 Avoid zlib decompression infinite loop
ORC-1879 [C++] Fix Heap Buffer Overflow in LZO Decompression
ORC-1881 [C++] Populate dstBatch’s scale and precision in DecimalConvertColumnReader

The test changes:

ORC-1745 Remove Ubuntu 20.04 Support
ORC-1822 [C++][CI] Use cpp-linter-action for clang-tidy and clang-format
ORC-1835 [C++] Fix cpp-linter-action to build first
ORC-1871 [C++] Include iomanip at TestDictionaryEncoding and TestConvertColumnReader

The Apache ORC Project Management Committee (PMC) is happy to announce that Shaoyun Chen has joined us as a new member of the PMC. Shaoyun has shown consistent contributions as a committer and has participated in both major and maintenance releases by actively helping the release managers test the release candidates.

Please join me in welcoming Shaoyun to the ORC PMC!

ORC 2.0.4 Released

release

20 Mar 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.0.4.

Released: 20 March 2025
Source code: orc-2.0.4.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.4
Maven Central: ORC 2.0.4
SHA 256: 9525a76fae64a6da…
Fixed issues: ORC-2.0.4

The improvements (tools):

ORC-1848 Add parameter descriptions to the PrintData tool

The bug fixes:

ORC-1813 [C++] Fix has_null forward compatibility

The test changes:

ORC-1853 Rename class TesScanData to TestScanData
ORC-1855 Add Amazon Linux 2023 and Corretto to docker tests and CI

The build and dependency changes:

ORC-1709 Upgrade GitHub Action setup-java to v4 and use built-in cache feature
ORC-1804 Upgrade parquet to 1.14.4 in bench module
ORC-1810 [C++] Add environment variable ORC_FORMAT_URL
ORC-1811 Use the recommended closer.lua URL to download ORC format
ORC-1812 Upgrade parquet to 1.15.0 in bench module
ORC-1814 Use Ubuntu 24.04/Jekyll 4.3/Rouge 4.5 to generate website
ORC-1837 Remove commons-csv from parent pom.xml
ORC-1847 Upgrade Hive to 4.0.1 in bench module
ORC-1851 Upgrade zstd-jni to 1.5.6-10

The tasks:

ORC-1815 Remove broken people.apache.org links

ORC 2.1.1 Released

release

06 Mar 2025 dongjoon

The ORC team is excited to announce the release of ORC v2.1.1.

Released: 6 March 2025
Source code: orc-2.1.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.1.1
Maven Central: ORC 2.1.1
SHA 256: 15af8baeee322bab…
Fixed issues: ORC-2.1.1

The improvements (tools):

ORC-1848 Add parameter descriptions to the PrintData tool

The bug fixes:

ORC-1833 [C++] Fix CMake script to be used inside another project
ORC-1834 [C++] Fix undefined behavior
ORC-1846 [C++] Fix imported libraries in the Conan build

The test changes:

ORC-1835 [C++] Fix cpp-linter-action to build first
ORC-1853 Rename class TesScanData to TestScanData
ORC-1854 Remove ubuntu20 from os-list.txt
ORC-1855 Add Amazon Linux 2023 and Corretto to docker tests and CI

The build and dependency changes:

ORC-1836 Upgrade zstd-jni to 1.5.6-9
ORC-1837 Remove commons-csv from parent pom.xml
ORC-1843 Upgrade bcpkix-jdk18on to 1.80
ORC-1847 Upgrade Hive to 4.0.1 in bench module
ORC-1849 Upgrade byte-buddy to 1.17.0
ORC-1850 Upgrade maven-surefire-plugin to 3.5.2
ORC-1851 Upgrade zstd-jni to 1.5.6-10
ORC-1852 Add --enable-native-access=ALL-UNNAMED to suppress Maven warnings
ORC-1856 Bump spotbugs-maven-plugin to 4.9.1.0
ORC-1859 Upgrade junit to 5.12.0

The tasks:

ORC-1840 Add Matomo script to support https://analytics.apache.org

ORC 2.1.0 Released

release

09 Jan 2025 william

The ORC team is excited to announce the release of ORC v2.1.0.

Released: 9 January 2025
Source code: orc-2.1.0.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-2.1.0
Maven Central: ORC 2.1.0
SHA 256: 1ffac0228aa83f04…
Fixed issues: ORC-2.1.0

New Feature

ORC-262 [C++] Support async prefetch in Orc reader
ORC-1388 [C++] Support schema evolution from decimal to timestamp/string group
ORC-1389 [C++] Support schema evolution from string group to numeric/string group
ORC-1390 [C++] Support schema evolution from string group to decimal/timestamp
ORC-1622 [C++] Support conan packaging
ORC-1807 [C++] Native support for vcpkg

Improvement

ORC-1264 [C++] Add a writer option to align compression block with row group boundary
ORC-1365 [C++] Use BlockBuffer to replace DataBuffer of rawInputBuffer in the CompressionStream
ORC-1635 Try downloading orc-format from dlcdn.apache.org before archive.apache.org
ORC-1645 Evaluate stripe stats before loading stripe footer
ORC-1658 [C++] uniform identifiers naming style.
ORC-1661 [C++] Better handling when TZDB is unavailable
ORC-1664 Enable the removeUnusedImports function in spotless-maven-plugin
ORC-1665 Enable the importOrder function in spotless-maven-plugin
ORC-1667 Add check tool to check the index of the specified column
ORC-1669 [C++] Deprecate HDFS support
ORC-1672 Modify the package name of TestCheckTool
ORC-1675 [C++] Print decimal values as strings
ORC-1677 [C++] remove m prefix of variables.
ORC-1683 Fix instanceof of BinaryStatisticsImpl merge method
ORC-1684 [C++] Find tzdb without TZDIR when in conda-environments
ORC-1685 Use Pattern Matching for instanceof in RecordReaderImpl
ORC-1686 [C++] Avoid using std::filesystem
ORC-1687 [C++] Enforce naming style.
ORC-1688 [C++] Do not access TZDB if there is no timestamp type
ORC-1689 [C++] Generate CMake config file
ORC-1690 [C++] Refactor CMake to use imported third-party libraries
ORC-1710 Reduce enum array allocation
ORC-1711 [C++] Introduce a memory block size parameter for writer option
ORC-1720 [C++] Unified compressor/decompressor exception types
ORC-1724 JsonFileDump utility should print user metadata
ORC-1730 [C++] Add finishEncode support for the encoder
ORC-1732 [C++] Can’t detect Protobuf installed by Homebrew on macOS
ORC-1733 [C++] [CMake] Fix CMAKE_MODULE_PATH not to use PROJECT_SOURCE_DIR
ORC-1751 [C++] Syntax error in ThirdpartyToolchain
ORC-1767 [C++] Improve writing performance of encoded string column and support EncodedStringVectorBatch for StringColumnWriter
ORC-1796 [C++] Reading ORC files that lack statistics may give wrong results
ORC-1810 Offline build support

Bug Fix

ORC-1654 [C++] Count up EvaluatedRowGroupCount correctly.
ORC-1657 Fix building apache orc with clang-cl on Windows
ORC-1706 [C++] Fix build break w/ BUILD_CPP_ENABLE_METRICS=ON
ORC-1725 [C++] Statistics for BYTE type are calculated incorrectly on ARM
ORC-1738 Wrong Int128 maximum value
ORC-1811 Use the recommended closer.lua URL to download ORC format
ORC-1813 Incompatibility with ORC files written in version 0.12 due to missing hasNull field in C++ Reader

Task

ORC-1573 Setting version to 2.1.0-SNAPSHOT
ORC-1594 Add IntelliJ conf in the project root directory to support JIRA/PR autolinks
ORC-1649 [C++] [Conan] Add 2.0.0 to conan recipe and update release guide
ORC-1655 Add label definition to conan directory
ORC-1656 Skip build and test on conan updates
ORC-1666 Remove extra newlines at the end of Java files
ORC-1758 Use OpenContainers Annotations in docker images
ORC-1802 Enable tag protection

Test

ORC-1589 Bump spotbugs-maven-plugin to 4.8.3.0
ORC-1590 Bump spotless-maven-plugin to 2.42.0
ORC-1603 Bump checkstyle to 10.13.0
ORC-1606 Upgrade spotless-maven-plugin to 2.43.0
ORC-1611 Bump junit to 5.10.2
ORC-1651 Bump checkstyle to 10.14.0
ORC-1652 Bump extra-enforcer-rules to 1.8.0
ORC-1653 Bump maven-assembly-plugin to 3.7.0
ORC-1659 Bump guava to 33.1.0-jre
ORC-1660 Bump checkstyle to 10.14.2
ORC-1673 Remove test packages o.a.o.tools.[count|merge|sizes]
ORC-1676 Use Hive 4.0.0 in benchmark
ORC-1678 Bump checkstyle to 10.15.0
ORC-1680 Bump bcpkix-jdk18on to 1.78
ORC-1691 Bump spotbugs-maven-plugin to 4.8.4.0
ORC-1694 Upgrade gson to 2.9.0 for Benchmarks Hive
ORC-1695 Upgrade gson to 2.10.1
ORC-1699 Fix SparkBenchmark in Parquet format according to SPARK-40918
ORC-1700 Write parquet decimal type data in Benchmark using FIXED_LEN_BYTE_ARRAY type
ORC-1704 Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark
ORC-1707 Fix sun.util.calendar IllegalAccessException when SparkBenchmark runs on JDK17
ORC-1708 Support data/compress options in Hive benchmark
ORC-1709 Upgrade GitHub Action setup-java to v4 and use built-in cache feature
ORC-1713 Bump spotbugs-maven-plugin to 4.8.5.0
ORC-1716 Bump com.puppycrawl.tools:checkstyle to 10.16.0
ORC-1719 Bump guava to 33.2.0-jre
ORC-1722 Bump checkstyle to 10.17.0
ORC-1726 Bump guava to 33.2.1-jre
ORC-1727 Bump maven-enforcer-plugin to 3.5.0
ORC-1728 Bump maven-shade-plugin to 3.6.0
ORC-1729 Bump maven-checkstyle-plugin to 3.4.0
ORC-1731 Upgrade maven-dependency-plugin to 3.7.0
ORC-1735 Upgrade maven-dependency-plugin to 3.7.1
ORC-1736 Bump junit to 5.10.3
ORC-1737 Bump spotbugs-maven-plugin to 4.8.6.1
ORC-1739 Bump spotbugs-maven-plugin to 4.8.6.2
ORC-1745 Remove Ubuntu 20.04 Support
ORC-1750 Bump protobuf-java to 3.25.4
ORC-1756 Bump snappy-java to 1.1.10.6 in bench module
ORC-1760 Upgrade junit to 5.11.0
ORC-1761 Upgrade guava to 33.3.0-jre
ORC-1763 Upgrade checkstyle to 10.18.0
ORC-1764 Upgrade maven-checkstyle-plugin to 3.5.0
ORC-1765 Upgrade maven-dependency-plugin to 3.8.0
ORC-1771 Upgrade checkstyle to 10.18.1
ORC-1772 Bump spotbugs-maven-plugin to 4.8.6.3
ORC-1774 Upgrade snappy-java to 1.1.10.7 in bench module
ORC-1776 Remove MacOS 12 from GitHub Action CI and docs
ORC-1778 Upgrade Spark to 4.0.0-preview2
ORC-1779 Upgrade extra-enforcer-rules to 1.9.0
ORC-1780 Upgrade spotbugs-maven-plugin to 4.8.6.4
ORC-1783 Add MacOS 15 to GitHub Action MacOS CI and docs
ORC-1786 Upgrade guava to 33.3.1-jre
ORC-1788 Upgrade checkstyle to 10.18.2
ORC-1789 Upgrade junit to 5.11.2
ORC-1790 Upgrade parquet to 1.14.3 in bench module
ORC-1794 Upgrade checkstyle to 10.19.0
ORC-1795 Upgrade junit to 5.11.3
ORC-1797 Upgrade spotbugs-maven-plugin to 4.8.6.5
ORC-1799 Upgrade maven-checkstyle-plugin to 3.6.0
ORC-1801 Upgrade checkstyle to 10.20.0
ORC-1804 Upgrade parquet to 1.14.4 in bench module
ORC-1805 Upgrade checkstyle to 10.20.1
ORC-1806 Upgrade spotbugs-maven-plugin to 4.8.6.6
ORC-1809 Upgrade checkstyle to 10.20.2
ORC-1812 Upgrade parquet to 1.15.0 in bench module
ORC-1816 Upgrade checkstyle to 10.21.0
ORC-1820 Bump junit.version to 5.11.4
ORC-1821 Upgrade guava to 33.4.0-jre
ORC-1822 [C++] [CI] Use cpp-linter-action for clang-tidy and clang-format
ORC-1823 Upgrade checkstyle to 10.21.1
ORC-1826 [C++] Add ASAN to CI

Build and Dependency Changes

ORC-1608 Upgrade Hadoop to 3.4.0
ORC-1617 Upgrade slf4j to 2.0.12
ORC-1640 Upgrade cyclonedx-maven-plugin to 2.7.11
ORC-1650 Bump maven-shade-plugin to 3.5.2
ORC-1670 Upgrade zstd-jni to 1.5.6-1
ORC-1679 Bump zstd-jni to 1.5.6-2
ORC-1682 Bump maven-assembly-plugin to 3.7.1
ORC-1692 Bump slf4j to 2.0.13
ORC-1693 Bump maven-jar-plugin to 3.4.0
ORC-1698 Upgrade commons-cli to 1.7.0
ORC-1701 Bump threeten-extra to 1.8.0
ORC-1702 Bump bcpkix-jdk18on to 1.78.1
ORC-1703 Bump maven-jar-plugin to 3.4.1
ORC-1705 Upgrade zstd-jni to 1.5.6-3
ORC-1712 Bump maven-shade-plugin to 3.5.3
ORC-1714 Bump commons-csv to 1.11.0
ORC-1715 Bump org.objenesis:objenesis to 3.3
ORC-1718 Upgrade build-helper-maven-plugin to 3.6.0
ORC-1723 Upgrade commons-cli to 1.8.0
ORC-1734 Bump maven-jar-plugin to 3.4.2
ORC-1748 Upgrade commons-lang3 to 3.15.0
ORC-1755 Bump commons-lang3 to 3.16.0
ORC-1757 Bump slf4j to 2.0.14
ORC-1759 Upgrade commons-cli to 1.9.0
ORC-1762 Bump slf4j to 2.0.16
ORC-1766 Upgrade brotli4j to 1.17.0
ORC-1768 Upgrade commons-lang3 to 3.17.0
ORC-1773 Bump reproducible-build-maven-plugin to 0.17
ORC-1775 Upgrade aircompressor to 2.0.2
ORC-1777 Upgrade protobuf-java to 3.25.5
ORC-1781 Upgrade zstd-jni to 1.5.6-6
ORC-1782 Upgrade Hadoop to 3.4.1
ORC-1784 Upgrade Maven to 3.9.9
ORC-1785 Upgrade commons-csv to 1.12.0
ORC-1791 Remove commons-lang3 dependency
ORC-1798 Upgrade maven-dependency-plugin to 3.8.1
ORC-1803 Upgrade zstd-jni to 1.5.6-7
ORC-1808 Upgrade zstd-jni to 1.5.6-8
ORC-1817 Upgrade brotli4j to 1.18.0
ORC-1825 [C++] Bump Snappy to 1.2.1
ORC-1827 [C++] Bump ZLIB to 1.3.1
ORC-1828 [C++] Bump LZ4 to 1.10.0

Documentation

ORC-642 Update PatchedBase doc with patch ceiling in spec
ORC-1634 Fix some outdated descriptions in Building ORC documentation
ORC-1668 Add merge command to Java tools documentation
ORC-1800 Upgrade bcpkix-jdk18on to 1.79
ORC-1814 Use Ubuntu 24.04/Jekyll 4.3/Rouge 4.5 to generate website
ORC-1815 Remove broken people.apache.org links
ORC-1819 Publish snapshot website through GitHub Pages
ORC-1824 Update Python documentation with PyArrow 18.1.0 and Dask 2024.12.1
ORC-1830 Fix release table hyperlink to use baseurl

ORC 2.0.3 Released

release

14 Nov 2024 dongjoon

The ORC team is excited to announce the release of ORC v2.0.3.

Released: 14 November 2024
Source code: orc-2.0.3.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.3
Maven Central: ORC 2.0.3
SHA 256: 082cba862b5a8a0d…
Fixed issues: ORC-2.0.3

The bug fixes:

ORC-1796 [C++] Fix wrong results returned when has_null is missing

The test changes:

ORC-1680 Bump bcpkix-jdk18on to 1.78
ORC-1702 Bump bcpkix-jdk18on to 1.78.1
ORC-1756 Bump snappy-java to 1.1.10.6 in bench module
ORC-1770 Upgrade parquet to 1.14.2 in bench module
ORC-1776 Remove MacOS 12 from GitHub Action CI and docs
ORC-1778 Upgrade Spark to 4.0.0-preview2 in bench module
ORC-1783 Add MacOS 15 to GitHub Action MacOS CI and docs
ORC-1790 Upgrade parquet to 1.14.3 in bench module
ORC-1800 Upgrade bcpkix-jdk18on to 1.79

The build and dependency changes:

ORC-1608 Upgrade Hadoop to 3.4.0
ORC-1750 Bump protobuf-java to 3.25.4
ORC-1769 Upgrade zstd-jni to 1.5.6-5
ORC-1775 Upgrade aircompressor to 2.0.2
ORC-1777 Bump protobuf-java to 3.25.5
ORC-1781 Upgrade zstd-jni to 1.5.6-6
ORC-1782 Upgrade Hadoop to 3.4.1
ORC-1784 Upgrade Maven to 3.9.9
ORC-1785 Upgrade commons-csv to 1.12.0
ORC-1791 Remove commons-lang3 dependency

ORC 1.9.5 Released

release

14 Nov 2024 dongjoon

The ORC team is excited to announce the release of ORC v1.9.5.

Released: 14 November 2024
Source code: orc-1.9.5.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.5
Maven Central: ORC 1.9.5
SHA 256: 6900b4e8a2e4e492…
Fixed issues: ORC-1.9.5

The bug fixes:

ORC-1741 Respect decimal reader isRepeating flag

The test changes:

ORC-1792 Upgrade Spark to 3.5.3

ORC 1.8.8 Released

release

11 Nov 2024 wgtmac

The ORC team is excited to announce the release of ORC v1.8.8.

Released: 11 November 2024
Source code: orc-1.8.8.tar.gz
GPG Signature signed by Gang Wu (8A461DF4)
Git tag: rel/release-1.8.8
Maven Central: ORC 1.8.8
SHA 256: eca12a9139c0889d…
Fixed issues: ORC-1.8.8

The bug fixes:

ORC-1696: Fix ClassCastException when reading avro decimal type in benchmark
ORC-1738: [C++] Wrong Int128 maximum value

The test changes:

ORC-1793 Upgrade Spark to 3.4.4

The tasks:

ORC-1540 Remove MacOS 11 from GitHub Action CI

ORC 1.7.11 Released

release

13 Sep 2024 dongjoon

The ORC team is excited to announce the release of ORC v1.7.11.

Released: 13 September 2024
Source code: orc-1.7.11.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.11
Maven Central: ORC 1.7.11
SHA 256: ff62f0b882470529…
Fixed issues: ORC-1.7.11

The bug fixes:

ORC-1602 [C++] limit compression block size
ORC-1738 [C++] Fix wrong Int128 maximum value

The test changes:

ORC-1540 Remove MacOS 11 from GitHub Action CI and docs
ORC-1556 Add Rocky Linux 9 Docker Test
ORC-1557 Add GitHub Action CI for Docker Test
ORC-1561 Remove Java11 and clang variants from docker/os-list.txt in branch-1.7
ORC-1578 Fix SparkBenchmark on sales data according to SPARK-40918
ORC-1696 Fix ClassCastException when reading avro decimal type in benchmark

ORC 2.0.2 Released

release

15 Aug 2024 dongjoon

The ORC team is excited to announce the release of ORC v2.0.2.

Released: 15 August 2024
Source code: orc-2.0.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.2
Maven Central: ORC 2.0.2
SHA 256: fabdee3e8acd64da…
Fixed issues: ORC-2.0.2

The improvements (tools):

ORC-1724 JsonFileDump utility should print user metadata
ORC-1740 Avoid the dump tool repeatedly parsing ColumnStatistics
ORC-1742 Support print the id, name and type of each column in dump tool

The bug fixes:

ORC-1732 [C++] Fix detecting Homebrew-installed Protobuf on MacOS
ORC-1733 [C++][CMake] Fix CMAKE_MODULE_PATH not to use PROJECT_SOURCE_DIR
ORC-1738 [C++] Fix wrong Int128 maximum value
ORC-1741 Respect decimal reader isRepeating flag
ORC-1749 Fix supportVectoredIO for hadoop version string with optional patch labels
ORC-1751 [C++] Fix syntax error in ThirdpartyToolchain

The test changes:

ORC-1694 Upgrade gson to 2.9.0 for Benchmarks Hive
ORC-1697 Fix IllegalArgumentException when reading json timestamp type in benchmark
ORC-1700 Write parquet decimal type data in Benchmark using FIXED_LEN_BYTE_ARRAY type
ORC-1743 Upgrade Spark to 4.0.0-preview1
ORC-1744 Add ubuntu-24.04 to GitHub Action
ORC-1746 Bump netty-all to 4.1.110.Final in bench module
ORC-1752 Fix NumberFormatException when reading json timestamp type in benchmark
ORC-1753 Use Avro 1.12.0 in bench module

The build and dependency changes:

ORC-1721 Upgrade aircompressor to 0.27
ORC-1747 Upgrade zstd-jni to 1.5.6-4

ORC 1.9.4 Released

release

16 Jul 2024 william

The ORC team is excited to announce the release of ORC v1.9.4.

Released: 16 July 2024
Source code: orc-1.9.4.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.9.4
Maven Central: ORC 1.9.4
SHA 256: d9a6bcc00e07a6e5…
Fixed issues: ORC-1.9.4

The bug fixes:

ORC-1696 Fix ClassCastException when reading avro decimal type in benchmark
ORC-1721 Upgrade aircompressor to 0.27
ORC-1738 Wrong Int128 maximum value

The test changes:

ORC-1619 Add MacOS 14 to GitHub Action
ORC-1699 Fix SparkBenchmark in Parquet format according to SPARK-40918

The task changes:

ORC-1540 Remove MacOS 11 from GitHub Action CI

ORC 2.0.1 Released

release

14 May 2024 william

The ORC team is excited to announce the release of ORC v2.0.1.

Released: 14 May 2024
Source code: orc-2.0.1.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-2.0.1
Maven Central: ORC 2.0.1
SHA 256: 1ffac0228aa83f04…
Fixed issues: ORC-2.0.1

The improvements (tools):

ORC-1644 Add merge tool to merge multiple ORC files into a single ORC file
ORC-1647 Tips for supporting ORC in the convert command
ORC-1667 Add check tool to check the index of the specified column

The bug fixes:

ORC-1646 Close the reader when reading the schema with the convert command
ORC-1654 [C++] Count up EvaluatedRowGroupCount correctly
ORC-1684 [C++] Find tzdb without TZDIR when in conda-environments
ORC-1688 [C++] Do not access TZDB if there is no timestamp type
ORC-1696 Fix ClassCastException when reading avro decimal type in benchmark

The tasks:

ORC-1649 [C++][Conan] Add 2.0.0 to conan recipe and update release guide
ORC-1669 [C++] Deprecate HDFS support
ORC-1686 [C++] Avoid using std::filesystem

The test changes:

ORC-1648 Add test to convert ORC in the convert command
ORC-1663 [C++] Enable TestTimezone.testMissingTZDB on Windows
ORC-1672 Remove test packages o.a.o.tools.check
ORC-1673 Remove test packages o.a.o.tools.[count|merge|sizes]
ORC-1676 Use Hive 4.0.0 in benchmark
ORC-1681 Remove redundant import statement in tests to fix checkstyle failures
ORC-1699 Fix SparkBenchmark in Parquet format according to SPARK-40918
ORC-1704 Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark
ORC-1707 Fix sun.util.calendar IllegalAccessException when SparkBenchmark runs on JDK17
ORC-1708 Support data/compress options in Hive benchmark

The build and dependency changes:

ORC-1670 Upgrade zstd-jni to 1.5.6-1
ORC-1679 Bump zstd-jni to 1.5.6-2
ORC-1695 Upgrade gson to 2.10.1
ORC-1698 Upgrade commons-cli to 1.7.0
ORC-1705 Upgrade zstd-jni to 1.5.6-3
ORC-1714 Bump commons-csv to 1.11.0
ORC-1715 Bump org.objenesis:objenesis to 3.3

The documentation changes:

ORC-1668 Add merge command to Java tools documentation

Shaoyun Chen and Yuanping Wu added as committers

team

13 May 2024 gangwu

The ORC PMC is happy to add Shaoyun Chen and Yuanping Wu as committers for their work on the ORC Java and C++ libraries.

Thank you for your work on ORC, Shaoyun and Yuanping!

ORC 1.8.7 Released

release

14 Apr 2024 dongjoon

The ORC team is excited to announce the release of ORC v1.8.7.

Released: 14 April 2024
Source code: orc-1.8.7.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.7
Maven Central: ORC 1.8.7
SHA 256: 57c9d12bf74b2752…
Fixed issues: ORC-1.8.7

The bug fixes:

ORC-1528: Fix readBytes potential overflow in RecordReaderUtils.ChunkReader#create
ORC-1602: [C++] limit compression block size

The test changes:

ORC-1556 Add Rocky Linux 9 Docker Test
ORC-1557 Add GitHub Action CI for Docker Test
ORC-1560 Remove Java11 and clang variants from docker/os-list.txt in branch-1.8
ORC-1562 Bump guava to 33.0.0-jre
ORC-1578 Fix SparkBenchmark on sales data according to SPARK-40918
ORC-1621 Switch to oraclelinux9 from rocky9

The documentation changes:

ORC-1536 Remove hive-storage-api link from maven-javadoc-plugin
ORC-1563 Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs

ORC 1.9.3 Released

release

20 Mar 2024 gangwu

The ORC team is excited to announce the release of ORC v1.9.3.

Released: 20 March 2024
Source code: orc-1.9.3.tar.gz
GPG Signature signed by Gang Wu (578F619B)
Git tag: rel/release-1.9.3
Maven Central: ORC 1.9.3
SHA 256: f737d005d0c4deb6…
Fixed issues: ORC-1.9.3

The bug fixes:

ORC-634 Fix the json output for double NaN and infinite
ORC-1553 Reading information from Row group, where there are 0 records of SArg column
ORC-1563 Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
ORC-1578 Fix SparkBenchmark according to SPARK-40918
ORC-1586 Fix IllegalAccessError when SparkBenchmark runs on JDK17
ORC-1602 [C++] limit compression block size
ORC-1607 Fix testDoubleNaNAndInfinite to use TestFileDump.checkOutput
ORC-1609 Fix the compilation problem of TestJsonFileDump in branch 1.9

The test changes:

ORC-1556 Add Rocky Linux 9 Docker Test
ORC-1557 Add GitHub Action CI for Docker Test
ORC-1559 Remove Java11 and clang variants from docker/os-list.txt in branch-1.9

The tasks:

ORC-1532 Upgrade opencsv to 5.9
ORC-1536 Remove hive-storage-api link from maven-javadoc-plugin
ORC-1576 Upgrade spark.jackson.version to 2.15.2 in bench module
ORC-1591 Lower log level from INFO to DEBUG in *ReaderImpl/WriterImpl/PhysicalFsWriter
ORC-1592 Suppress KeyProvider missing log
ORC-1616 Upgrade aircompressor to 0.26
ORC-1618 Disable building tests for snappy

Documentation:

ORC-1535 Remove generated Java docs from source tree

ORC 2.0.0 Released

release

08 Mar 2024 dongjoon

The ORC team is excited to announce the release of ORC v2.0.0.

Released: 8 March 2024
Source code: orc-2.0.0.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-2.0.0
Maven Central: ORC 2.0.0
SHA 256: 9107730919c29eb3…
Fixed issues: ORC-2.0.0

New Feature and Notable Changes:

ORC-998: Refactor compression output buffer within OutStream for better portability
ORC-1088: Support ZSTD_JNI and column compress to set compression level
ORC-1100: Support vcpkg
ORC-1251: Use Hadoop Vectored IO
ORC-1387: [C++] Support schema evolution from decimal to numeric/decimal
ORC-1440: Check for protobuf config based module
ORC-1463: Support brotli codec
ORC-1507: Use Zulu JDK distribution and switch from 21-ea to 21
ORC-1512: Drop Java 8/11 and make Java 17 by default
ORC-1531: Create orc-format module and repo
ORC-1545: Use orc-format 1.0.0-SNAPSHOT
ORC-1546: Use orc-format 1.0.0-alpha
ORC-1547: Spin-off ORC Format
ORC-1551: Use orc-format 1.0.0-beta
ORC-1572: Use Apache ORC Format 1.0.0
ORC-1585: [C++] Add orc-format_ep as a dependency of orc

Improvements:

ORC-1459: Mark DataBuffer::size() and DataBuffer::capacity() as const
ORC-1460: specification: Clarify how dictionary entries are sorted
ORC-1461: Mark Int128::getHighBits() and Int128::getLowBits() as const
ORC-1472: Replace deprecated method in TestMurmur3.java
ORC-1479: Enhance example usage message to use Uber jar
ORC-1481: [C++] Better error message when TZDB is unavailable
ORC-1504: Add lower bound check in get API for DynamicIntArray
ORC-1506: Replacing deprecated valueOf() with recommended forNumber()
ORC-1509: Auto grant contributor role to first-time contributors
ORC-1520: Remove JDK 8 settings from pom
ORC-1567: Add the -ignoreExtension configuration to the sizes and count commands of orc-tools
ORC-1570: Add supportVectoredIO API to HadoopShimsCurrent and use it
ORC-1571: Supports displaying raw data size in the meta command of orc-tools
ORC-1577: Use ZSTD as the default compression
ORC-1580: Change default DataBuffer constructor to use reserve instead of resize
ORC-1595: Add a short-cut to skip tiny inputs for ZstdCodec.compress
ORC-1596: Remove redundant Zstd.isError JNI usage
ORC-1597: Set bloom filter fpp to 1%
ORC-1600: Reduce getStaticMemoryManager sync block in OrcFile
ORC-1601: Reduce get HadoopShims sync block in HadoopShimsFactory
ORC-1610: Reduce the number of hash computation in CuckooSetBytes
ORC-1613: Zstd decompression supports direct buffer
ORC-1631: Supports summary output in sizes command
ORC-1637: [C++] Port conan recipe from upstream conan center
ORC-1638: Avoid System.exit(0) in count command
ORC-1639: [C++] Reduce unnecessary compiler flags in CMake
ORC-1641: Remove sourceFileExcludes from maven-javadoc-plugin
ORC-1642: Avoid System.exit(0) in scan command
ORC-1593: Set orc.compression.zstd.level to 3 by default

Bug Fixes:

ORC-634: Fix the json output for double NaN and infinite
ORC-1455: [C++] Fix build failure on non-x86 with unused macro in CpuInfoUtil.cc
ORC-1473: Zero-copy zeroCopyReadRanges and releaseBuffer bugs
ORC-1476: Maven build fail with unsupported platform: protoc-3.17.3-osx-aarch_64.exe
ORC-1480: [C++] Build failed when the BUILD_CPP_ENABLE_METRICS is ON
ORC-1500: [C++] The partition field does not support English special characters
ORC-1528: When using the orc.min.disk.seek.size configuration to read extremely large ORC files, a java.nio.BufferOverflowException may occur.
ORC-1553: Reading information from Row group, where there are 0 records of SArg column
ORC-1563: Fix orc.bloom.filter.fpp default value and orc.compress notes of Spark and Hive config docs
ORC-1568: Use readDiskRanges if orc.use.zerocopy is enabled
ORC-1575: Use ASF Archive URL instead Download URL
ORC-1578: Fix SparkBenchmark according to SPARK-40918
ORC-1588: Fix incorrect Decimal assert in LeafFilterFactory
ORC-1602: [C++] limit compression block size

Tasks:

ORC-1422: Setting version to 2.0.0-SNAPSHOT
ORC-1434: Remove org.apache.hadoop from dependabot.yml
ORC-1484: Use JIRA_ACCESS_TOKEN in merge_orc_pr.py
ORC-1485: Enable checkstyle checks for test classes
ORC-1486: Fix checkstyle violations for tests in orc-core module
ORC-1492: Fix checkstyle violations for tests in mapreduce, tools, bench modules
ORC-1496: Use iterator to suggest backporting branches
ORC-1515: Skip publishing orc-example module
ORC-1516: Fix minor typo in comments in IOUtils
ORC-1518: Remove findbugs folders
ORC-1529: Fix minor typos in pom.xml
ORC-1530: Rename variables in RecordReaderUtils.ChunkReader#create
ORC-1535: Remove generated Java docs from source tree
ORC-1536: Remove hive-storage-api link from maven-javadoc-plugin
ORC-1540: Remove MacOS 11 from GitHub Action CI
ORC-1542: Use Pattern Matching for instanceof (JEP-394)
ORC-1549: Update libhdfspp.tar.gz by adding #include <cstdint>
ORC-1569: Remove HadoopShimsPre2_3, HadoopShimsPre2_6, HadoopShimsPre2_7 classes
ORC-1579: Add ASF Generative Tooling Guidance to PR template
ORC-1591: Lower log level from INFO to DEBUG in *ReaderImpl/WriterImpl/PhysicalFsWriter
ORC-1592: Suppress KeyProvider missing log
ORC-1598: Close reader in orc-examples
ORC-1604: Deprecate non-utf8 bloom filter for Java writer

Tests:

ORC-1003: Recover java-examples-test
ORC-1409: Add stream order description in ORC spec.
ORC-1432: Add MacOS 13 GitHub Action Job
ORC-1474: Replaced deprecated getMinimum/Maximum in TestColumnStatistics
ORC-1475: [C++] ConvertColumnReader.TestConvertNumericToStringVariant fails when compiled with unsigned char
ORC-1477: Remove unused imports from Test classes
ORC-1478: Add Unit Test for org.apache.orc.impl.DynamicIntArray
ORC-1510: Fix package for TestOrcUtils and add more test cases
ORC-1541: Add Ubuntu 24.04 LTS Docker Test
ORC-1555: Simplify fedora37 docker image
ORC-1556: Add Rocky Linux 9 Docker Test
ORC-1557: Add GitHub Action CI for Docker Test
ORC-1558: Remove ubuntu22_jdk=21 and ubuntu22_jdk=21_cc=clang test combinations from docker/os-list.txt
ORC-1574: Update GitHub Action YAML files in branch-2.0
ORC-1586: Fix IllegalAccessError when SparkBenchmark runs on JDK17
ORC-1607: Fix testDoubleNaNAndInfinite to use TestFileDump.checkOutput
ORC-1614: Set ByteBuffer limit in TestBrotli test
ORC-1618: Disable building tests for snappy
ORC-1619: Add MacOS 14 to GitHub Action
ORC-1620: Add Apple Silicon Test Coverage
ORC-1621: Switch to oraclelinux9 from rocky9
ORC-1623: Use directOut.put(out) instead of directOut.put(out.array()) in TestZstd test
ORC-1630: Test using VectoredIO of hadoop to read ORC
ORC-1632: Add test for count command
ORC-1633: Add test for sizes command
ORC-1643: Add test for scan command

Build and dependency changes:

ORC-870: Unpin and upgrade jmh to 1.37
ORC-1423: Bump build-helper-maven-plugin to 3.4.0
ORC-1424: Bump maven-assembly-plugin to 3.6.0
ORC-1425: Bump checkstyle to 10.11.0
ORC-1427: Use Hadoop 3.3.5 in tools module
ORC-1429: Upgrade Maven to 3.8.8
ORC-1430: Use Hadoop 3.3.5 shaded clients
ORC-1431: Use parquet 1.13.1 in bench module
ORC-1437: Bump checkstyle to 10.12.0
ORC-1438: Bump auto-service to 1.1.0
ORC-1439: Bump guava to 32.0.0-jre
ORC-1442: Update guava to 32.0.1
ORC-1445: Bump snappy-java to 1.1.10.1 in bench module
ORC-1448: Bump auto-service to 1.1.1
ORC-1456: Update Hadoop to 3.3.6
ORC-1466: Bump junit to 5.10.0
ORC-1467: Upgrade commons-lang3 to 3.13.0
ORC-1468: Bump opencsv to 5.8
ORC-1469: Update guava to 32.1.2
ORC-1470: Update maven-shade-plugin to 3.5.0
ORC-1493: Bump byte-buddy to 1.14.6
ORC-1502: Upgrade Maven to 3.9.4
ORC-1508: Upgrade slf4j to 2.0.9
ORC-1513: Upgrade snappy to 1.1.10.4
ORC-1514: Remove zookeeper runtime dependency
ORC-1517: Bump snappy-java to 1.1.10.5 in bench module
ORC-1521: Bump com.google.guava:guava to 32.1.3-jre
ORC-1522: Bump commons-cli:commons-cli to 1.6.0
ORC-1523: Bump maven-checkstyle-plugin to 3.3.1
ORC-1524: Bump maven-shade-plugin to 3.5.1
ORC-1526: Bump spotbugs-maven-plugin to 4.8.1.0
ORC-1527: Bump junit to 5.10.1
ORC-1533: Upgrade commons-lang3 to 3.14.0
ORC-1534: Upgrade build-helper-maven-plugin to 3.5.0
ORC-1537: Unpin and upgrade spotless to 2.41.0
ORC-1538: Unpin and upgrade maven-dependency-plugin to 3.6.1
ORC-1543: Bump spotless-maven-plugin to 2.41.1
ORC-1544: Unpin and upgrade protobuf-java to 3.25.1
ORC-1550: Upgrade Maven to 3.9.6
ORC-1562: Bump com.google.guava:guava to 33.0.0-jre
ORC-1565: Bump slf4j.version to 2.0.10
ORC-1566: Make Brotli dependency as optional
ORC-1576: Upgrade spark.jackson.version to 2.15.2 in bench module
ORC-1581: Bump slf4j.version to 2.0.11
ORC-1582: Bump protobuf-java to 3.25.2
ORC-1605: Upgrade brotli4j to 1.16.0
ORC-1616: Upgrade aircompressor to 0.26
ORC-1624: Upgrade Spark to 3.5.1
ORC-1626: Upgrade Mockito to 5.10 and byte-buddy to 1.14.11
ORC-1627: Unpin scala-library
ORC-1628: Bump protobuf-java to 3.25.3

Documentation:

ORC-994: Fix javadoc so that it doesn’t put files into the source tree
ORC-1471: Updated README.md to use maven 3.8.8
ORC-1491: Update Python documentation with PyArrow 13.0.0 and Dask 2023.8.1
ORC-1503: Update README.md to use maven 3.9.4
ORC-1552: Update README.md to use maven 3.9.6
ORC-1564: Add Java ORC configuration documentation
ORC-1584: Remove README about Proto subdirectory
ORC-1587: Fix usage command of SparkBenchmark document
ORC-1599: Add zstd compression level and windowlog in Java configuration documentation
ORC-1612: Document available encodings at orc.compress
ORC-1625: Switch to oraclelinux9 from rocky9 in README

Deshan Xiao added as committer

team

13 Jan 2024 dongjoon

The ORC PMC is happy to add Deshan Xiao as an ORC committer for the work on the ORC Java Brotli codec and the vcpkg C++ library.

Thank you for your work on ORC, Deshan!

ORC 1.9.2 Released

release

10 Nov 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.9.2.

Released: 10 November 2023
Source code: orc-1.9.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.2
Maven Central: ORC 1.9.2
SHA 256: 7f46f2c184ecefd6…
Fixed issues: ORC-1.9.2

The bug fixes:

ORC-1475 [C++] Fix the failure of UT when char is unsigned
ORC-1480 [C++] Fix build break w/ BUILD_CPP_ENABLE_METRICS=ON
ORC-1482 Adaptation to read ORC files created by CUDF
ORC-1489 Assign a writer id to CUDF
ORC-1525 Fix bad read in RleDecoderV2::readByte

The test changes:

ORC-1431 Use parquet 1.13.1 in bench module
ORC-1454 Update Spark to 3.4.1
ORC-1487 Enable checkstyle on src/test with checkstyle-suppressions.xml
ORC-1498 Add Debian 12 Docker test
ORC-1502 Upgrade Maven to 3.9.4
ORC-1505 Upgrade Spark to 3.5.0
ORC-1511 Bump Avro to 1.11.3 in bench module
ORC-1513 Upgrade snappy-java to 1.1.10.4 in bench module
ORC-1517 Bump snappy-java to 1.1.10.5 in bench module

The tasks:

ORC-1497 Bump maven-enforcer-plugin to 3.4.0
ORC-1499 Add MacOS 13 and 14 to building.md
ORC-1507 Use Zulu JDK distribution and switch from 21-ea to 21
ORC-1518 Remove findbugs folders

Documentation:

ORC-1503 Updated README.md with Maven version 3.9.4

ORC 1.8.6 Released

release

10 Nov 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.8.6.

Released: 10 November 2023
Source code: orc-1.8.6.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.6
Maven Central: ORC 1.8.6
SHA 256: 5675b18118df4dd7…
Fixed issues: ORC-1.8.6

The bug fixes:

ORC-1525 Fix bad read in RleDecoderV2::readByte

The test changes:

ORC-1432 Add MacOS 13 GitHub Action Job

Documentation:

ORC-1499 Add MacOS 13 and 14 to building.md

ORC 1.7.10 Released

release

10 Nov 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.7.10.

Released: 10 November 2023
Source code: orc-1.7.10.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.10
Maven Central: ORC 1.7.10
SHA 256: 85aef9368dc9bcdf…
Fixed issues: ORC-1.7.10

The bug fixes:

ORC-1304 [C++] Fix seeking over empty PRESENT stream
ORC-1413 Fix for ORC row level filter issue with ACID table

The task changes:

ORC-1482 Adaptation to read ORC files created by CUDF
ORC-1489 Assign a writer id to CUDF

ORC 1.8.5 Released

release

05 Sep 2023 gangwu

The ORC team is excited to announce the release of ORC v1.8.5.

Released: 5 September 2023
Source code: orc-1.8.5.tar.gz
GPG Signature signed by Gang Wu (578F619B)
Git tag: rel/release-1.8.5
Maven Central: ORC 1.8.5
SHA 256: 3dfb227d9810a3b6…
Fixed issues: ORC-1.8.5

The bug fixes:

ORC-1315 [C++] Byte to integer conversions fail on platforms with unsigned char type
ORC-1482 RecordReaderImpl.evaluatePredicateProto assumes floating point stats are always present

The tasks:

ORC-1489 Assign a writer id to CUDF

ORC 1.9.1 Released

release

16 Aug 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.9.1.

Released: 16 August 2023
Source code: orc-1.9.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.1
Maven Central: ORC 1.9.1
SHA 256: 804492c6562516f9…
Fixed issues: ORC-1.9.1

The bug fixes:

ORC-1455 Fix build failure on non-x86 with unused macro in CpuInfoUtil.cc
ORC-1457 Fix ambiguous overload of Type::createRowBatch
ORC-1462 Bump aircompressor to 0.25 to fix JDK-8081450

The test changes:

ORC-1432 Add MacOS 13 GitHub Action Job
ORC-1464 Bump avro to 1.11.2
ORC-1465 Bump snappy-java to 1.1.10.3

ORC 1.9.0 Released

release

28 Jun 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.9.0.

Released: 28 June 2023
Source code: orc-1.9.0.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.9.0
Maven Central: ORC 1.9.0
SHA 256: 0dca8bbccdb2ee87…
Fixed issues: ORC-1.9.0

New Feature and Notable Changes:

ORC-961 Expose metrics of the reader
ORC-1167 Support orc.row.batch.size configuration
ORC-1252 Expose io metrics for write operation
ORC-1301 Enforce C++ 17
ORC-1310 Allowlist support for plugin filter
ORC-1356 Use Intel AVX-512 instructions to accelerate the Rle-bit-packing decode
ORC-1385 Support schema evolution from numeric to numeric
ORC-1386 Support schema evolution from primitive to string group/decimal/timestamp

Improvements:

ORC-827 Utilize Array copyOf
ORC-1170 Optimize the RowReader::seekToRow function
ORC-1232 Disable metrics collector by default
ORC-1278 Update Readme.md cmake to 3.12
ORC-1279 Update cmake version
ORC-1286 Replace DataBuffer with BlockBuffer in the BufferedOutputStream
ORC-1298 Support dedicated ColumnVectorBatch of numeric types
ORC-1302 Upgrade Github workflow to build on Windows
ORC-1306 Fixed indented code style for Java modules
ORC-1307 Add coding style enforcement
ORC-1314 Remove macros defined before C++11
ORC-1347 Use make_unique and make_shared when creating unique_ptr and shared_ptr
ORC-1348 TimezoneImpl constructor should pass std::vector<> & instead of std::vector<>
ORC-1349 Remove useless bufStream definition
ORC-1352 Remove ORC_[NOEXCEPT|NULLPTR|OVERRIDE|UNIQUE_PTR] macro usages
ORC-1355 Writer::addUserMetadata change parameter to reference
ORC-1373 Add log when DynamicByteArray length overflow
ORC-1401 Allow writing an intermediate footer
ORC-1421 Use PyArrow 12.0.0 in document

The bug fixes:

ORC-1225 Bump maven-assembly-plugin to 3.4.2
ORC-1266 DecimalColumnVector resets the isRepeating flag in the nextVector method
ORC-1273 Bump opencsv to 5.7.0
ORC-1297 Bump opencsv to 5.7.1
ORC-1304 throw ParseError when using SearchArgument with nested struct
ORC-1315 Byte to integer conversions fail on platforms with unsigned char type
ORC-1320 Fix build break of C++ code on docker images
ORC-1363 Upgrade zookeeper to 3.8.1
ORC-1368 Bump commons-csv to 1.10.0
ORC-1398 Bump aircompressor to 0.24
ORC-1399 Fix boolean type with useTightNumericVector enabled
ORC-1433 Fix comment in the Vector.hh
ORC-1447 Fix a bug in CpuInfoUtil.cc to support ARM platform
ORC-1449 Add -Wno-unused-macros for Clang 14.0
ORC-1450 Stop enforcing override keyword
ORC-1453 Fix fall-through warning cases

The test changes:

ORC-1231 Update supported OS list in building.md
ORC-1233 Bump junit to 5.9.0
ORC-1234 Upgrade objenesis to 3.2 in Spark benchmark
ORC-1235 Bump avro to 1.11.1
ORC-1240 Update site README to use apache/orc-dev
ORC-1241 Use apache/orc-dev DockerHub repository in Docker tests
ORC-1250 Bump mockito to 4.7.0
ORC-1254 Add spotbugs check
ORC-1258 Bump byte-buddy to 1.12.14
ORC-1262 Bump maven-checkstyle-plugin to 3.2.0
ORC-1265 Upgrade spotbugs to 4.7.2
ORC-1267 Bump mockito to 4.8.0
ORC-1271 Bump spotbugs-maven-plugin to 4.7.2.0
ORC-1272 Bump byte-buddy to 1.12.16
ORC-1300 Update Spark to 3.3.1 and its dependencies
ORC-1303 Upgrade GoogleTest to 1.12.1
ORC-1318 Upgrade mockito.version to 4.9.0
ORC-1319 Upgrade byte-buddy to 1.12.19
ORC-1321 Bump checkstyle to 10.5.0
ORC-1322 Upgrade centos7 docker image to use gcc9
ORC-1324 Use Java 19 instead of 18 in GHA
ORC-1333 Bump mockito to 4.10.0
ORC-1341 Bump mockito to 4.11.0
ORC-1353 Bump byte-buddy to 1.12.21
ORC-1359 Bump byte-buddy to 1.12.22
ORC-1366 Bump checkstyle to 10.7.0
ORC-1367 Bump maven-enforcer-plugin to 3.2.1
ORC-1369 Bump byte-buddy to 1.12.23
ORC-1370 Bump snappy-java to 1.1.9.1
ORC-1374 Update Spark to 3.3.2
ORC-1379 Upgrade spotbugs to 4.7.3.2
ORC-1380 Upgrade checkstyle to 10.8.0
ORC-1394 Bump maven-assembly-plugin to 3.5.0
ORC-1397 Bump checkstyle to 10.9.2
ORC-1405 Bump spotbugs-maven-plugin to 4.7.3.4
ORC-1406 Bump maven-enforcer-plugin to 3.3.0
ORC-1408 Add testVectorBatchHasNull test case and comment
ORC-1415 Add Java 20 to GitHub Action CI
ORC-1417 Bump checkstyle to 10.10.0
ORC-1418 Bump junit to 5.9.3
ORC-1426 Use Java 21-ea instead of 20 in GitHub Action
ORC-1435 Bump maven-checkstyle-plugin to 3.3.0
ORC-1436 Bump snappy-java to 1.1.10.0
ORC-1452 Use the latest OS versions in variant tests

The tasks:

ORC-1164 Setting version to 1.9.0-SNAPSHOT
ORC-1218 Bump apache pom to 27
ORC-1219 Remove redundant toString
ORC-1237 Remove a wrong image link to article-footer.png
ORC-1239 Upgrade maven-shade-plugin to 3.3.0
ORC-1256 Publish test-jar to maven central
ORC-1259 Bump slf4j to 2.0.0
ORC-1269 Remove FindBugs
ORC-1270 Move opencsv dependency to the tools module
ORC-1274 Add a checkstyle rule to ban starting LAND and LOR
ORC-1275 Bump maven-jar-plugin to 3.3.0
ORC-1276 Bump slf4j to 2.0.1
ORC-1277 Bump maven-shade-plugin to 3.4.0
ORC-1284 Add permissions to GitHub Action labeler
ORC-1296 Bump reproducible-build-maven-plugin to 0.16
ORC-1311 Bump maven-shade-plugin to 3.4.1
ORC-1316 Bump slf4j.version to 2.0.4
ORC-1334 Bump slf4j.version to 2.0.6
ORC-1335 Bump netty-all to 4.1.86.Final
ORC-1351 Update PR Labeler definition
ORC-1358 Use spotless to format pom files
ORC-1371 Remove unsupported SLF4J bindings from classpath
ORC-1372 Bump zstd to v1.5.4
ORC-1375 Cancel old running ci tasks when a pr has a new commit
ORC-1377 Enforce override keyword
ORC-1383 Upgrade aircompressor to 0.22
ORC-1395 Enforce license check
ORC-1396 Bump slf4j to 2.0.7
ORC-1410 Bump zstd to v1.5.5
ORC-1411 Remove Ubuntu18.04 from docker-based tests
ORC-1419 Bump protobuf-java to 3.22.3
ORC-1428 Setup GitHub Action CI on branch-1.9
ORC-1443 Enforce Java version
ORC-1444 Enforce JDK Bytecode version
ORC-1446 Publish snapshot from branch-1.9

ORC 1.8.4 Released

release

14 Jun 2023 yqzhang

The ORC team is excited to announce the release of ORC v1.8.4.

Released: 14 June 2023
Source code: orc-1.8.4.tar.gz
GPG Signature signed by Yiqun Zhang (42E05C03)
Git tag: rel/release-1.8.4
Maven Central: ORC 1.8.4
SHA 256: 1a4400c1daea0997…
Fixed issues: ORC-1.8.4

The bug fixes:

ORC-1304 [C++] Fix seeking over empty PRESENT stream
ORC-1400 Use Hadoop 3.3.5 on Java 17+ and benchmark
ORC-1413 Fix for ORC row level filter issue with ACID table

The test changes:

ORC-1404 Bump parquet to 1.13.0
ORC-1414 Upgrade java bench module to spark3.4
ORC-1416 Upgrade Jackson dependency to 2.14.2 in bench module
ORC-1420 Pin net.bytebuddy package to 1.12.x

The tasks:

ORC-1395 Enforce license check via github action

ORC 1.7.9 Released

release

07 May 2023 gangwu

The ORC team is excited to announce the release of ORC v1.7.9.

Released: 7 May 2023
Source code: orc-1.7.9.tar.gz
GPG Signature signed by Gang Wu (578F619B)
Git tag: rel/release-1.7.9
Maven Central: ORC 1.7.9
SHA 256: fa0a6ed34c1c9673…
Fixed issues: ORC-1.7.9

The bug fixes:

ORC-1382 Fix secondary config names org.sarg.* to orc.sarg.*
ORC-1395 Enforce license check
ORC-1407 Upgrade cyclonedx-maven-plugin to 2.7.6

The test changes:

ORC-1374 Update Spark to 3.3.2

ORC 1.8.3 Released

release

15 Mar 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.8.3.

Released: 15 March 2023
Source code: orc-1.8.3.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.3
Maven Central: ORC 1.8.3
SHA 256: a78678ec425c8129…
Fixed issues: ORC-1.8.3

The bug fixes:

ORC-1357 Handle missing compression block size
ORC-1382 Fix secondary config names org.sarg.* to orc.sarg.*
ORC-1384 Fix ArrayIndexOutOfBoundsException when reading dictionary stream bigger than dictionary
ORC-1393 Add reset(DiskRangeList input, long length) to InStream impl class

The test changes:

ORC-1360 Pin mockito to 4.x
ORC-1364 Pin spotless to 2.30.0
ORC-1374 Update Spark to 3.3.2

The tasks:

ORC-1358 Use spotless to format pom files

Xin Zhang added as committer

team

13 Feb 2023 gangwu

The ORC PMC is happy to add Xin Zhang as an ORC committer for the work on the ORC C++ library.

Thank you for your work on ORC, Xin!

ORC 1.7.8 Released

release

21 Jan 2023 william

The ORC team is excited to announce the release of ORC v1.7.8.

Released: 21 January 2023
Source code: orc-1.7.8.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.7.8
Maven Central: ORC 1.7.8
SHA 256: 4e92db5380d6596e…
Fixed issues: ORC-1.7.8

The improvements:

ORC-1342 Publish SBOM artifacts
ORC-1344 Skip SBOM generation during CMake
ORC-1345 Use makeBom and skip snapshot check in GitHub Action publish_snapshot job

The bug fixes:

ORC-1332 Avoid NegativeArraySizeException when using searchArgument
ORC-1343 Ignore orc.create.index

The test changes:

ORC-1323 Make docker/reinit.sh support target OS arguments

ORC 1.8.2 Released

release

13 Jan 2023 dongjoon

The ORC team is excited to announce the release of ORC v1.8.2.

Released: 13 January 2023
Source code: orc-1.8.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.2
Maven Central: ORC 1.8.2
SHA 256: 5e14501212abcb73…
Fixed issues: ORC-1.8.2

The bug fixes:

ORC-1332 Avoid NegativeArraySizeException when using searchArgument
ORC-1343 Disable ENABLE_INDEXES

The improvements:

ORC-1327 Exclude the proto files from the nohive jar
ORC-1328 Exclude the proto files from the shaded protobuf jar
ORC-1329 Add OrcConf.getStringAsList method
ORC-1338 Set bloom filter fpp to 1%
ORC-1342 Publish SBOM artifacts
ORC-1344 Skip SBOM generation during CMake
ORC-1345 Use makeBom and skip snapshot check in GitHub Action publish_snapshot job

The test changes:

ORC-1323 Make docker/reinit.sh support target OS arguments
ORC-1330 Add TestOrcConf
ORC-1339 Remove orc.sarg.to.filter default value assumption in test cases
ORC-1350 Upgrade setup-java to v3

The tasks:

ORC-1331 Improve PyArrow page
ORC-1336 Protect .asf.yaml, api, ORC-Deep-Dive-2020.pptx files in website
ORC-1337 Make .htaccess up to date
MINOR: Add .swp to .gitignore
MINOR: Link to Apache ORC orc_proto instead of Hive one
MINOR: Update DOAP file

ORC 1.8.1 Released

release

02 Dec 2022 dongjoon

The ORC team is excited to announce the release of ORC v1.8.1.

Released: 2 December 2022
Source code: orc-1.8.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.8.1
Maven Central: ORC 1.8.1
SHA 256: ba5877bd737e1fbc…
Fixed issues: ORC-1.8.1

The bug fixes:

ORC-1283 ENABLE_INDEXES does not take effect
ORC-1288 Invalid memory freeing with ZLIB compression
ORC-1291 NullPointerException in TypeDescription

The improvements:

ORC-1268 Set CMP0135 policy for CMake 3.24+
ORC-1282 Add slf4j impl to avoid warning message
ORC-1294 Build error when skip tests build
ORC-1295 Improve ORC Spec example (Decoding RLE v2 direct)
ORC-1299 benchmark can’t work for data resource 403
ORC-1305 Add more orc java examples
ORC-1308 Avoid star import

The test changes:

ORC-1290 Bump spotbugs to 4.7.3
ORC-1300 Update Spark to 3.3.1 and its dependencies

The tasks:

ORC-1269 Remove FindBugs
ORC-1270 Move opencsv dependency to the tools module
ORC-1292 Add paragraph in java documentation

ORC 1.7.7 Released

release

17 Nov 2022 dongjoon

The ORC team is excited to announce the release of ORC v1.7.7.

Released: 17 November 2022
Source code: orc-1.7.7.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.7
Maven Central: ORC 1.7.7
SHA 256: 52cbcd892c0bf07c…
Fixed issues: ORC-1.7.7

The bug fixes:

ORC-1283 ENABLE_INDEXES does not take effect

The test changes:

ORC-1254 Add spotbugs check
ORC-1299 Fix fetch data error in bench module

The tasks:

ORC-1256 Publish tests jar to maven central
ORC-1268 Set CMP0135 policy for CMake 3.24+

William Hyun elected as Chair

team

21 Sep 2022 dongjoon

The Apache ORC Project Management Committee (PMC) elected William Hyun as the Chair on September 12th. The Apache Software Foundation (ASF) Board approved the election and appointed him as Vice President for Apache ORC on September 21st.

William has been leading many areas. He helped the Apache ORC PMC add a new member, served as release manager for 1.7.4/1.7.5/1.7.6/1.8.0, made an important contribution to inter-ASF project collaboration and ORC integration across several projects to help all ORC users, improved ORC infrastructure such as the ASF ORC DockerHub setup, Docker tests, and GitHub Actions, and revamped the user experience by updating the website and Homebrew.

ORC 1.8.0 Released

release

03 Sep 2022 william

The ORC team is excited to announce the release of ORC v1.8.0.

Released: 3 September 2022
Source code: orc-1.8.0.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.8.0
Maven Central: ORC 1.8.0
SHA 256: 859d78bfded98405…
Fixed issues: ORC-1.8.0

New Feature and Notable Changes:

ORC-450 Support selecting list indices without materializing list items
ORC-824 Add column statistics for List and Map
ORC-1004 Java ORC writer supports the selection vector
ORC-1075 Support reading ORC files with no column statistics
ORC-1125 Support decoding decimals in RLE
ORC-1136 Optimize reads by combining multiple reads without significant separation into a single read
ORC-1138 Seek vs Read Optimization
ORC-1172 Add row count limit config for one stripe
ORC-1212 Upgrade protobuf-java to 3.17.3
ORC-1220 Set min.hadoop.version to 2.7.3
ORC-1248 Redefine Hadoop dependency for Apache ORC 1.8.0
ORC-1256 Publish test-jar to maven central
ORC-1260 Publish shaded-protobuf classifier artifacts

Improvements:

ORC-825 Use Empty Array For Collections toArray
ORC-826 Do Not Use Collection Contains/Get
ORC-828 Improve Fetch Data Set Process
ORC-829 Optimize Serialization percentileBits
ORC-831 Do Not Copy String When Flushing Dictionary
ORC-833 RunLengthIntegerReaderV2 Calculate Batch Size Once
ORC-834 Do Not Convert to String in DecimalFromTimestampTreeReader
ORC-835 Cache TRUE/FALSE Bytes in StringGroupFromBooleanTreeReader
ORC-836 StringGroupFromDoubleTreeReader Use Double toString
ORC-837 Reuse HiveDecimalWritable in ConvertTreeReaderFactory
ORC-838 Simplify compareTo/equals/putBuffer of ByteBufferAllocatorPool
ORC-840 Remove Superfluous Array Fill in RecordReaderImpl
ORC-841 Remove Superfluous Array Fill in StringHashTableDictionary
ORC-842 Remove newKey from StringHashTableDictionary
ORC-844 Improve hashCode Methods
ORC-847 Do Not Create Empty Array in StringGroupFromBinaryTreeReader
ORC-852 Allow DynamicByteArray to Return a ByteBuffer
ORC-853 Optimize writeDouble Implementation
ORC-855 Remove Unused isRepeating from RunLengthIntegerReaderV2
ORC-865 Bump opencsv from 3.9 to 5.5.1
ORC-883 Dependency Audit and QA
ORC-897 Optimize loop termination condition in readerIsCompatible method
ORC-935 Bump commons-csv from 1.8 to 1.9.0
ORC-937 Replace deprecated method
ORC-958 Convert command support overwrite option
ORC-969 Evaluate SearchArguments using file and stripe level stats
ORC-975 Avoid double counting closestFixedBits in percentileBits method
ORC-982 Extract checkstyle to a single file, help newcomers check code style
ORC-988 Bump opencsv from 5.5.1 to 5.5.2
ORC-992 Reached max repeat length, we can directly decide to use DELTA encoding
ORC-1005 Make the Java and C++ implementations of determineEncoding in RunLengthIntegerWriterV2 consistent
ORC-1007 Fix a warning from the shade plugin
ORC-1013 Renaming a parameter in constructors of TreeWriter’s derived classes
ORC-1014 Add details when we get IOExceptions from file system
ORC-1020 Improve orc::RleDecoderV2::nextDirect
ORC-1027 Filter processing to allow filter injections that cannot be represented via SArgs
ORC-1047 Handle quoted field names during string schema parsing
ORC-1077 Remove commons-codec dependency and use java.util.Base64
ORC-1099 Extend ReadIntent to support MAP and UNION type
ORC-1101 Improve malformed STRUCT handling
ORC-1122 Add buffer to decode the whole run in RleDecoderV2
ORC-1137 Improve float/double conversion in DoubleColumnReader::next()
ORC-1149 Bump slf4j.version to 1.7.36
ORC-1150 Improve RowReaderImpl::computeBatchSize()
ORC-1152 Support encoding short decimals in RLEv2
ORC-1156 Update opencsv to 5.6
ORC-1163 Bump zookeeper from 3.7.0 to 3.8.0
ORC-1169 Use Hadoop 3.3.2 on Java 17+
ORC-1178 Use hadoop 3.3.3 on Java 17+

ORC 1.7.6 Released

release

17 Aug 2022 william

The ORC team is excited to announce the release of ORC v1.7.6.

Released: 17 August 2022
Source code: orc-1.7.6.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.7.6
Maven Central: ORC 1.7.6
SHA 256: a75e0cccaaf5e03f…
Fixed issues: ORC-1.7.6

The bug fixes:

ORC-1204 ORC MapReduce writer to flush when long arrays
ORC-1205 nextVector should invoke ensureSize when reusing vectors
ORC-1215 Remove a wrong NotNull annotation on value of setAttribute
ORC-1222 Upgrade tools.hadoop.version to 2.10.2
ORC-1227 Use Constructor.newInstance instead of Class.newInstance
ORC-1228 Fix setAttribute to handle null value

The test changes:

ORC-932 Bump byte-buddy from 1.10.19 to 1.11.12 (#842)
ORC-1169 Use Hadoop 3.3.2 on Java 17+ (#1113)
ORC-1178 Use Hadoop 3.3.3 on Java 17+ (#1129)
ORC-1193 Bump parquet.version to 1.12.3
ORC-1207 Upgrade Spark to 3.3.0
ORC-1210 Upgrade maven to 3.8.6
ORC-1234 Upgrade objenesis to 3.2 in Spark benchmark
ORC-1235 Bump avro.version to 1.11.1
ORC-1240 Update site README to use apache/orc-dev DockerHub image
ORC-1241 Use apache/orc-dev DockerHub repository in Docker tests
ORC-1244 Upgrade byte-buddy to 1.12.13 in branch-1.7
ORC-1245 Use Hadoop 3.3.4 on Java 17+ and benchmark

The documentation changes:

MINOR: Update DOAP with new releases (#1127)
ORC-900 Update doap_orc.rdf for Apache Projects page (#806)
ORC-1231 Update supported OS list in building.md
ORC-1237 Remove a wrong image link to article-footer.png
ORC-1238 Update DOAP with 1.7.5

The tasks:

ORC-1185 Add merge_orc_pr.py
ORC-1187 Use main instead of master in merge_orc_pr.py
ORC-1213 Use https in ThirdpartyToolchain.cmake
ORC-1226 Add a deprecation warning for Hadoop 2.7.2 and below

ORC 1.7.5 Released

release

16 Jun 2022 william

The ORC team is excited to announce the release of ORC v1.7.5.

Released: 16 June 2022
Source code: orc-1.7.5.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.7.5
Maven Central: ORC 1.7.5
SHA 256: b90cae5853e3ea0e…
Fixed issues: ORC-1.7.5

The bug fixes:

ORC-1151 Fix ColumnWriter for non-UTC Timestamp columns
ORC-1160 Fix seekToRow can’t seek within selected row group
ORC-1133 Fix csv-import tool options
ORC-1183 Upgrade gson to 2.9.0
ORC-1186 Limit family in aarch64 profile
ORC-1188 Fix ORC_PREFER_STATIC_ZLIB

The improvements:

ORC-1198 Add a new PhysicalFsWriter constructor with FSDataOutputStream parameter
ORC-1199 Use Google mirror of Maven Central as the primary

The test changes:

ORC-1155 Add Ubuntu 22.04 to docker tests
ORC-1154 Bump hive.version from 3.1.2 to 3.1.3
ORC-1161 Add MacOS 12 and remove MacOS 10
ORC-1174 Add Ubuntu 22.04 to GitHub Action
ORC-1182 Use slf4j-simple instead of deprecated slf4j-log4j12
ORC-1184 Use Hadoop 3.3.3 in benchmark module
ORC-1189 Update README.md and help command message in benchmark module and .gitignore
ORC-1190 Fix ORCWriterBenchMark dumpDir initialization
ORC-1191 Updated TLC Taxi Benchmark Dataset
ORC-1192 Use orc.zstd instead of orc.none
ORC-1196 Add Spark benchmark integration tests to GHA
ORC-1201 Remove Debian 9 from Docker Tests

The documentation changes:

Add ASF verification instruction link

Pavan Lanka added as committer

team

05 Jun 2022 dongjoon

The ORC PMC is happy to add Pavan Lanka as an ORC committer for the work on introducing LazyIO of non-filter columns and optimizing stripe index and data reads.

Thank you for your work on ORC, Pavan!

ORC adds Yiqun Zhang to PMC

team

08 May 2022 william

The Apache ORC Project Management Committee (PMC) is happy to announce that Yiqun Zhang has joined us as a new member of the PMC. Yiqun has contributed consistently as a committer and has participated in both major and maintenance releases by actively helping the release managers test the release candidates.

Please welcome Yiqun to the ORC PMC!

ORC 1.7.4 Released

release

15 Apr 2022 william

The ORC team is excited to announce the release of ORC v1.7.4.

Released: 15 April 2022
Source code: orc-1.7.4.tar.gz
GPG Signature signed by William Hyun (DECDFA29)
Git tag: rel/release-1.7.4
Maven Central: ORC 1.7.4
SHA 256: 0a70c5e877b1ff26…
Fixed issues: ORC-1.7.4

The bug fixes:

ORC-1120 Remove C++ library limitation about write version
ORC-1121 Fix column conversion check bug which causes column filters not to work
ORC-1127 Add missing version of UNSTABLE-PRE-2.0
ORC-1146 Float category missing check if the statistic sum is a finite value
ORC-1147 Use isNaN instead of isFinite to determine whether values contain NaN

The improvements:

ORC-236 Support UNION type in Java Convert tool
ORC-1116 Fix csv-import tool when exporting long bytes
ORC-1123 Add estimationMemory method for writer

The test changes:

ORC-1145 Add Java 18 to GitHub Action CI
ORC-1118 Support Java 17 and ARM64 docker tests

The documentation changes:

ORC-1117 Add Dask page at Using in Python section
ORC-1119 Remove timestamp from ORC API docs

ORC 1.6.14 Released

release

14 Apr 2022 dongjoon

The ORC team is excited to announce the release of ORC v1.6.14.

Released: 14 April 2022
Source code: orc-1.6.14.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.14
Maven Central: ORC 1.6.14
SHA 256: f2701d27d197a0b4…
Fixed issues: ORC-1.6.14

The bug fixes:

ORC-1121 Fix column conversion check bug which causes column filters not to work
ORC-1146 Float category missing check if the statistic sum is a finite value
ORC-1147 Use isNaN instead of isFinite to determine whether values contain NaN

The ‘tests’ fixes:

ORC-1016 Use openssl@1.1 in GitHub Action MacOS CIs
ORC-1113 Remove CentOS 8 from docker-based tests

Quanlong Huang added as committer

team

05 Mar 2022 gangwu

The ORC PMC is happy to add Quanlong Huang as an ORC committer for the work on the ORC C++ library and Apache Impala integration.

Thank you for your work on ORC, Quanlong!

ORC 1.7.3 Released

release

09 Feb 2022 dongjoon

The ORC team is excited to announce the release of ORC v1.7.3.

Released: 9 February 2022
Source code: orc-1.7.3.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.3
Maven Central: ORC 1.7.3
SHA 256: 535c4d7588172e85…
Fixed issues: ORC-1.7.3

The ‘bug’ fixes:

ORC-1060 Reduce memory usage when vectorized reading dictionary string encoding columns
ORC-1065 Fix IndexOutOfBoundsException in ReaderImpl.extractFileTail
ORC-1067 [C++] Upgrade ZSTD to 1.5.1
ORC-1078 Row group end offset doesn’t accommodate all the blocks
ORC-1081 Fix heap-use-after-free in SearchArgumentBuilderImpl::end()
ORC-1087 [C++] Handle unloaded seek positions when seeking in an uncompressed chunk
ORC-1092 [C++] Upgrade LZ4 to version 1.9.3
ORC-1102 [C++] Upgrade ZSTD to 1.5.2

The ‘tools’ improvements:

ORC-1055 [C++] Add the timezone option for the csv-import tool
ORC-1082 Improve FileDump and JsonFileDump to be robust on missing column statistics
ORC-1092 [C++] Support specifying type ids or column names in cpp tools

The ‘documentation’ patches:

ORC-1050 Update ORC site README.md and release process page
ORC-1069 Update building.md
ORC-1071 Update ‘adopters’ page
ORC-1091 Add ‘Tests’ section at ORC ‘develop’ page
ORC-1112 Add ‘Using with Python’ web page
ORC-1114 Update ‘Using with Python’ page with ‘PyArrow’ 7.0.0

The ‘task’ patches:

ORC-1070 Upgrade site docker image to use Ubuntu 20.04
ORC-1072 Add ‘Stale’ GitHub Action job
ORC-1094 Enable GitHub issues tab
ORC-1095 Deprecate ‘UnknownFormatException’

The ‘tests’ fixes:

ORC-875 Add GitHub Action job for Windows Server 2019
ORC-878 Bump auto-service from 1.0-rc7 to 1.0
ORC-881 Bump slf4j.version from 1.7.30 to 1.7.32
ORC-989 Bump checkstyle from 8.45.1 to 9.0
ORC-993 Bump junit.version from 5.7.2 to 5.8.0
ORC-1018 Bump checkstyle from 9.0 to 9.0.1
ORC-1033 Bump junit.version from 5.8.0 to 5.8.1
ORC-1044 Bump reproducible-build-maven-plugin to 0.14
ORC-1048 Bump checkstyle from 9.0.1 to 9.1
ORC-1052 Bump avro.version from 1.10.2 to 1.11.0
ORC-1057 Bump junit.version from 5.8.1 to 5.8.2
ORC-1061 Bump checkstyle from 9.1 to 9.2
ORC-1066 Bump guava from 30.1.1-jre to 31.0.1-jre
ORC-1068 [C++] Stabilize HAS_POST_2038 test
ORC-1073 Remove appveyor.yml
ORC-1076 Remove Travis PR Builder Link from README.md
ORC-1079 Add Linux Clang 11 GitHub Action test coverage
ORC-1080 Remove .travis.yml
ORC-1084 Bump checkstyle from 9.2 to 9.2.1
ORC-1086 Bump reproducible-build-maven-plugin from 0.14 to 0.15
ORC-1090 Disable Clang 13.0-specific compilation warnings
ORC-1093 Remove debian8 specific code in run-one.sh
ORC-1096 Bump slf4j.version to 1.7.33
ORC-1103 Use Maven 3.8.4
ORC-1104 Use Spark 3.2.1 in benchmark
ORC-1105 fetch-data.sh should use zsh instead of bash
ORC-1106 Use transitive commons-lang3 dependency in bench module
ORC-1107 Fix NPE at benchmark data schema loading
ORC-1108 Use RawLocalFileSystem to skip checksum files during benchmark data generation
ORC-1109 Use zstd instead of none in the default compress option
ORC-1111 Bump build-helper-maven-plugin from 3.2.0 to 3.3.0
ORC-1113 Remove CentOS 8 from docker-based tests
ORC-1115 Suppress Illegal reflective access warnings on Java9+ Tests

ORC 1.6.13 Released

release

20 Jan 2022 dongjoon

The ORC team is excited to announce the release of ORC v1.6.13.

Released: 20 January 2022
Source code: orc-1.6.13.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.13
Maven Central: ORC 1.6.13
SHA 256: ff69f9e0b5b01dfc…
Fixed issues: ORC-1.6.13

The bug fixes:

ORC-1065 Fix IndexOutOfBoundsException in ReaderImpl.extractFileTail
ORC-1078 Row group end offset doesn’t accommodate all the blocks

The ‘tests’ fixes:

ORC-875 Add GitHub Action job for Windows Server 2019
ORC-941 Move MacOS 10.15/11.5 test from Travis to GitHub Action
ORC-1079 Add Linux Clang 11 GitHub Action test coverage
ORC-1080 Remove .travis.yml

ORC 1.7.2 Released

release

20 Dec 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.7.2.

Released: 20 December 2021
Source code: orc-1.7.2.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.2
Maven Central: ORC 1.7.2
SHA 256: ef39bae755116fec…
Fixed issues: ORC-1.7.2

The bug fixes:

ORC-492 Avoid potential ArrayIndexOutOfBoundsException when getting WriterVersion
ORC-1053 Fix time zone offset precision when convert tool converts LocalDateTime to Timestamp is not consistent with the internal default precision of ORC
ORC-1041 Use memcpy during LZO decompression
ORC-1059 Align findColumns behaviour between 1.6 and 1.7 release

The ‘tools’ improvements:

ORC-1012 Support specifying columns in orc-scan
ORC-1017 Add sizes tool to determine and display the sizes of each column in a set of files
ORC-1023 Support writing bloom filters in ConvertTool

The ‘tests’ fixes:

ORC-915 Remove io.netty.netty from Spark benchmark
ORC-938 Bump netty-all from 4.1.42.Final to 4.1.66.Final
ORC-948 Add hive benchmark integration tests
ORC-957 Bump netty-all from 4.1.66.Final to 4.1.67.Final
ORC-1021 Add -fno-omit-frame-pointer in DEBUG and RELWITHDEBINFO builds
ORC-1051 Update benchmark dependencies

Yiqun Zhang added as committer

team

23 Nov 2021 dongjoon

The ORC PMC is happy to add Yiqun Zhang as an ORC committer for the work on improving ORC tools.

Thank you for your work on ORC, Yiqun!

ORC 1.7.1 Released

release

07 Nov 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.7.1.

Released: 7 November 2021
Source code: orc-1.7.1.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.1
Maven Central: ORC 1.7.1
SHA 256: 65d71e571238cbcb…
Fixed issues: ORC-1.7.1

The bug fixes of ORC 1.7:

ORC-879 Flaky Test for TestJsonReader
ORC-1000 Use Java 17 in GitHub Action
ORC-1002 Add java17 profile for Java17 unit testing
ORC-1008 Overflow detection code is incorrect in IntegerColumnStatisticsImpl
ORC-1009 [C++] Missing string include causes build failure with MSVC++
ORC-1010 Bump tzdata from tzdata-2020e-1.tar.xz to tzdata-2021b-1.tar.xz
ORC-1011 Activate java17 profile automatically
ORC-1015 Update OrcFile.WriterOptions::memory javadoc
ORC-1016 Use openssl@1.1 in GitHub Action MacOS CIs
ORC-1024 BloomFilter hash computation is inconsistent between Java and C++ clients
ORC-1029 Could not load ‘org.apache.orc.DataMask.Provider’ when using orc encryption and spark executor with multi cores!
ORC-1030 Java Tools Recover File command does not accurately find OrcFile.MAGIC
ORC-1032 Bump parquet.version from 1.12.0 to 1.12.2
ORC-1034 The search byte array algorithm is incorrectly implemented in FileDump.java
ORC-1035 backupDataPath may be incorrect in recoverFile
ORC-1036 Due to tzdata upgrade, the fixed download links in CI are often not working
ORC-1037 Bump spark.version from 3.1.2 to 3.2.0
ORC-1039 Make FileDump.recoverFile handle side files only if they exist
ORC-1040 Add Debian 11 docker test
ORC-1042 Ignore unused-function C++ compile warning on CentOS 7
ORC-1043 Fix C++ conversion compilation error in CentOS 7

ORC 1.6.12 Released

release

07 Nov 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.6.12.

Released: 7 November 2021
Source code: orc-1.6.12.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.12
Maven Central: ORC 1.6.12
SHA 256: ff69f9e0b5b01dfc…
Fixed issues: ORC-1.6.12

The bug fixes of ORC 1.6.12:

ORC-1008 Overflow detection code is incorrect in IntegerColumnStatisticsImpl
ORC-1010 Bump tzdata from tzdata-2020e-1.tar.xz to tzdata-2021b-1.tar.xz
ORC-1024 BloomFilter hash computation is inconsistent between Java and C++ clients
ORC-1029 Could not load ‘org.apache.orc.DataMask.Provider’ when using orc encryption and spark executor with multi cores!
ORC-1034 The search byte array algorithm is incorrectly implemented in FileDump.java
ORC-1035 backupDataPath may be incorrect in recoverFile
ORC-1036 Due to tzdata upgrade, the fixed download links in CI are often not working
ORC-1040 Add Debian 11 docker test
ORC-1042 Ignore unused-function C++ compile warning on CentOS 7
ORC-1043 Fix C++ conversion compilation error in CentOS 7

ORC adds William Hyun to PMC

team

02 Oct 2021 dongjoon

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that William Hyun has joined the PMC. William has led several areas, including Java 17/Apple Silicon support, Java Tools improvement, code quality improvement using static analysis, CI/Docker test coverage improvement, and Apache ORC 1.7 migration support at Apache Arrow/Druid/Iceberg.

Please join me in welcoming William to the ORC PMC!

ORC 1.7.0 Released

release

15 Sep 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.7.0.

Released: 15 September 2021
Source code: orc-1.7.0.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.7.0
Maven Central: ORC 1.7.0
SHA 256: cab8cea768b391b1…
Fixed issues: ORC-1.7.0

The new features of ORC 1.7:

ORC-377 Support Snappy compression in C++ Writer
ORC-577 Support row-level filtering
ORC-716 Build and test on Java 17-EA
ORC-731 Improve Java Tools
ORC-742 LazyIO of non-filter columns
ORC-751 Implement Predicate Pushdown in C++ Reader
ORC-755 Introduce OrcFilterContext
ORC-757 Add Hashtable implementation for dictionary
ORC-780 Support LZ4 Compression in C++ Writer
ORC-797 Allow writers to get the stripe information
ORC-818 Build and test in Apple Silicon
ORC-861 Bump CMake minimum requirement to 2.8.12
ORC-867 Upgrade hive-storage-api to 2.8.1
ORC-984 Save the software version that wrote each ORC file

Known issues:

ORC-1002 Add java17 profile

ORC 1.6.11 Released

release

15 Sep 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.6.11.

Released: 15 September 2021
Source code: orc-1.6.11.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.11
Maven Central: ORC 1.6.11
SHA 256: 67c17c012bd588fc…
Fixed issues: ORC-1.6.11

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.5.13 Released

release

15 Sep 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.5.13.

Released: 15 September 2021
Source code: orc-1.5.13.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.5.13
Maven Central: ORC 1.5.13
SHA 256: 45274afce558b93f…
Fixed issues: ORC-1.5.13

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC 1.6.10 Released

release

10 Aug 2021 omalley

The ORC team is excited to announce the release of ORC v1.6.10.

Released: 10 August 2021
Source code: orc-1.6.10.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.6.10
Maven Central: ORC 1.6.10
SHA 256: 3a7347c85d18e44d…
Fixed issues: ORC-1.6.10

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.6.9 Released

release

02 Jul 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.6.9.

Released: 2 July 2021
Source code: orc-1.6.9.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.9
Maven Central: ORC 1.6.9
SHA 256: d19af60cd81cdb17…
Fixed issues: ORC-1.6.9

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.6.8 Released

release

21 May 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.6.8.

Released: 21 May 2021
Source code: orc-1.6.8.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.8
Maven Central: ORC 1.6.8
SHA 256: 93d2e5f7c9f76ea5…
Fixed issues: ORC-1.6.8

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

William Hyun added as committer

team

13 Apr 2021 dongjoon

The ORC PMC is happy to add William Hyun as an ORC committer for the work on improving ORC’s code quality and integration with Apache Spark and Apache Iceberg.

Thank you for your work on ORC, William!

ORC adds Panagiotis Garefalakis to PMC

team

08 Feb 2021 dongjoon

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Panagiotis Garefalakis has joined the PMC. Panagiotis has radically improved the integration between Hive and ORC.

Please join me in welcoming Panagiotis to the ORC PMC!

ORC 1.6.7 Released

release

22 Jan 2021 dongjoon

The ORC team is excited to announce the release of ORC v1.6.7.

Released: 22 January 2021
Source code: orc-1.6.7.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.7
Maven Central: ORC 1.6.7
SHA 256: 93d2e5f7c9f76ea5…
Fixed issues: ORC-1.6.7

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.6.6 Released

release

10 Dec 2020 dongjoon

The ORC team is excited to announce the release of ORC v1.6.6.

Released: 10 December 2020
Source code: orc-1.6.6.tar.gz
GPG Signature signed by Dongjoon Hyun (34F0FC5C)
Git tag: rel/release-1.6.6
Maven Central: ORC 1.6.6
SHA 256: 93d2e5f7c9f76ea5…
Fixed issues: ORC-1.6.6

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

Panagiotis Garefalakis added as committer

team

16 Nov 2020 dongjoon

The ORC PMC is happy to add Panagiotis Garefalakis as an ORC committer for the work on improving ORC’s integration with Apache Hive.

Thank you for your work on ORC, Panagiotis!

ORC 1.6.5 Released

release

01 Oct 2020 omalley

The ORC team is excited to announce the release of ORC v1.6.5.

Released: 1 October 2020
Source code: orc-1.6.5.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.6.5
Maven Central: ORC 1.6.5
SHA 256: 1e77840861a5c5c8…
Fixed issues: ORC-1.6.5

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.5.12 Released

release

30 Sep 2020 omalley

The ORC team is excited to announce the release of ORC v1.5.12.

Released: 30 September 2020
Source code: orc-1.5.12.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.5.12
Maven Central: ORC 1.5.12
SHA 256: 938e48eca6c83fcd…
Fixed issues: ORC-1.5.12

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC 1.6.4 Released

release

14 Sep 2020 omalley

The ORC team is excited to announce the release of ORC v1.6.4.

Released: 14 September 2020
Source code: orc-1.6.4.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.6.4
Maven Central: ORC 1.6.4
SHA 256: ceea9849277354cf…
Fixed issues: ORC-1.6.4

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC-667 Positional mapping for nested struct types should not be applied by default

ORC 1.5.11 Released

release

14 Sep 2020 omalley

The ORC team is excited to announce the release of ORC v1.5.11.

Released: 14 September 2020
Source code: orc-1.5.11.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.5.11
Maven Central: ORC 1.5.11
SHA 256: 636af3a39aa8cdfc…
Fixed issues: ORC-1.5.11

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-667 Positional mapping for nested struct types should not be applied by default

ORC 1.5.10 Released

release

26 Apr 2020 omalley

The ORC team is excited to announce the release of ORC v1.5.10.

Released: 26 April 2020
Source code: orc-1.5.10.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.5.10
Maven Central: ORC 1.5.10
SHA 256: c2ca21fd2f77afbe…
Fixed issues: ORC-1.5.10

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC 1.6.3 Released

release

26 Apr 2020 omalley

The ORC team is excited to announce the release of ORC v1.6.3.

Released: 26 April 2020
Source code: orc-1.6.3.tar.gz
GPG Signature signed by Owen O’Malley (AD1C5877)
Git tag: rel/release-1.6.3
Maven Central: ORC 1.6.3
SHA 256: 38b9da9ca771d268…
Fixed issues: ORC-1.6.3

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.5.9 Released

release

30 Jan 2020 omalley

The ORC team is excited to announce the release of ORC v1.5.9.

Released: 30 January 2020
Source code: orc-1.5.9.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.9
Maven Central: ORC 1.5.9
SHA 256: 75c534555df8a932…
Fixed issues: ORC-1.5.9

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC adds Dongjoon Hyun to PMC

team

09 Dec 2019 omalley

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Dongjoon Hyun has joined the PMC. Dongjoon has radically improved the integration between Spark and ORC.

Please join me in welcoming Dongjoon to the ORC PMC!

ORC 1.4.5 Released

release

09 Dec 2019 omalley

The ORC team is excited to announce the release of ORC v1.4.5.

Released: 9 December 2019
Source code: orc-1.4.5.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.4.5
Maven Central: ORC 1.4.5
SHA 256: 6b30272d4c4cbccc…
Fixed issues: ORC-1.4.5

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

ORC 1.6.2 Released

release

24 Nov 2019 omalley

The ORC team is excited to announce the release of ORC v1.6.2.

Released: 24 November 2019
Source code: orc-1.6.2.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.6.2
Maven Central: ORC 1.6.2
SHA 256: 5c394603faba3c50…
Fixed issues: ORC-1.6.2

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC 1.5.8 Released

release

24 Nov 2019 omalley

The ORC team is excited to announce the release of ORC v1.5.8.

Released: 24 November 2019
Source code: orc-1.5.8.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.8
Maven Central: ORC 1.5.8
SHA 256: 2caf689132168d34…
Fixed issues: ORC-1.5.8

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC 1.6.1 Released

release

26 Oct 2019 omalley

The ORC team is excited to announce the release of ORC v1.6.1.

Released: 26 October 2019
Source code: orc-1.6.1.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.6.1
Maven Central: ORC 1.6.1
SHA 256: 56a7622629f0101f…
Fixed issues: ORC-1.6.1

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions
ORC-571 ArrayIndexOutOfBoundsException in StripePlanner.readRowIndex

ORC 1.5.7 Released

release

26 Oct 2019 omalley

The ORC team is excited to announce the release of ORC v1.5.7.

Released: 26 October 2019
Source code: orc-1.5.7.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.7
Maven Central: ORC 1.5.7
SHA 256: 0fbc5c6da16be89e…
Fixed issues: ORC-1.5.7

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.6.0 Released

release

03 Sep 2019 omalley

The ORC team is excited to announce the release of ORC v1.6.0.

Released: 3 September 2019
Source code: orc-1.6.0.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.6.0
Maven Central: ORC 1.6.0
SHA 256: 2d864000c60025f5…
Fixed issues: ORC-1.6.0

The new features of ORC 1.6:

ORC-14 Add column encryption.
ORC-189 Add timestamp with local timezone
ORC-203 Trim minimum and maximum string values
ORC-363 Add zstd support in Java
ORC-397 Support selectively disabling dictionaries
ORC-522 Add type annotations

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-555 IllegalArgumentException when reading files with large footers
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions
ORC-571 ArrayIndexOutOfBoundsException in StripePlanner.readRowIndex

ORC 1.5.6 Released

release

27 Jun 2019 omalley

The ORC team is excited to announce the release of ORC v1.5.6.

Users are advised that as of ORC 1.5.6, ORC Readers that aren’t used to create RecordReaders should be closed.
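
The advice above can be followed with a try-with-resources block. The following is a minimal sketch, not the project's own code: it assumes the Java reader can be closed through `java.io.Closeable`, and `StandInReader` is a hypothetical placeholder class so that the example runs without ORC or Hadoop on the classpath. In an application, the resource would instead be the reader returned by `OrcFile.createReader(...)`.

```java
import java.io.Closeable;

public class CloseReaderSketch {
    /** Hypothetical stand-in for an ORC reader that holds an open file handle. */
    static final class StandInReader implements Closeable {
        private boolean open = true;

        long getNumberOfRows() {
            if (!open) {
                throw new IllegalStateException("reader is closed");
            }
            return 42L; // placeholder metadata value, not read from a real file
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false; // a real reader would release its file handle here
        }
    }

    /** Uses the reader for metadata only (no RecordReader is created), then closes it. */
    static long countRows(StandInReader reader) {
        try (StandInReader r = reader) {
            return r.getNumberOfRows();
        }
    }

    public static void main(String[] args) {
        StandInReader reader = new StandInReader();
        long rows = countRows(reader);
        System.out.println("rows=" + rows + " stillOpen=" + reader.isOpen());
    }
}
```

The try-with-resources form closes the reader even when the metadata call throws, which is the case a manual `close()` call most easily misses.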

Released: 27 June 2019
Source code: orc-1.5.6.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.6
Maven Central: ORC 1.5.6
SHA 256: e0588bfd96103bc1…
Fixed issues: ORC-1.5.6

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-525 Users must close ORC Readers after use
ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

Renat Vailiullin and Sandeep More added as committers

team

10 Jun 2019 omalley

The ORC PMC is happy to add Renat Vailiullin and Sandeep More as ORC committers. Renat has done a lot of work to improve the Windows builds, and Sandeep has been working on the data masking and statistics.

Thank you for your work on ORC, Renat and Sandeep!

ORC 1.5.5 Released

release

14 Mar 2019 omalley

The ORC team is excited to announce the release of ORC v1.5.5.

Released: 14 March 2019
Source code: orc-1.5.5.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.5
Maven Central: ORC 1.5.5
SHA 256: 486bbf0765a5b8c2…
Fixed issues: ORC-1.5.5

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC adds Gang Wu to PMC

team

11 Jan 2019 omalley

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Gang Wu has joined the PMC. Gang has been doing great work on the C++ code base.

Please join me in welcoming Gang to the ORC PMC!

Dongjoon Hyun added as committer

team

10 Jan 2019 omalley

The ORC PMC is happy to add Dongjoon Hyun as an ORC committer for the work on improving ORC’s integration with Spark.

Thank you for your work on ORC, Dongjoon!

ORC 1.5.4 Released

release

21 Dec 2018 vgumashta

The ORC team is excited to announce the release of ORC v1.5.4.

Released: 21 December 2018
Source code: orc-1.5.4.tar.gz
GPG Signature signed by Vaibhav Gumashta (F60037FB)
Git tag: rel/release-1.5.4
Maven Central: ORC 1.5.4
SHA 256: 75cfba40d3574c14…
Fixed issues: ORC-1.5.4

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.5.3 Released

release

25 Sep 2018 omalley

The ORC team is excited to announce the release of ORC v1.5.3.

Released: 25 September 2018
Source code: orc-1.5.3.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.3
Maven Central: ORC 1.5.3
SHA 256: 96da3cccd2b396dc…
Fixed issues: ORC-1.5.3

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.5.2 Released

release

29 Jun 2018 prasanthj

The ORC team is excited to announce the release of ORC v1.5.2.

Released: 29 June 2018
Source code: orc-1.5.2.tar.gz
GPG Signature signed by Prasanth Jayachandran (65C468A3)
Git tag: rel/release-1.5.2
Maven Central: ORC 1.5.2
SHA 256: 4b73de720f54448d…
Fixed issues: ORC-1.5.2

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.5.1 Released

release

25 May 2018 omalley

The ORC team is excited to announce the release of ORC v1.5.1.

Released: 25 May 2018
Source code: orc-1.5.1.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.1
Maven Central: ORC 1.5.1
SHA 256: 14b93916ac6dce65…
Fixed issues: ORC-1.5.1

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.5.0 Released

release

14 May 2018 omalley

The ORC team is excited to announce the release of ORC v1.5.0.

Released: 14 May 2018
Source code: orc-1.5.0.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.5.0
Maven Central: ORC 1.5.0
SHA 256: 28369ea8e24cac6d…
Fixed issues: ORC-1.5.0

The new features of ORC 1.5:

ORC-179 Add ORC C++ Writer
ORC-91 Support for variable length blocks in HDFS.
ORC-199 Implement a CSV to ORC converter
ORC-344 Support for using Decimal64ColumnVector
ORC-345 Adding Decimal64StatisticsImpl
ORC-331 Support for building C++ under MSVC.
ORC-234 Support for older versions of Hadoop (>= 2.2.x)
ORC-305 Added statistics for size on disk

Known issues:

ORC-367 Boolean columns are read incorrectly when using seek.
ORC-414 ORC files with malformed protobuf objects can crash C++ reader
ORC-562 Don’t wrap the readerSchema with ACID fields, if it already is
ORC-569 The first index entry may have empty positions

ORC 1.4.4 Released

release

14 May 2018 omalley

The ORC team is excited to announce the release of ORC v1.4.4.

Released: 14 May 2018
Source code: orc-1.4.4.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.4.4
Maven Central: ORC 1.4.4
SHA 256: 9df0f59ba4046d2a…
Fixed issues: ORC-1.4.4

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

Gang Wu and Xiening Dai added as committers

team

27 Mar 2018 omalley

The ORC PMC is happy to add Gang Wu and Xiening Dai as ORC committers for their work on the C++ ORC writer.

Thank you for your work on ORC, Gang and Xiening!

ORC 1.4.3 Released

release

09 Feb 2018 omalley

The ORC team is excited to announce the release of ORC v1.4.3.

Released: 9 February 2018
Source code: orc-1.4.3.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.4.3
Maven Central: ORC 1.4.3
SHA 256: 0310d6ed20d95b7c…
Fixed issues: ORC-1.4.3

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.

ORC 1.4.2 Released

release

23 Jan 2018 omalley

The ORC team is excited to announce the release of ORC v1.4.2.

Released: 23 January 2018
Source code: orc-1.4.2.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.4.2
Maven Central: ORC 1.4.2
SHA 256: 4c32e30a2b93953c…
Fixed issues: ORC-1.4.2

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.4.1 Released

release

16 Oct 2017 prasanthj

The ORC team is excited to announce the release of ORC v1.4.1.

Released: 16 October 2017
Source code: orc-1.4.1.tar.gz
GPG Signature signed by Prasanth Jayachandran (65C468A3)
Git tag: rel/release-1.4.1
Maven Central: ORC 1.4.1
SHA 256: bf9f107c61ecd6a9…
Fixed issues: ORC-1.4.1

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.3.4 Released

release

16 Oct 2017 prasanthj

The ORC team is excited to announce the release of ORC v1.3.4.

Released: 16 October 2017
Source code: orc-1.3.4.tar.gz
GPG Signature signed by Prasanth Jayachandran (65C468A3)
Git tag: rel/release-1.3.4
Maven Central: ORC 1.3.4
SHA 256: 55269430aea7b825…
Fixed issues: ORC-1.3.4

The new features of ORC 1.3:

ORC-58 Split C++ Reader into Reader and RowReader
ORC-120 Add backwards compatibility mode for schema evolution.
ORC-124 Fast decimal improvements
ORC-128 Add ability to get statistics from writer

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC adds Eugene and Deepak to PMC

team

06 Sep 2017 omalley

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Eugene Koifman and Deepak Majeti have joined the PMC. Eugene’s work on ACID has been critical, and Deepak has been doing great work on the C++ code base.

Please join me in welcoming Eugene and Deepak to the ORC PMC!

Deepak Majeti added as committer

team

16 May 2017 omalley

The ORC PMC is happy to add Deepak Majeti as an ORC committer for the work on the C++ ORC reader, including both contributions and reviews of others’ patches. Thank you for your work on ORC, Deepak!

ORC 1.4.0 Released

release

08 May 2017 omalley

The ORC team is excited to announce the release of ORC v1.4.0.

Released: 8 May 2017
Source code: orc-1.4.0.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.4.0
Maven Central: ORC 1.4.0
SHA 256: 0f96b2096dd053b6…
Fixed issues: ORC-1.4.0

The new features of ORC 1.4:

ORC-72 Add benchmark code for file formats.
ORC-87 Fix timestamp statistics in C++.
ORC-150 Add tool to convert from JSON.
ORC-151 Reduce the size of tools.jar.
ORC-174 Create a nohive variant of the jars.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.3.3 Released

release

21 Feb 2017 omalley

The ORC team is excited to announce the release of ORC v1.3.3.

Released: 21 February 2017
Source code: orc-1.3.3.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.3.3
Maven Central: ORC 1.3.3
SHA 256: 48cf9f47ab13f4ba…
Fixed issues: ORC-1.3.3

The new features of ORC 1.3:

ORC-58 Split C++ Reader into Reader and RowReader
ORC-120 Add backwards compatibility mode for schema evolution.
ORC-124 Fast decimal improvements
ORC-128 Add ability to get statistics from writer

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.3.2 Released

release

13 Feb 2017 omalley

The ORC team is excited to announce the release of ORC v1.3.2.

Released: 13 February 2017
Source code: orc-1.3.2.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.3.2
Maven Central: ORC 1.3.2
SHA 256: 929b70f63e2caf3e…
Fixed issues: ORC-1.3.2

The new features of ORC 1.3:

ORC-58 Split C++ Reader into Reader and RowReader
ORC-120 Add backwards compatibility mode for schema evolution.
ORC-124 Fast decimal improvements
ORC-128 Add ability to get statistics from writer

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.3.1 Released

release

03 Feb 2017 omalley

The ORC team is excited to announce the release of ORC v1.3.1.

Released: 3 February 2017
Source code: orc-1.3.1.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.3.1
Maven Central: ORC 1.3.1
SHA 256: d16c55f20f9fe217…
Fixed issues: ORC-1.3.1

The new features of ORC 1.3:

ORC-58 Split C++ Reader into Reader and RowReader
ORC-120 Add backwards compatibility mode for schema evolution.
ORC-124 Fast decimal improvements
ORC-128 Add ability to get statistics from writer

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.3.0 Released

release

23 Jan 2017 omalley

The ORC team is excited to announce the release of ORC v1.3.0.

Released: 23 January 2017
Source code: orc-1.3.0.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.3.0
Maven Central: ORC 1.3.0
SHA 256: d19a5b5cc1df5797…
Fixed issues: ORC-1.3.0

The new features of ORC 1.3:

ORC-58 Split C++ Reader into Reader and RowReader
ORC-120 Add backwards compatibility mode for schema evolution.
ORC-124 Fast decimal improvements
ORC-128 Add ability to get statistics from writer

Known issues:

ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC adds Gopal Vijayaraghavan to PMC

team

04 Jan 2017 omalley

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Gopal Vijayaraghavan has joined the PMC. Gopal has done an amazing job at speeding up ORC in many ways.

Please join me in welcoming Gopal to the ORC PMC!

Congratulations, Gopal!

ORC adds new committers

team

15 Dec 2016 omalley

As part of the removal of the ORC code base from Hive, the ORC PMC has offered to make any existing Hive committers into ORC committers. The new ORC committers coming from Hive are:

Aihua Xu
Ashutosh Chauhan
Carl Steinbach
Chaoyu Tang
Chinna Rao Lalam
Daniel Dai
Eugene Koifman
Ferdinand Xu
Jason Dere
Jesus Camacho Rodriguez
Jimmy Xiang
Lars Francke
Matthew McCline
Mithun Radhakrishnan
Naveen Gangam
Pengcheng Xiong
Rajesh Balamohan
Rui Li
Sergio Pena
Siddharth Seth
Vaibhav Gumashta
Wei Zheng
Yongzhi Chen

ORC 1.2.3 Released

release

12 Dec 2016 omalley

The ORC team is excited to announce the release of ORC v1.2.3. This release fixes some bugs in the Java schema evolution code.

Released: 12 December 2016
Source code: orc-1.2.3.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.2.3
Maven Central: ORC 1.2.3
SHA 256: a86a335052553bc5…
Fixed issues: ORC-1.2.3

The new features of ORC 1.2:

ORC-54 Evolve schemas based on field name rather than index
ORC-84 Create a separate java tool module.
ORC-77 and ORC-81 Implement LZO and LZ4 compression codecs.
ORC-92 Add support for nested column id selection in C++
ORC-69 Add batch option support in orc-scan tools.

Important fixes:

HIVE-14214 ORC schema evolution and predicate push down do not work together.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.2.2 Released

release

01 Dec 2016 omalley

The ORC team is excited to announce the release of ORC v1.2.2.

Released: 1 December 2016
Source code: orc-1.2.2.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.2.2
Maven Central: ORC 1.2.2
SHA 256: 6aa87390f0f03c43…
Fixed issues: ORC-1.2.2

The new features of ORC 1.2:

ORC-54 Evolve schemas based on field name rather than index
ORC-84 Create a separate java tool module.
ORC-77 and ORC-81 Implement LZO and LZ4 compression codecs.
ORC-92 Add support for nested column id selection in C++
ORC-69 Add batch option support in orc-scan tools.

Important fixes:

HIVE-14214 ORC schema evolution and predicate push down do not work together.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.2.1 Released

release

05 Oct 2016 omalley

The ORC team is excited to announce the release of ORC v1.2.1.

Released: 5 October 2016
Source code: orc-1.2.1.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.2.1
Maven Central: ORC 1.2.1
SHA 256: 793bcc0419574fba…
Fixed issues: ORC-1.2.1

The new features of ORC 1.2:

ORC-54 Evolve schemas based on field name rather than index
ORC-84 Create a separate java tool module.
ORC-77 and ORC-81 Implement LZO and LZ4 compression codecs.
ORC-92 Add support for nested column id selection in C++
ORC-69 Add batch option support in orc-scan tools.

Important fixes:

HIVE-14214 ORC schema evolution and predicate push down do not work together.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.2.0 Released

release

25 Aug 2016 omalley

The ORC team is excited to announce the release of ORC v1.2.0.

Released: 25 August 2016
Source code: orc-1.2.0.tar.gz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.2.0
Maven Central: ORC 1.2.0
SHA 256: 5c394c7ed3a31d20…
Fixed issues: ORC-1.2.0

The new features of ORC 1.2:

ORC-54 Evolve schemas based on field name rather than index
ORC-84 Create a separate java tool module.
ORC-77 and ORC-81 Implement LZO and LZ4 compression codecs.
ORC-92 Add support for nested column id selection in C++
ORC-69 Add batch option support in orc-scan tools.

Important fixes:

HIVE-14214 ORC schema evolution and predicate push down do not work together.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-101 Bloom filters for string and decimal use inconsistent encoding
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.1.2 Released

release

08 Jul 2016 omalley

The ORC team is excited to announce the release of ORC v1.1.2. This release contains the Java reader and writer and the native C++ ORC reader and tools.

Released: 8 July 2016
Source code: orc-1.1.2.tgz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.1.2
Maven Central: ORC 1.1.2
SHA 256: 5d14df7d48126dd8…
Fixed issues: ORC-1.1.2

The major new features in ORC 1.1 are:

ORC-1 Copy the Java ORC code from Hive.
ORC-10 Fix the C++ reader to correctly read timestamps from timezones with different daylight savings rules.
ORC-52 Add mapred and mapreduce connectors.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
HIVE-14214 Schema evolution and predicate pushdown don’t work together.
ORC-101 Bloom filters for string and decimal use inconsistent encoding
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

File format benchmark

talk

28 Jun 2016 omalley

I gave a talk at Hadoop Summit San Jose 2016 about a file format benchmark that I’ve contributed as ORC-72. The benchmark focuses on real data sets that are publicly available. The data sets represent a wide variety of use cases:

NYC Taxi Data - very dense data with mostly numeric types
GitHub Archives - very sparse data with a lot of complex structure
Sales - a real production schema from a sales table with a synthetic generator

The benchmarks look at a set of three very common use cases:

Full table scan - read all columns and rows
Column projection - read some columns, but all of the rows
Column projection and predicate push down - read some columns and some rows
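
As a rough illustration of how the three use cases differ, here is a toy sketch over an in-memory column table. It does not use the ORC API or the ORC-72 benchmark code; the `scan` helper, the `trips` table, and its column names and values are all made up for this example. A real columnar reader would typically use stored statistics to skip data rather than test every row as this sketch does.

```python
from typing import Callable, Dict, List, Optional

# Toy column-oriented table: column name -> list of values (hypothetical data).
Table = Dict[str, list]


def scan(table: Table,
         columns: Optional[List[str]] = None,
         predicate: Optional[Callable[[dict], bool]] = None) -> List[dict]:
    """Return rows as dicts, optionally limited to some columns and some rows."""
    names = list(table) if columns is None else list(columns)
    row_count = len(next(iter(table.values()), []))
    result = []
    for i in range(row_count):
        if predicate is not None:
            # The predicate sees the whole row, even if only some columns are returned.
            full_row = {name: values[i] for name, values in table.items()}
            if not predicate(full_row):
                continue
        result.append({name: table[name][i] for name in names})
    return result


trips = {
    "passengers": [1, 2, 4],
    "distance": [0.5, 3.2, 7.1],
    "fare": [4.0, 12.5, 25.0],
}

# Full table scan: all columns, all rows.
full_scan = scan(trips)

# Column projection: one column, all rows.
projection = scan(trips, columns=["fare"])

# Column projection and predicate push down: one column, only matching rows.
filtered = scan(trips, columns=["fare"], predicate=lambda row: row["distance"] > 1.0)
```

In this sketch the full scan returns three rows with three fields each, the projection returns three rows with only `fare`, and the filtered scan returns only the two rows whose `distance` exceeds 1.0.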

You can see the slides here:

File Format Benchmarks: Avro, JSON, ORC, & Parquet

ORC 1.1.1 Released

release

13 Jun 2016 omalley

The ORC team is excited to announce the release of ORC v1.1.1. This release contains the Java reader and writer and the native C++ ORC reader and tools.

Released: 13 June 2016
Source code: orc-1.1.1.tgz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.1.1
Maven Central: ORC 1.1.1
SHA 256: 19292a1848672c9c…
Fixed issues: ORC-1.1.1

The major new features in ORC 1.1 are:

ORC-1 Copy the Java ORC code from Hive.
ORC-10 Fix the C++ reader to correctly read timestamps from timezones with different daylight savings rules.
ORC-52 Add mapred and mapreduce connectors.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
HIVE-14214 Schema evolution and predicate pushdown don’t work together.
ORC-101 Bloom filters for string and decimal use inconsistent encoding
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.1.0 Released

release

10 Jun 2016 omalley

The ORC team is excited to announce the release of ORC v1.1.0. This release contains the Java reader and writer and the native C++ ORC reader and tools.

Release Artifacts:

Released: 10 June 2016
Source code: orc-1.1.0.tgz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.1.0
Maven Central: ORC 1.1.0
SHA 256: 8beea2be064baf37…
Fixed issues: ORC-1.1.0

The major new features in ORC 1.1 are:

ORC-1 Copy the Java ORC code from Hive.
ORC-10 Fix the C++ reader to correctly read timestamps from timezones with different daylight savings rules.
ORC-52 Add mapred and mapreduce connectors.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
HIVE-14214 Schema evolution and predicate pushdown don’t work together.
ORC-101 Bloom filters for string and decimal use inconsistent encoding
ORC-135 Predicate push down is incorrect on timestamps when moved between timezones
ORC-285 Empty vector batches of floats or doubles cause EOFException

ORC 1.0.0 Released

release

25 Jan 2016 omalley

The ORC team is excited to announce the release of ORC v1.0.0. This release contains the native C++ ORC reader and some tools.

Released: 25 January 2016
Source code: orc-1.0.0.tgz
GPG Signature signed by Owen O’Malley (3D0C92B9)
Git tag: rel/release-1.0.0
Maven Central: ORC 1.0.0
SHA 256: 8ad5111f0ca3b72f…
Fixed issues: ORC-1.0.0

The major features:

Portable pure C++ ORC reader
The C++ reader is known to work on:
- CentOS and RHEL 5, 6, and 7
- Debian 6 and 7
- Ubuntu 12 and 14
- Mac OS 10.10 and 10.11
A file-contents command that prints the contents of the file as JSON records.
A file-metadata command that prints the metadata of the file.
Docker files for building and testing on various Linux distributions.
Memory estimation for the reader.

Known issues:

CVE-2018-8015 ORC files with malformed types cause stack overflow.
ORC-10 When moving ORC files between timezones, different daylight savings rules will cause timestamps to shift in the C++ reader.

ORC adds Aliaksei Sandryhaila to PMC

team

19 Nov 2015 omalley

On behalf of the Apache ORC Project Management Committee (PMC), it gives me great pleasure to announce that Aliaksei Sandryhaila has joined the Apache ORC PMC. He has done a lot of good work on ORC and I’m looking forward to more.

Please join me in welcoming Aliaksei to the ORC PMC!

Congratulations Aliaksei!

ORC adopts new logo

project

26 Jun 2015 omalley

The ORC project has adopted a new logo. We hope you like it.

Other great options included a big white hand on a black shield. :)

ORC adds 7 committers

team

11 May 2015 omalley

The ORC project management committee today added seven new committers for their work on ORC. Welcome all!

Gunther Hagleitner
Aliaksei Sandryhaila
Sergey Shelukhin
Gopal Vijayaraghavan
Stephen Walkauskas
Kevin Wilfong
Xuefu Zhang

ORC becomes an Apache Top Level Project

project

22 Apr 2015 omalley

Today Apache ORC became a top level project at the Apache Software Foundation. This is a major step forward for the project and reflects its momentum.

Back in January 2013, we created ORC files as part of the initiative to massively speed up Apache Hive and improve the storage efficiency of data stored in Apache Hadoop. We added it as a feature of Hive for two reasons:

To ensure that it would be well integrated with Hive
To ensure that storing data in ORC format would be as simple as adding “stored as ORC” to your table definition.

In the last two years, many of the features that we’ve added to Hive, such as vectorization, ACID, predicate push down and LLAP, support ORC first, and follow up with other storage formats later.

The growing use and acceptance of ORC has encouraged additional Hadoop execution engines, such as Apache Pig, Map-Reduce, Cascading, and Apache Spark, to support reading and writing ORC. However, there are concerns that depending on the large Hive jar that contains ORC pulls in many of the other projects that Hive depends on. To better support these non-Hive users, we decided to split off from Hive and become a separate project. This will allow us not only to support Hive, but also to provide a much more streamlined jar, documentation, and help for users outside of Hive.

Although Hadoop and its ecosystem are largely written in Java, there are a lot of applications in other languages that would like to natively access ORC files in HDFS. Hortonworks, HP, and Microsoft are developing a pure C++ ORC reader and writer that enables C++ applications to read and write ORC files efficiently without Java. That code will also be moved into Apache ORC and released together with the Java implementation.