The ORC team is excited to announce the release of ORC v2.3.0.

New features:

  • ORC-2119: Support Java 25
  • ORC-2075: Support new Lz4Codec based on lz4-java
  • ORC-2083: Support XerialSnappyCodec
  • ORC-1986: Trigger flush stripe for large input rows
  • ORC-2000: [C++] Add support to prefetch small stripes
  • ORC-2002: [C++] Improve stripe prefetch
  • ORC-1969: [C++] Support async I/O prefetch of next stripe
  • ORC-2013: [C++] Bump CMake minimum requirement to 3.25 to leverage FetchContent

The improvements:

  • ORC-1994: [C++] Improve CMake by extracting OrcSanitizers.cmake
  • ORC-2008: [C++] Simplify CMake flags and compile options
  • ORC-2009: Remove unused code for CMake 3.6 and older
  • ORC-2014: Rename variables and configurations for periodic stripe size and dictionary size checks
  • ORC-2021: Fallback to UTC when /etc/localtime does not exist
  • ORC-2022: [C++] Add support to use dictionary for IN expression
  • ORC-2029: Support Float fast read by memcpy in DoubleColumnReader
  • ORC-2031: Document orc.dictionary.max.size.bytes and orc.stripe.size.check.ratio
  • ORC-2036: Optimize SortedStringDictionary performance
  • ORC-2038: Improve error message in TypeDescription.withPrecision()
  • ORC-2048: Use Java InputStream.skipNBytes instead of IOUtils.skipFully
  • ORC-2049: Move MAX_ARRAY_SIZE to RecordReaderUtils from IOUtils
  • ORC-2052: Remove unused IOUtils class
  • ORC-2053: Use Java Set.of instead of Collections.emptySet
  • ORC-2055: Use Java ArrayList constructors instead of Lists.newArrayList
  • ORC-2077: Introduce NullOptions class for CompressionCodec
  • ORC-2081: Support ORC LZ4 in bench module
  • ORC-2082: Support Parquet LZ4 in bench module
  • ORC-2085: Set strategy.max-parallel to 20 for all GitHub Action jobs
  • ORC-2089: Disable Maven Parallel PUT
  • ORC-2090: Add a new label rule for MESON build
  • ORC-2091: Use HTTPS instead of HTTP
  • ORC-2111: Ensure Annotation Processing in Java compilation for Java 23+
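
Several of the improvements above (ORC-2048, ORC-2053, ORC-2055) replace third-party helpers with plain JDK calls. The sketch below only illustrates what those JDK calls look like in isolation; it is not taken from the ORC patches, and the class and method names (JdkReplacementsSketch, canSkipFully, emptyNames, copyOf) are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the ORC code base.
public class JdkReplacementsSketch {

  // InputStream.skipNBytes (Java 12+) skips exactly n bytes or throws
  // EOFException, which is the contract a "skip fully" helper provides.
  static boolean canSkipFully(byte[] data, long n) {
    try (InputStream in = new ByteArrayInputStream(data)) {
      in.skipNBytes(n);
      return true;
    } catch (EOFException e) {
      return false; // the stream ended before n bytes were skipped
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Set.of() (Java 9+) returns an immutable empty set.
  static Set<String> emptyNames() {
    return Set.of();
  }

  // The ArrayList copy constructor yields a mutable copy without Guava.
  static List<Integer> copyOf(List<Integer> source) {
    return new ArrayList<>(source);
  }

  public static void main(String[] args) {
    System.out.println(canSkipFully(new byte[] {10, 20, 30, 40}, 2));
    System.out.println(canSkipFully(new byte[] {10, 20}, 5));
    System.out.println(emptyNames().isEmpty());
    System.out.println(copyOf(List.of(1, 2, 3)));
  }
}
```

Note that Set.of() rejects null elements and any attempt to modify the set, so it is a drop-in replacement only where the empty set is treated as read-only.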

The bug fixes:

  • ORC-1921: Upgrade Hadoop to 3.4.2
  • ORC-1966: ZSTD compress/decompress needs to handle errors properly
  • ORC-1967: C++ compilation issue with VS2022
  • ORC-1972: Upgrade ORC Format to 1.1.1
  • ORC-1973: Use int64_t instead of google::protobuf::int64 for Protobuf v22+
  • ORC-1974: Use google::protobuf::TextFormat instead of DebugString for Protobuf v30+
  • ORC-1977: Add Deprecated annotations for all deprecated APIs
  • ORC-2007: Upgrade gson to 2.13.2
  • ORC-2010: Use IANA Identifier America/Los_Angeles instead of US/Pacific in Java
  • ORC-2011: Fix Timezone to support legacy US TimeZone identifiers
  • ORC-2024: Upgrade zstd-jni to 1.5.7-5
  • ORC-2027: Undefined behavior in DoubleColumnReader::readFloat()
  • ORC-2028: evictEntriesBefore deletes buffers still used by unfinished coroutines, causing a panic
  • ORC-2032: Upgrade zstd-jni to 1.5.7-6
  • ORC-2042: Upgrade maven version to 3.9.12
  • ORC-2051: Fix Meson build to use ORC Format 1.1.1
  • ORC-2054: Fix Meson build version string to 2.3.0-SNAPSHOT
  • ORC-2069: Fix convert tool failing to read CSV
  • ORC-2078: Fix TestConverter to respect test.tmp.dir
  • ORC-2087: Upgrade zstd-jni to 1.5.7-7
  • ORC-2103: Update CMake requirements to 3.25+ consistently
  • ORC-2105: Fix orc-format.wrap to use ORC Format 1.1.1

The test changes:

  • ORC-2112: Use Java 25 for Ubuntu 26.04 docker test
  • ORC-1968: Upgrade commons-cli to 1.10.0
  • ORC-1982: Upgrade brotli4j to 1.20.0
  • ORC-1992: Bump opencsv to 5.12.0
  • ORC-2040: Upgrade commons-cli to 1.11.0
  • ORC-2068: Upgrade Hadoop to 3.4.3
  • ORC-2084: Upgrade mockito to 5.21.0
  • ORC-1924: Add Windows 2025 GitHub Action job
  • ORC-1964: [CI] Fix CI ubsan-test with GNU
  • ORC-1965: Ban org.apache.commons.lang package
  • ORC-1970: [CI] Update cpp-linter-action to f91c446a32ae3eb9f98fef8c9ed4c7cb613a4f8a
  • ORC-1979: Upgrade commons-csv to 1.14.1
  • ORC-1980: Upgrade junit to 5.13.4
  • ORC-1983: Upgrade gtest to 1.17.0
  • ORC-1984: Add debian13 to docker tests, docs, and GitHub Action
  • ORC-1987: Upgrade Spark to 4.0.1 in bench module
  • ORC-1988: Upgrade Parquet to 1.16.0 in bench module
  • ORC-1989: Upgrade Hive to 4.1.0 in bench module
  • ORC-1990: Upgrade bcpkix-jdk18on to 1.81
  • ORC-1991: Upgrade snappy-java to 1.1.10.8 in bench module
  • ORC-1993: Upgrade spotless-maven-plugin to 2.46.1
  • ORC-1995: Add MacOS 26 to GitHub Action CI and docs
  • ORC-1996: Remove MacOS 13 from GitHub Action CI and docs
  • ORC-1997: Add a daily build-and-test GitHub Action Job for main branch
  • ORC-1998: Use Java 25 instead of 25-ea
  • ORC-1999: Upgrade Checkstyle to 11.0.1
  • ORC-2003: Upgrade guava to 33.5.0-jre
  • ORC-2004: Upgrade bouncycastle to 1.82
  • ORC-2005: Upgrade spotbugs-maven-plugin to 4.9.6
  • ORC-2006: Upgrade maven-shade-plugin to 3.6.1
  • ORC-2012: Remove US timezone workaround from Debian 13 Docker image
  • ORC-2015: Remove Debian 11 Support
  • ORC-2016: Upgrade CMake to 3.26.0 in amazonlinux:2023
  • ORC-2017: Upgrade checkstyle to 11.1.0
  • ORC-2018: Upgrade spotless-maven-plugin to 3.0.0
  • ORC-2019: Upgrade commons-lang3 to 3.19.0
  • ORC-2020: Upgrade junit to 6.0.0
  • ORC-2026: Upgrade maven-enforcer-plugin to 3.6.2
  • ORC-2034: Upgrade Checkstyle to 12.1.0
  • ORC-2037: Upgrade Spark to 4.1.0 and Scala to 2.13.17
  • ORC-2039: Upgrade junit to 6.0.1
  • ORC-2045: Upgrade checkstyle to 12.3.0
  • ORC-2050: Add MacOS 26 to meson/macos-cpp-check and use mainly in build
  • ORC-2056: Remove MacOS 14 from GitHub Action CI and docs
  • ORC-2058: Upgrade commons-lang3 to 3.20.0
  • ORC-2059: Upgrade spotless-maven-plugin to 3.1.0
  • ORC-2061: Upgrade byte-buddy to 1.18.4
  • ORC-2062: Upgrade objenesis to 3.5
  • ORC-2063: Upgrade Spark to 4.1.1
  • ORC-2064: Update oraclelinux9 to use dnf instead of yum
  • ORC-2065: Bump parquet to 1.17.0
  • ORC-2066: Upgrade spotbugs-maven-plugin to 4.9.8.2
  • ORC-2067: Upgrade junit to 6.0.2
  • ORC-2070: Add oraclelinux10 to docker tests and GitHub Action
  • ORC-2071: Upgrade spotless-maven-plugin to 3.2.1
  • ORC-2072: Remove OracleLinux 8 Support
  • ORC-2073: Fix JSONArgsRecommended warnings of Dockerfile
  • ORC-2074: Reduce GitHub Action concurrency
  • ORC-2076: Use license-check to check java directory
  • ORC-2079: Add lz4 codec pool test coverage
  • ORC-2086: Upgrade Spark to 4.2.0-preview2 and Netty to 4.2.10.Final
  • ORC-2088: Upgrade maven-dependency-plugin to 3.10.0
  • ORC-2092: Add ubuntu26 to docker tests and GitHub Action
  • ORC-2097: Make actions/* GitHub Actions jobs up-to-date
  • ORC-2098: Exclude .mvn/maven.config for apache-rat-plugin
  • ORC-2099: Remove Ubuntu 24.04 Support
  • ORC-2101: Enable GitHub Action CI in branch-2.3
  • ORC-2104: Update amazonlinux with 2023.10.20260202.2 and use dnf
  • ORC-2110: Enable Java 25 to build and verify all tests

The tasks:

  • ORC-1891: Upgrade to Apache parent pom 34 along with maven plugins
  • ORC-1951: Setting version to 2.3.0-SNAPSHOT
  • ORC-1975: Improve merge_orc_pr.py to accept PR numbers as a CLI argument
  • ORC-1978: Upgrade maven-enforcer-plugin to 3.6.1
  • ORC-1981: Upgrade build-helper-maven-plugin to 3.6.1
  • ORC-1985: Upgrade actions/checkout to v5
  • ORC-2001: Add method descriptions to all public Java interfaces
  • ORC-2023: Upgrade maven-dependency-plugin to 3.9.0
  • ORC-2025: Upgrade extra-enforcer-rules to 1.11.0
  • ORC-2043: Upgrade maven-jar-plugin to 3.5.0
  • ORC-2044: Upgrade maven-assembly-plugin to 3.8.0
  • ORC-2047: Add .vscode to .gitignore
  • ORC-2057: Add Pandas page at Using in Python section
  • ORC-2060: Upgrade bouncycastle to 1.83
  • ORC-2080: Add create_orc_jira.py script
  • ORC-2093: Remove labeler GitHub Action job
  • ORC-2096: Remove doc dependency from build GitHub Actions job