HDDS-14225. [DO NOT MERGE] Upgrade RocksDB from 7.7.3 to 10.10.1#9813

Draft
smengcl wants to merge 12 commits into apache:master from smengcl:rocksdb-v10-upgrade

smengcl commented Feb 24, 2026

Generated-by: GPT-5.3-Codex

What changes were proposed in this pull request?

Bump RocksDB version from 7.7.3 to 10.10.1 while maintaining compatibility and not breaking existing behavior.

  1. Since RocksDB 9, BlockBasedTableOptions.format_version=6 is the default. Files written in format_version 6 cannot be read by RocksDB < 8.6.0. This PR explicitly sets the default to version 5 to be compatible with 7.7.3 in the case where Ozone gets downgraded before it gets finalized after an upgrade. ref: https://github.com/facebook/rocksdb/releases/tag/v9.0.0
  2. DO NOT MERGE until RocksDB JNI v10.9.0 or higher is available. Versions before v10.9.0 have a potential forward compatibility bug: https://github.com/facebook/rocksdb/releases/tag/v10.9.1

     From the release notes: Fix a bug where compaction with range deletion can persist kTypeMaxValid in MANIFEST as file metadata. kTypeMaxValid is not supposed to be persisted and can change as new value types are introduced. This can cause a forward compatibility issue where older versions of RocksDB don't recognize kTypeMaxValid from newer versions. A new placeholder value type kTypeTruncatedRangeDeletionSentinel is also introduced to replace kTypeMaxValid when reading existing SST files' metadata from MANIFEST. This allows us to strengthen some checks to avoid using kTypeMaxValid in the future.

  3. Currently in this PR I am using a rocksdbjni 10.10.1 fat jar that I built myself and pushed to the Maven Central snapshot repository. We need to publish it properly (for example, under the Apache Ozone account) before we can merge this.
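
The reasoning in item 1 can be written down as a small rule: if the oldest RocksDB version that might still read the files is below 8.6.0, stay on format_version 5, otherwise 6 is safe. The sketch below only illustrates that rule. The class and method names are hypothetical and are not part of this PR or of the RocksDB API.

```java
// Hypothetical helper illustrating the format_version choice described above.
final class FormatVersionChoice {

  /** Compares dotted numeric versions such as "7.7.3" and "8.6.0". */
  static int compareVersions(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    for (int i = 0; i < Math.max(x.length, y.length); i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  /**
   * Picks the block-based table format_version given the oldest RocksDB
   * version that may have to read the files (for example after a downgrade).
   * Per the PR description, format_version 6 needs RocksDB 8.6.0 or newer.
   */
  static int formatVersionFor(String oldestReaderVersion) {
    return compareVersions(oldestReaderVersion, "8.6.0") < 0 ? 5 : 6;
  }

  public static void main(String[] args) {
    // Ozone may be downgraded to a build that ships RocksDB 7.7.3.
    System.out.println(formatVersionFor("7.7.3"));
  }
}
```

In this PR the oldest possible reader is 7.7.3, so the rule yields 5, which matches the explicit default set here.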

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14225

How was this patch tested?

  • Most RocksDB related tests are fixed.
  • Pending full CI.

smengcl changed the title from "HDDS-14225. Upgrade RocksDB from 7.7.3 to 10.4.2" to "HDDS-14225. [DO NOT MERGE] Upgrade RocksDB from 7.7.3 to 10.4.2" on Feb 24, 2026

adoroszlai left a comment

Thanks @smengcl for working on this.

We should detect OS at runtime, not build time. Single build (on any OS) should create binaries for all supported operating systems.

  1. dependencyManagement in root POM should include an entry for rocksdbjni artifact with each supported classifier, as well as the non-classified artifact:

      <dependency>
        <groupId>org.rocksdb</groupId>
        <artifactId>rocksdbjni</artifactId>
        <version>${rocksdb.version}</version>
      </dependency>
      <dependency>
        <groupId>org.rocksdb</groupId>
        <artifactId>rocksdbjni</artifactId>
        <version>${rocksdb.version}</version>
        <classifier>linux64</classifier>
      </dependency>
      <dependency>
        <groupId>org.rocksdb</groupId>
        <artifactId>rocksdbjni</artifactId>
        <version>${rocksdb.version}</version>
        <classifier>osx</classifier>
      </dependency>
      ...
  2. hdds-rocks-native should unpack all of these

  3. other modules should continue depending on platform-independent rocksdbjni (no classifier)

Thus changes in most pom.xml files are not needed, nor is rocksdbjni.classifier.
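
As a rough illustration of detecting the OS at runtime, the sketch below maps the JVM's os.name and os.arch system properties to one of the classifiers listed above. The class name is hypothetical, only the linux64 and osx classifiers from the snippet are covered, and the assumption that linux64 means x86_64 Linux is mine rather than something this PR states.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; not part of Ozone or rocksdbjni.
final class NativeClassifier {

  /**
   * Maps os.name / os.arch style values to a rocksdbjni classifier.
   * Returns empty for platforms this sketch does not cover.
   */
  static Optional<String> classifierFor(String osName, String osArch) {
    String os = osName.toLowerCase(Locale.ROOT);
    String arch = osArch.toLowerCase(Locale.ROOT);
    if (os.contains("mac") || os.contains("darwin")) {
      return Optional.of("osx");
    }
    // Assumption: "linux64" corresponds to 64-bit x86 Linux.
    if (os.contains("linux") && (arch.equals("amd64") || arch.equals("x86_64"))) {
      return Optional.of("linux64");
    }
    return Optional.empty();
  }

  /** Resolves the classifier for the JVM this code is running on. */
  static Optional<String> current() {
    return classifierFor(System.getProperty("os.name"), System.getProperty("os.arch"));
  }

  public static void main(String[] args) {
    System.out.println(current().orElse("unsupported platform"));
  }
}
```

Because the decision happens when the JVM starts, a single build that bundles every classifier can choose the matching native library on any supported host.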

This is a workaround for RocksDB 10.4.2 thin-jar packaging. Ensure classifier JNI artifacts are present at runtime/tests and keep dependency analysis stable.

Not needed if mvnrepo rocksdbjni provides a fat jar containing native libs for all supported platforms.
…or change

RocksDB 9.10.0 changed the semantics of DB::KeyMayExist to follow its function comment. Ozone snapshot code paths were treating keyMayExist/getIfExist misses as definitive, which could misclassify existing keys in snapshot-related flows under RocksDB >= 9.10.0 and cause test failures.

Treat keyMayExist/getIfExist as hints and fall back to point reads before deciding not-found:
- RDBTable: verify with get()/get(ByteBuffer) on inconclusive keyMayExist
- TypedTable: in the codec-buffer isExist path, fall back to a full get before returning false
- KeyManagerImpl and ReclaimableRenameEntryFilter: use getSkipCache fallback for snapshot rename lookups

RocksDB changelog for reference:
https://github.com/facebook/rocksdb/releases/tag/v9.10.0

Behavior Changes
DB::KeyMayExist() now follows its function comment, which means value parameter can be null, and it will be set only if value_found is passed in.
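
The hint-then-verify pattern described in this commit message could look roughly like the sketch below. HintedStore and HintOnlyStore are hypothetical stand-ins written for this example and are not the RDBTable, TypedTable, or RocksDB APIs. The sketch does a point read whenever the hint did not supply a value, including when the hint reports a miss, which mirrors the "before deciding not-found" wording above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a table; not the Ozone or RocksDB API. */
interface HintedStore {
  /** Returns a hint; may leave valueOut[0] null even when the key exists. */
  boolean keyMayExist(String key, String[] valueOut);

  /** Authoritative point read; returns null when the key is absent. */
  String get(String key);
}

final class HintThenVerify {

  /** Uses the hint only as a fast path and confirms with a point read otherwise. */
  static String getIfExist(HintedStore store, String key) {
    String[] valueOut = new String[1];
    boolean mayExist = store.keyMayExist(key, valueOut);
    if (mayExist && valueOut[0] != null) {
      return valueOut[0]; // the hint also delivered the value
    }
    return store.get(key); // inconclusive or miss: verify before deciding not-found
  }

  /** In-memory store whose hint never supplies the value, to exercise the fallback. */
  static final class HintOnlyStore implements HintedStore {
    private final Map<String, String> data = new HashMap<>();
    int pointReads = 0;

    void put(String key, String value) {
      data.put(key, value);
    }

    @Override
    public boolean keyMayExist(String key, String[] valueOut) {
      return data.containsKey(key); // never fills valueOut
    }

    @Override
    public String get(String key) {
      pointReads++;
      return data.get(key);
    }
  }
}
```

With HintOnlyStore, every lookup ends in a point read, so an existing key is still found even though the hint never returns its value.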
RocksDB 9.10+ may consume ByteBuffer state during keyMayExist; treat it as a hint and always use duplicated key buffers for keyMayExist/get fallback paths in RDBTable/RocksDatabase.

Add a regression unit test for ByteBuffer fallback behavior.

Also make replicas-test.sh restore the whole container.db directory (not just one file) to avoid stale RocksDB WAL/MANIFEST artifacts when recovering from backup.
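
The duplicated-key-buffer rule from this commit message can be illustrated with plain java.nio. In the sketch below, consume is a hypothetical stand-in for a native call that advances the buffer position while reading the key; it is not a RocksDB method. Passing key.duplicate() to each call leaves the caller's buffer untouched, so the fallback read still sees the whole key.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class DuplicateKeyBufferDemo {

  /** Stand-in for a native call that reads the key and advances its position. */
  static int consume(ByteBuffer key) {
    int bytesSeen = key.remaining();
    key.position(key.limit());
    return bytesSeen;
  }

  /**
   * Probes and then reads with duplicated buffers.
   * Returns {bytes seen by the probe, bytes seen by the read, bytes left in the original}.
   */
  static int[] probeThenGet(ByteBuffer key) {
    int probed = consume(key.duplicate()); // duplicate shares content, not position
    int read = consume(key.duplicate());
    return new int[] {probed, read, key.remaining()};
  }

  public static void main(String[] args) {
    ByteBuffer key = ByteBuffer.wrap("abc".getBytes(StandardCharsets.UTF_8));
    int[] result = probeThenGet(key);
    System.out.println(result[0] + " " + result[1] + " " + result[2]);
  }
}
```

Without the duplicates, the first call would leave the original buffer with nothing remaining, and the fallback read would see an empty key.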
smengcl force-pushed the rocksdb-v10-upgrade branch from 431f2b2 to c9b83ef on March 10, 2026 05:46
smengcl force-pushed the rocksdb-v10-upgrade branch from c9b83ef to 7270ac0 on March 10, 2026 18:22
smengcl changed the title from "HDDS-14225. [DO NOT MERGE] Upgrade RocksDB from 7.7.3 to 10.4.2" to "HDDS-14225. [DO NOT MERGE] Upgrade RocksDB from 7.7.3 to 10.10.1" on Mar 10, 2026