Description
Bug Type
others (please edit later)
Before submitting
- I have confirmed and searched that there are no similar problems in the existing issues and FAQ
Environment
- Server Version: 1.7.0
- Backend: hstore (3 PD + 3 Store + 3 Server)
- OS: macOS Apple M4
- Related PR: fix(docker): migrate single-node compose from host to bridge networking #2952
- Network: Docker bridge mode (static IPs via ipam)
- Docker Desktop: latest
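For context, the bridge-mode setup described above looks roughly like the following compose fragment. This is a sketch only: the network name, subnet, addresses, and image name are illustrative placeholders, not copied from the actual compose file in the related PR.

```yaml
# Illustrative sketch of bridge networking with static IPs via ipam.
# All names, the subnet, and the image are placeholders.
networks:
  pd-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.30.0.0/24

services:
  pd0:
    image: hugegraph/pd   # placeholder image name
    networks:
      pd-net:
        ipv4_address: 172.30.0.10
```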
Summary
In a 3-node PD cluster running in Docker bridge network mode, the cluster
only works correctly when pd0 wins the raft leader election. If pd1 or pd2
becomes leader, store registration fails, partitions are never distributed,
and HugeGraph servers cannot initialize.
Root Cause
In RaftEngine.java, getLeaderGrpcAddress() makes a live bolt RPC call to
discover the leader's gRPC address when the current node is a follower:
```java
return raftRpcClient.getGrpcAddress(
    raftNode.getLeaderId().getEndpoint().toString()
).get().getGrpcAddress(); // NPE in bridge mode, see below
```
This call fails in Docker bridge mode: the TCP connection is established, but the bolt RPC response never arrives properly, so a null response is dereferenced and CompletableFuture.get() surfaces the resulting NullPointerException wrapped in an ExecutionException (see the stack traces under Evidence).
This causes:
- redirectToLeader() fails with an NPE
- Store registration requests that land on follower PDs are never forwarded
- Stores register, but partitions are never distributed (partitionCount:0)
- HugeGraph servers are stuck indefinitely in a DEADLINE_EXCEEDED retry loop
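A defensive variant of getLeaderGrpcAddress() would bound the wait and fall back instead of dereferencing a missing response. The sketch below is not a proposed patch: rpcGetGrpcAddress() is a hypothetical stand-in for the real raftRpcClient call, and the fallback behavior is an assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class LeaderAddressSketch {

    // Hypothetical stand-in for raftRpcClient.getGrpcAddress(...): in bridge
    // mode the response never arrives, modeled here as a null payload.
    static CompletableFuture<String> rpcGetGrpcAddress(boolean bridgeMode) {
        return CompletableFuture.completedFuture(bridgeMode ? null : "172.30.0.10:8686");
    }

    // Defensive variant: bounded wait, explicit null check, and a fallback
    // instead of an unguarded .get().getGrpcAddress() chain.
    static String getLeaderGrpcAddress(boolean bridgeMode, String fallback) {
        try {
            String addr = rpcGetGrpcAddress(bridgeMode).get(3, TimeUnit.SECONDS);
            return (addr != null) ? addr : fallback;
        } catch (Exception e) {
            return fallback; // timeout, interruption, or RPC failure
        }
    }

    public static void main(String[] args) {
        System.out.println(getLeaderGrpcAddress(false, "unknown")); // 172.30.0.10:8686
        System.out.println(getLeaderGrpcAddress(true, "unknown"));  // unknown
    }
}
```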
Why It Only Affects Bridge Mode
In host network mode, all PD nodes communicate over 127.0.0.1, so the bolt RPC
call succeeds instantly over loopback. In bridge mode, the call traverses
Docker's virtual network and the response is never delivered.
Why It's Nondeterministic
JRaft leader election is timing-based. If pd0 wins, isLeader() returns true
and the broken code path is never reached. If pd1 or pd2 wins, pd0 becomes
a follower and hits the NPE on every redirect attempt.
Reproduction
- Run the 3-node cluster in Docker bridge mode
- Check which PD won the leader election:

```shell
docker exec hg-pd0 grep "becomes leader" /hugegraph-pd/logs/hugegraph-pd-stdout.log
docker exec hg-pd1 grep "becomes leader" /hugegraph-pd/logs/hugegraph-pd-stdout.log
docker exec hg-pd2 grep "becomes leader" /hugegraph-pd/logs/hugegraph-pd-stdout.log
```

- If pd1 or pd2 is leader, check store partitions:

```shell
curl -u store:admin http://localhost:8620/v1/stores | grep partitionCount
```

→ store1 and store2 will show partitionCount:0

- Check pd0 logs for the NPE:

```shell
docker exec hg-pd0 grep "getLeaderGrpcAddress" /hugegraph-pd/logs/hugegraph-pd-stdout.log
```
Evidence
The NPE occurs in two separate call paths:
**Path 1: fires immediately when the node becomes a follower during leader election:**

```
java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.CompletableFuture.reportGet(Unknown Source)
	at java.util.concurrent.CompletableFuture.get(Unknown Source)
	at org.apache.hugegraph.pd.raft.RaftEngine.getLeaderGrpcAddress(RaftEngine.java:242)
	at org.apache.hugegraph.pd.service.PDService.onRaftLeaderChanged(PDService.java:1345)
	at org.apache.hugegraph.pd.raft.RaftStateMachine.lambda$onStartFollowing$1(RaftStateMachine.java:141)
```

**Path 2: fires on every store registration redirect attempt:**

```
java.util.concurrent.ExecutionException: java.lang.NullPointerException
	at java.util.concurrent.CompletableFuture.reportGet(Unknown Source)
	at java.util.concurrent.CompletableFuture.get(Unknown Source)
	at org.apache.hugegraph.pd.raft.RaftEngine.getLeaderGrpcAddress(RaftEngine.java:242)
	at org.apache.hugegraph.pd.service.PDService.redirectToLeader(PDService.java:1275)
```
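Both traces show the same mechanism: the NPE fires inside the asynchronous RPC stage, and CompletableFuture.get() rethrows it wrapped in an ExecutionException, which is why reportGet appears at the top of the trace. A minimal self-contained illustration (independent of the PD codebase):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class WrapDemo {

    // Returns the class name of the failure that get() reports when an
    // async stage dereferences a null response, mirroring the traces above.
    static String causeOfGetFailure() throws InterruptedException {
        CompletableFuture<String> future = CompletableFuture
                .supplyAsync(() -> (String) null)       // RPC "response" is null
                .thenApply(resp -> resp.toUpperCase()); // NPE inside the stage
        try {
            future.get();
            return "no exception";
        } catch (ExecutionException e) {
            // get() wraps the in-stage NPE, exactly as in the stack traces
            return e.getCause().getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(causeOfGetFailure()); // NullPointerException
    }
}
```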
Proof: restarting the non-pd0 leader, thereby forcing pd0 to win the election,
immediately resolved the issue: partitionCount:12 on all 3 stores and all 9
containers healthy. Reproducible 100% of the time.