HDDS-11855. Fix flaky TestContainerBalancerDatanodeNodeLimit#checkIterationResultException #10667

Merged
adoroszlai merged 1 commit into apache:master from chihsuan:HDDS-11855
Jul 5, 2026
@chihsuan chihsuan commented Jul 5, 2026

Contributor

What changes were proposed in this pull request?

checkIterationResultException failed intermittently (12/100 in CI) with actual 0L >= 3L. The test asserts that at least 3 container moves fail, but the number of moves the balancer schedules depends on the cluster layout, which TestableCluster builds from an unseeded random generator. On some layouts, fewer than 3 moves are scheduled, so the test cannot rely on a fixed count of 3.

Fix: Assert what is true for every layout instead of a fixed number. All moves are mocked to fail, so none can be counted as completed (completed == 0); and whenever a move is scheduled, it must be recorded as either a failure or a timeout. The @Flaky marker is removed.
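
For illustration only, the idea of asserting layout-independent invariants can be sketched in plain Java. This is not the code from the patch: `IterationResult`, `runIteration`, and `checkInvariants` are hypothetical names, the counters are a simplified stand-in for the balancer's metrics, and the seeded `Random` is used here only to make the sketch reproducible.

```java
import java.util.Random;

final class LayoutIndependentAssertions {

  /** Hypothetical stand-in for the balancer's per-iteration counters. */
  static final class IterationResult {
    final long scheduled;
    final long completed;
    final long failed;
    final long timedOut;

    IterationResult(long scheduled, long completed, long failed, long timedOut) {
      this.scheduled = scheduled;
      this.completed = completed;
      this.failed = failed;
      this.timedOut = timedOut;
    }
  }

  /**
   * Simulates one iteration in which every move is mocked to fail.
   * The number of scheduled moves depends on the random "layout",
   * so it may be anywhere from 0 to 5 (assumed range for this sketch).
   */
  static IterationResult runIteration(Random layout) {
    int scheduled = layout.nextInt(6);
    // Each scheduled move ends up as either a timeout or a failure.
    int timedOut = layout.nextInt(scheduled + 1);
    int failed = scheduled - timedOut;
    return new IterationResult(scheduled, 0, failed, timedOut);
  }

  /** Checks only what holds for every layout, never a fixed count. */
  static void checkInvariants(IterationResult r) {
    if (r.completed != 0) {
      throw new AssertionError("no move may be counted as completed");
    }
    if (r.scheduled > 0 && r.failed + r.timedOut == 0) {
      throw new AssertionError("scheduled moves must show up as failed or timed out");
    }
  }

  public static void main(String[] args) {
    // Try many simulated layouts; a fixed "failed >= 3" check would
    // break on any layout that schedules fewer than 3 moves.
    for (long seed = 0; seed < 100; seed++) {
      checkInvariants(runIteration(new Random(seed)));
    }
    System.out.println("invariants held for every simulated layout");
  }
}
```

The second check is deliberately weak (at least one failure or timeout when anything was scheduled); a stricter variant could require `failed + timedOut == scheduled` if the real metrics guarantee that.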

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11855

How was this patch tested?

@adoroszlai adoroszlai changed the title HDDS-11855. Fix intermittent failure in checkIterationResultException HDDS-11855. Fix flaky TestContainerBalancerDatanodeNodeLimit#checkIterationResultException Jul 5, 2026
@adoroszlai adoroszlai added the test label Jul 5, 2026
@adoroszlai adoroszlai merged commit f5a5f0f into apache:master Jul 5, 2026
30 of 32 checks passed
@adoroszlai

Contributor

Thanks @chihsuan for fixing all known issues in TestContainerBalancerDatanodeNodeLimit.
