issue: #33103
when try to do stopping balance for stopping query node, balancer will
try to get node list from replica.GetNodes, then check whether node is
stopping, if so, stopping balance will be triggered for this replica.
after the replica refactor, replica.GetNodes only return rwNodes, and
the stopping node maintains in roNodes, so balancer couldn't find
replica which contains stopping node, and stopping balance for replica
won't be triggered, then query node will stuck forever due to
segment/channel doesn't move out.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #31468
1. when segment's version in leader view doesn't match segment's version
in dist, should update leader view
2. after call loadDeltalog, should update segment's load version with
latest ts
3. change leader task's priority from high to low, to avoid leader task
replace segment task and balance task
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30186
during channel balance, after new delegator loaded, instead of syncing
l0 segment's location to new delegator, we should load l0 segment on new
delegator, and release the old l0 segment, then start to release old
delegator.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30890
when leader checker find that leader view has an older load version of
segment, it will try to correct leader view. but the sync action doesn't
specify the latest load version. so the update operation will failed.
This PR fix leader checker can't update segment's load version and
keeping generate same task to scheduler.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #30150
For leader view distribution with offline nodes, a release task can
never be sent to querynode due to targetNode online check logic. Even
the request is dispatched, normal release task does not have "force"
flag when calling `delegator.ReleaseSegment`.
This PR adds a new type of querycoord task: LeaderTask, the
responsibility of which is to rectify leader view distribtion.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #30150
This PR fix three problems:
1. leader checker use wrong node id when generate release task, which
cause the release task finished immediately
2. the release request generated by leader_checker doesn't set the
`force` flag, the operation to clean leader view on delegator will fail.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #29453
sync distribution by rpc will also call loadSegment/releaseSegment,
which may cause all kinds of concurrent case on same segment, such as
concurrent load and release on one segment.
This PR add leader_checker which generate load/release task to correct
the leader view, instead of calling sync distribution by rpc
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>