40 Commits

Author SHA1 Message Date
wei liu
8123bea1ae
enhance: Avoid assigning too many segments/channels to a new querynode (#34096)
issue: #34095

When a new query node comes online, the segment_checker,
channel_checker, and balance_checker simultaneously attempt to allocate
segments to it. If this occurs during the execution of a load task and
the distribution of the new query node hasn't been updated, the query
coordinator may mistakenly view the new query node as empty. As a
result, it assigns segments or channels to it, potentially overloading
the new query node with more segments or channels than expected.

This PR measures the workload of the executing tasks on the target query
node to prevent assigning an excessive number of segments to it.
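
The idea above can be sketched as follows. This is only an illustration, not the code from the PR: `nodeLoad`, `effectiveLoad`, and `pickTarget` are hypothetical names, and the sketch assumes node IDs are positive so that -1 can mean "no node".

```go
package main

import "fmt"

// nodeLoad is a hypothetical, simplified view of one query node.
type nodeLoad struct {
	nodeID          int64
	loadedSegments  int // segments already reported in the node's distribution
	pendingSegments int // segments carried by tasks that are still executing
}

// effectiveLoad counts both reported segments and in-flight segments,
// so a freshly started node with many pending tasks does not look empty.
func (n nodeLoad) effectiveLoad() int {
	return n.loadedSegments + n.pendingSegments
}

// pickTarget returns the ID of the node with the smallest effective load,
// or -1 when the list is empty.
func pickTarget(nodes []nodeLoad) int64 {
	best, bestLoad := int64(-1), 0
	for _, n := range nodes {
		if best == -1 || n.effectiveLoad() < bestLoad {
			best, bestLoad = n.nodeID, n.effectiveLoad()
		}
	}
	return best
}

func main() {
	nodes := []nodeLoad{
		{nodeID: 1, loadedSegments: 4},
		// New node: distribution not updated yet, but 6 segments are in flight.
		{nodeID: 2, loadedSegments: 0, pendingSegments: 6},
	}
	fmt.Println("assign next segment to node", pickTarget(nodes))
}
```

Without the `pendingSegments` term, node 2 would look empty and keep receiving new segments.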

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-27 19:06:05 +08:00
jaime
9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
wei liu
a7f6193bfc
fix: query node may get stuck in the stopping process (#33104)
issue: #33103 
When performing stopping balance for a stopping query node, the balancer
gets the node list from replica.GetNodes and checks whether any node is
stopping; if so, stopping balance is triggered for that replica.

After the replica refactor, replica.GetNodes returns only rwNodes, while
stopping nodes are kept in roNodes. The balancer therefore cannot find
the replica that contains the stopping node, stopping balance is never
triggered for it, and the query node gets stuck forever because its
segments/channels are never moved out.
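
A minimal sketch of the difference, assuming a simplified replica that exposes its rwNodes and roNodes as plain slices. The `replica` type and the helper names are hypothetical and do not mirror the real querycoord types.

```go
package main

import "fmt"

// replica is a hypothetical, simplified replica with read-write and
// read-only node lists.
type replica struct {
	id      int64
	rwNodes []int64
	roNodes []int64
}

// anyStopping reports whether any of the given nodes is marked as stopping.
func anyStopping(ids []int64, stopping map[int64]bool) bool {
	for _, id := range ids {
		if stopping[id] {
			return true
		}
	}
	return false
}

// needsStoppingBalance checks both rwNodes and roNodes, so a stopping node
// that has already been moved to roNodes is still found.
func needsStoppingBalance(r replica, stopping map[int64]bool) bool {
	return anyStopping(r.rwNodes, stopping) || anyStopping(r.roNodes, stopping)
}

func main() {
	r := replica{id: 1, rwNodes: []int64{1, 2}, roNodes: []int64{3}}
	stopping := map[int64]bool{3: true}

	// Looking at rwNodes only misses the stopping node.
	fmt.Println("rwNodes only:", anyStopping(r.rwNodes, stopping))
	// Looking at both lists finds it.
	fmt.Println("rwNodes + roNodes:", needsStoppingBalance(r, stopping))
}
```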

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-20 10:21:38 +08:00
wei liu
4822b109bd
fix: Skip loading l0 segment on old version query node (#32124)
issue: #32107

During a rolling upgrade, skip loading l0 segments on old version
query nodes.
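
One way to picture this is a version filter applied before choosing load targets. The sketch below is an assumption-laden illustration: the `version` type, `filterL0Targets`, and the minimum version passed in are all hypothetical, and the actual version threshold used by the PR is not stated here.

```go
package main

import (
	"fmt"
	"sort"
)

// version is a hypothetical, simplified semantic version.
type version struct{ major, minor, patch int }

// less reports whether v is older than o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	if v.minor != o.minor {
		return v.minor < o.minor
	}
	return v.patch < o.patch
}

// filterL0Targets keeps only the nodes whose version is at least minL0,
// returning their IDs in ascending order.
func filterL0Targets(nodes map[int64]version, minL0 version) []int64 {
	out := make([]int64, 0, len(nodes))
	for id, v := range nodes {
		if !v.less(minL0) {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Placeholder versions for illustration only.
	nodes := map[int64]version{
		1: {1, 0, 0}, // old node, not yet upgraded
		2: {2, 0, 0}, // upgraded node
	}
	fmt.Println(filterL0Targets(nodes, version{2, 0, 0}))
}
```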

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-15 11:23:23 +08:00
chyezh
48fe977a9d
enhance: declarative resource group api (#31930)
issue: #30647

- Add declarative resource group api

- Add config for resource group management

- Resource group recovery enhancement

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-15 08:13:19 +08:00
chyezh
a2502bde75
enhance: replica manager enhancement (#31496)
issue: #30647 

- ReplicaManager now manages read-only nodes and always persists the
node distribution of each replica.

- All segment/channel checkers now use ReplicaManager, not
ResourceManager, to get read-only or read-write nodes.

- ReplicaManager now guarantees that a querynode is assigned to at most
one replica of the same collection (replicas of the same collection
never hold the same querynode at the same time).

- ReplicaManager guarantees a fair node-count assignment policy when
multiple replicas of a collection are assigned to one resource group.

- Move some parameter checks into ReplicaManager to avoid data races.

- Allow transferring a replica to a resource group that already loads a
replica of the same collection.

- Allow transferring nodes between resource groups that load replicas of
the same collection.
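
The uniqueness and fairness guarantees listed above can be illustrated with a simple round-robin split. This is a sketch under stated assumptions, not ReplicaManager's actual algorithm: `assignNodes` is a hypothetical helper, and the input node IDs are assumed to be unique.

```go
package main

import "fmt"

// assignNodes distributes nodes over replicaCount replicas round-robin.
// Each node lands in exactly one replica, and replica sizes differ by at
// most one.
func assignNodes(nodes []int64, replicaCount int) [][]int64 {
	out := make([][]int64, replicaCount)
	if replicaCount == 0 {
		return out
	}
	for i, n := range nodes {
		out[i%replicaCount] = append(out[i%replicaCount], n)
	}
	return out
}

func main() {
	// Five query nodes in one resource group, two replicas of one collection.
	for i, nodes := range assignNodes([]int64{1, 2, 3, 4, 5}, 2) {
		fmt.Printf("replica %d: %v\n", i, nodes)
	}
}
```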

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-05 04:57:16 +08:00
wei liu
92971707de
enhance: Add restful api for devops to execute rolling upgrade (#29998)
issue: #29261
This PR adds a restful api for devops to execute rolling upgrade, including
suspending/resuming balance and manually transferring segments/channels.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-27 16:15:19 +08:00
chyezh
9f9ef8ac32
enhance: transfer resource group and dbname to querynode when load (#30936)
issue: #30931

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-21 11:59:12 +08:00
chyezh
ff4237bb90
enhance: add hostname into node info (#30673)
issue: https://github.com/milvus-io/milvus/issues/30647

- An address may be reused in a k8s environment, so using the hostname
can be better.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-15 10:45:06 +08:00
wei liu
06df9b8462
fix: Balance segment/channel won't be triggered on multi replicas (#31107)
issue: #30983 #30982

Because the balancer called the wrong interface to get the
segment/channel list of a replica, it computed a wrong average
segment/channel number. As a result, every node appeared to have fewer
segments/channels than the average, and balance was never triggered in
the multi-replica case.

This PR fixes segment/channel balance not being triggered on multi
replicas.
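
A small numeric sketch of how an inflated average can hide an imbalance. The scenario is an assumption made for illustration (the average is computed from a total that also includes another replica's segments); `overloaded` is a hypothetical helper, not the balancer's real interface.

```go
package main

import (
	"fmt"
	"sort"
)

// overloaded returns the IDs of nodes holding more segments than the
// average implied by totalSegments spread over the given nodes.
// The comparison count*len(perNode) > totalSegments avoids integer division.
func overloaded(perNode map[int64]int, totalSegments int) []int64 {
	out := []int64{}
	for id, count := range perNode {
		if count*len(perNode) > totalSegments {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Replica A runs on nodes 1 and 2 and holds 8 segments in total.
	replicaA := map[int64]int{1: 6, 2: 2}

	// Correct total (replica A only): average is 4, node 1 is overloaded.
	fmt.Println("replica-scoped total:", overloaded(replicaA, 8))

	// Inflated total (also counting 8 segments of another replica):
	// average is 8, no node exceeds it, so balance is never triggered.
	fmt.Println("inflated total:", overloaded(replicaA, 16))
}
```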

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-11 20:35:04 +08:00
wei liu
820ee692fc
enhance: Add config for querycoord auto balance channel (#29231)
issue: #23726
This PR adds a config to control querycoord's background auto balance
channel operation.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-18 10:00:40 +08:00
wei liu
008bae675d
enhance: Skip balance segment when channel needs to be balanced (#29116)
issue: #28622
After we added support for balancing segments by growing segment count
(#28623), if segments and channels are balanced at the same time, some
segments need to be rebalanced after the channel balance finishes.

This PR skips segment balance when a channel needs to be balanced.
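
The ordering can be sketched as "channels first, segments only if no channel plan exists". The `plan` type and `generatePlans` below are hypothetical stand-ins, not the balancer's real types.

```go
package main

import "fmt"

// plan is a hypothetical, simplified balance plan.
type plan struct {
	kind     string // "channel" or "segment"
	from, to int64  // source and destination node IDs
}

// generatePlans asks for channel plans first. Segment plans are generated
// only when no channel needs to be moved, so segments are not balanced
// against a channel layout that is about to change.
func generatePlans(channelPlans, segmentPlans func() []plan) []plan {
	if plans := channelPlans(); len(plans) > 0 {
		return plans
	}
	return segmentPlans()
}

func main() {
	channels := func() []plan { return []plan{{kind: "channel", from: 1, to: 2}} }
	segments := func() []plan { return []plan{{kind: "segment", from: 1, to: 2}} }
	none := func() []plan { return nil }

	fmt.Println(generatePlans(channels, segments)) // channel plan only
	fmt.Println(generatePlans(none, segments))     // falls back to segment plan
}
```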

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-14 16:44:43 +08:00
congqixia
a67fc08865
fix: balance_unstable_view unit test (#29127)
fix: #29126
Allow unstable output channel balance plan

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-12 10:02:39 +08:00
wei liu
42e538b683
enhance: enable balance channel in querycoord (#28469)
issue: #23726

/kind improvement

1. enable auto balance channel between nodes in querycoord
2. make `genSegmentPlan` reuse the `AssignSegment` logic
3. make `genChannelPlan` reuse the `AssignChannel` logic

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-11 14:18:37 +08:00
wei liu
911a915798
feat: enable balance based on growing segment row count (#28623)
issue: #28622 

A query node with a delegator holds more rows than other query nodes
because the delegator loads all growing rows.
This PR enables segment balance based on the number of growing rows in
the leader view.
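
A rough sketch of why growing rows matter when ranking nodes. `rowStats` and `rankNodes` are hypothetical names, and the row counts are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rowStats is a hypothetical per-node row summary.
type rowStats struct {
	sealedRows  int64 // rows in sealed segments loaded on the node
	growingRows int64 // growing rows held by a delegator on the node
}

// total is the row count used for ranking: sealed plus growing rows.
func (s rowStats) total() int64 { return s.sealedRows + s.growingRows }

// rankNodes returns node IDs ordered from least to most loaded by total
// rows, breaking ties by node ID.
func rankNodes(stats map[int64]rowStats) []int64 {
	ids := make([]int64, 0, len(stats))
	for id := range stats {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		a, b := stats[ids[i]].total(), stats[ids[j]].total()
		if a != b {
			return a < b
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	stats := map[int64]rowStats{
		1: {sealedRows: 1000, growingRows: 800}, // node hosting a delegator
		2: {sealedRows: 1200},
	}
	// By sealed rows alone node 1 looks lighter; with growing rows counted,
	// node 2 is the better target.
	fmt.Println(rankNodes(stats))
}
```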

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-27 14:58:26 +08:00
SimFG
26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
Enwei Jiao
fb0705df1b
Decouple basetable and componentparam (#26725)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-05 10:31:48 +08:00
wei liu
6f89620a43
remove pull target rpc from lock (#26054)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-08-04 10:31:06 +08:00
Bingyi Sun
a3e22786ed
Move meta store to kv catalog (#25915)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-07-31 13:57:04 +08:00
yiwangdr
4387f36897
make etcdKV private (#24778)
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2023-06-13 10:52:38 +08:00
MrPresent-Han
b517bc9e6a
refine balance mechanism including:(#23454) (#23763) (#23791)
1. set balance granularity to replica to avoid influencing unrelated replicas
2. avoid balancing back and forth

Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>
2023-05-04 12:22:40 +08:00
wei liu
5244020336
ban auto balance channel (#23725)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-26 19:26:39 +08:00
wei liu
6653e2c3b0
fix balance channel (#23631)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-25 10:22:37 +08:00
wei liu
3933080511
skip to balance redundant segment (#23490)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-18 18:32:32 +08:00
wei liu
cbfe7a45ef
fix pull target (#23491)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-18 18:30:32 +08:00
wei liu
dbbd703667
fix balance generate unexpected task (#23299)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-11 14:38:30 +08:00
wei liu
9f127dae47
enable balance channel (#23227)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-07 19:06:28 +08:00
jaime
c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
MrPresent-Han
afd874b736
enhance segment balance by considering global rowCount(##22914) (#23056)
Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>
Co-authored-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-04-03 14:16:25 +08:00
yihao.dai
1f718118e9
Dynamic load/release partitions (#22655)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-03-20 14:55:57 +08:00
wei liu
c3e8ad3629
fix balance generate reduce task (#22236)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-21 19:06:27 +08:00
wei liu
73c44d4b29
resource group impl (#21609)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-01-30 10:19:48 +08:00
SimFG
6a29a964df
Fix queryCoord panic during query node down (#21400)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2022-12-28 10:17:30 +08:00
SimFG
f8cff79804
Support the graceful stop for the query node (#20851)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2022-12-06 22:59:19 +08:00
Enwei Jiao
2ecdb4ba4a
Etcd config source support TLS (#20874)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-11-30 18:23:15 +08:00
smellthemoon
8283d32ac4
Only balance segment in targets (#20709)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2022-11-21 16:21:11 +08:00
Enwei Jiao
c05b9ad539
Add event dispatcher for config (#20393)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-11-17 18:59:09 +08:00
wei liu
7537dbfa37
skip balance on loading collection (#20483)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2022-11-10 17:53:04 +08:00
yah01
1c71844b8d
Add license header (#19678)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2022-10-11 11:39:22 +08:00
Bingyi Sun
626854cf0c
Refactor QueryCoord (#18836)
Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: yah01 <yang.cen@zilliz.com>
Co-authored-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
2022-09-15 18:48:32 +08:00