Commit Graph

25 Commits

Author SHA1 Message Date
jaime
9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
congqixia
07c25a19d9
fix: Make querycoord panick when rg metastore sync fail (#34106)
See also #34047

When `unassignNode` sync resource group with node removed failed

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-24 21:38:02 +08:00
wei liu
a7f6193bfc
fix: query node may stuck at stopping progress (#33104)
issue: #33103 
when try to do stopping balance for stopping query node, balancer will
try to get node list from replica.GetNodes, then check whether node is
stopping, if so, stopping balance will be triggered for this replica.

after the replica refactor, replica.GetNodes only return rwNodes, and
the stopping node maintains in roNodes, so balancer couldn't find
replica which contains stopping node, and stopping balance for replica
won't be triggered, then query node will stuck forever due to
segment/channel doesn't move out.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-20 10:21:38 +08:00
chyezh
f06509bf97
fix: get replica should not report error when no querynode serve (#32536)
issue: #30647

- Remove error report if there's no query node serve. It's hard for
programer to use it to do resource management.

- Change resource group `transferNode` logic to keep compatible with old
version sdk.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-25 19:25:24 +08:00
chyezh
a8c8a6bb0f
fix: parameter check of TransferReplica and TransferNode (#32297)
issue: #30647 

- Same dst and src resource group should not be allowed in
`TransferReplica` and `TransferNode`.

-  Remove redundant parameter check.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-17 15:27:19 +08:00
chyezh
48fe977a9d
enhance: declarative resource group api (#31930)
issue: #30647

- Add declarative resource group api

- Add config for resource group management

- Resource group recovery enhancement

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-15 08:13:19 +08:00
congqixia
b9a487608a
fix: Make ResourceGroup.nodes concurrent safe (#32159)
See also #32158

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-11 17:53:18 +08:00
chyezh
a2502bde75
enhance: replica manager enhancement (#31496)
issue: #30647 

- ReplicaManager manage read only node now, and always do persistent of
node distribution of replica.

- All segment/channel checker using ReplicaManager to get read-only node
or read-write node, but not ResourceManager.

- ReplicaManager promise that only apply unique querynode to one replica
in same collection now (replicas in same collection never hold same
querynode at same time).

- ReplicaManager promise that fairly node count assignment policy if
multi replicas of collection is assigned to one resource group.

- Move some parameters check into ReplicaManager to avoid data race.

- Allow transfer replica to resource group that already load replica of
same collection

- Allow transfer node between resource groups that load replica of same
collection

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-05 04:57:16 +08:00
chyezh
ff4237bb90
enhance: add hostname into node info (#30673)
issue: https://github.com/milvus-io/milvus/issues/30647

- Address may be reused in k8s environment. Using hostname can be
better.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-15 10:45:06 +08:00
Enwei Jiao
fb0705df1b
Decouple basetable and componentparam (#26725)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-05 10:31:48 +08:00
Bingyi Sun
a3e22786ed
Move meta store to kv catalog (#25915)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-07-31 13:57:04 +08:00
yah01
948d1f1f4a
Handle errors by merr for QueryCoord (#24926)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-17 14:59:34 +08:00
yiwangdr
4387f36897
make etcdKV private (#24778)
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2023-06-13 10:52:38 +08:00
wei liu
fb6e4ceee7
correct the behavior of create duplicate rg (#23963)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-05-09 14:06:40 +08:00
jaime
c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
yah01
75737c65ac
Refine error handle of QueryCoord (#23068)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-31 10:54:29 +08:00
wei liu
bb5088e605
fix unassign from rg (#22747)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-03-16 14:27:55 +08:00
wei liu
d433e95ce0
fix transfer node meta err (#22420)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-27 19:29:47 +08:00
Enwei Jiao
697dedac7e
Use cockroachdb/errors to replace other error pkg (#22390)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-02-26 11:31:49 +08:00
wei liu
a9a263d5a8
fix assign node to replica in nodeUp (#22323)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-23 14:15:45 +08:00
wei liu
87a4ddc7e2
fix rg e2e (#22187)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-16 10:48:34 +08:00
wei liu
7b4511b8f4
fix transfer node (#22120)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-14 16:16:34 +08:00
wei liu
e9684ddb5a
refine rg capacity behavior (#22089)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-09 18:46:31 +08:00
wei liu
5da6aade02
fix default capacity of default rg (#21965)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-02-06 15:55:56 +08:00
wei liu
73c44d4b29
resource group impl (#21609)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-01-30 10:19:48 +08:00