milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-04 12:59:23 +08:00

Author	SHA1	Message	Date
chyezh	f06509bf97	fix: get replica should not report error when no querynode serve (#32536 ) issue: #30647 - Remove the error report when no query node is serving. It is hard for programmers to use it for resource management. - Change the resource group `transferNode` logic to stay compatible with old SDK versions. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 19:25:24 +08:00
congqixia	d7ff1bbe5c	enhance: Make querycoordv2 collection observer task driven (#32441 ) See also #32440 - Add loadTask in collection observer - For load collection/partitions, load task shall timeout as a whole - Change related constructor to load jobs --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-22 10:39:22 +08:00
chyezh	48fe977a9d	enhance: declarative resource group api (#31930 ) issue: #30647 - Add declarative resource group api - Add config for resource group management - Resource group recovery enhancement --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-15 08:13:19 +08:00
chyezh	a2502bde75	enhance: replica manager enhancement (#31496 ) issue: #30647 - ReplicaManager now manages read-only nodes and always persists the node distribution of each replica. - All segment/channel checkers use ReplicaManager, not ResourceManager, to get read-only or read-write nodes. - ReplicaManager now guarantees that a querynode is assigned to at most one replica of a collection (replicas of the same collection never hold the same querynode at the same time). - ReplicaManager guarantees a fair node-count assignment policy when multiple replicas of a collection are assigned to one resource group. - Move some parameter checks into ReplicaManager to avoid data races. - Allow transferring a replica to a resource group that already loads a replica of the same collection - Allow transferring nodes between resource groups that load replicas of the same collection --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-05 04:57:16 +08:00
wei liu	92971707de	enhance: Add restful api for devops to execute rolling upgrade (#29998 ) issue: #29261 This PR adds a RESTful API for devops to execute rolling upgrades, including suspend/resume balance and manual transfer of segments/channels. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-03-27 16:15:19 +08:00
chyezh	ff4237bb90	enhance: add hostname into node info (#30673 ) issue: https://github.com/milvus-io/milvus/issues/30647 - Address may be reused in k8s environment. Using hostname can be better. Signed-off-by: chyezh <chyezh@outlook.com>	2024-03-15 10:45:06 +08:00
congqixia	c886aa29ff	enhance: Use `ListIndexes` instead of `DescribeIndex` for qc broker (#31122 ) See also #31103 Since querycoord needs only index meta information from datacoord, the broker shall use `ListIndexes` to skip the segment index building check logic in datacoord. This PR is also related to #30538, in which `DescribeIndex` caused heavy memory usage and eventually led to OOM --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-07 21:43:03 +08:00
wei liu	9abc868d15	fix: Remove heartbeat lag logic during get shard leaders (#29999 ) issue: #29677 #29838 While getting shard leaders, if a querynode has not acked the heartbeat for more than 10s, querycoord treats it as unavailable and does not return the shard leader on it. But when a querynode is at full CPU usage, it can easily stall for more than 10s without acking the heartbeat, which leaves no shard leader for search/query. This PR removes the heartbeat lag logic used while getting shard leaders. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-17 11:22:52 +08:00
wei liu	e98c62abbb	enhance: refactor leader_observer to leader_checker (#29454 ) issue: #29453 Syncing distribution by RPC also calls loadSegment/releaseSegment, which may cause all kinds of concurrency cases on the same segment, such as a concurrent load and release of one segment. This PR adds leader_checker, which generates load/release tasks to correct the leader view instead of syncing distribution by RPC --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-05 15:54:55 +08:00
yah01	bfccfcd0ca	enhance: refine error messages (#28424 ) - Split the simple reason and full detail - Refine existing error messages related: #28422 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-21 17:02:24 +08:00
yah01	1b90630633	Fix the target updated before version updated to cause data missing (#28250 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-08 11:36:22 +08:00
yah01	dc89730a50	Support collection-level mmap control (#26901 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-11-02 23:52:16 +08:00
Filip Haltmayer	6b1a106a31	Moving etcd client into session (#27069 ) Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>	2023-10-27 07:36:12 +08:00
wei liu	e0222b2ce3	refine target manager code style (#27883 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-10-25 00:44:12 +08:00
yah01	be980fbc38	Refine state check (#27541 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-10-11 21:01:35 +08:00
yah01	a8ce1b6686	Refine QueryCoord stopping (#27371 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-09-27 16:27:27 +08:00
yah01	6539a5ae2c	Refine DataCoord status (#27262 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-09-26 17:15:27 +08:00
MrPresent-Han	4b12cb8847	fix unstable ut due to unstable sort of unique set (#27302 ) Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-09-22 19:07:26 +08:00
SimFG	26f06dd732	Format the code (#27275 ) Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-09-21 09:45:27 +08:00
yah01	941a383019	Fix failed to load collection with more than 128 partitions (#26763 ) Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-09-02 00:09:01 +08:00
wei liu	949c320185	remove pull target from qc recover (#26775 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-09-01 11:17:01 +08:00
Bingyi Sun	a3e22786ed	Move meta store to kv catalog (#25915 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-07-31 13:57:04 +08:00
yah01	dc37b4587e	Fix panic if channel not watched while getting shard leaders (#25820 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-07-24 14:13:02 +08:00
yah01	948d1f1f4a	Handle errors by merr for QueryCoord (#24926 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-07-17 14:59:34 +08:00
wei liu	68ae199a9f	load segment with target version, avoid read redundant segment (#24929 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-06-27 11:48:45 +08:00
congqixia	41af0a98fa	Use go-api/v2 for milvus-proto (#24770 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-06-09 01:28:37 +08:00
wei liu	8e3ba74648	fix qc service unstable ut (#24340 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-05-24 18:49:25 +08:00
wei liu	8965ea2a08	refine err msg about no available node in replica (#24256 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-05-22 11:59:26 +08:00
yihao.dai	1a3dca9b5e	Fix dynamic partitions loading (#24112 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-05-18 09:17:23 +08:00
smellthemoon	146050db82	Fix some wrong ut (#23990 ) Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-05-10 09:31:19 +08:00
yihao.dai	3827ac30bc	Remove load cache (#23287 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-05-09 10:36:41 +08:00
congqixia	ed81eaa963	Make CollectionObserver trigger checker more frequently during load procedure (#23928 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-05-08 14:06:41 +08:00
wei liu	b6ae70db43	fix get replica return wrong node list (#23792 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-04-28 19:48:36 +08:00
XuanYang-cn	d56771b7b7	Fix return too many nodeIDs (#23397 ) See also: #23396 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-04-20 13:50:31 +08:00
wei liu	cbfe7a45ef	fix pull target (#23491 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-04-18 18:30:32 +08:00
yah01	296380d6e6	Support async refresh (#23107 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-04-12 15:06:28 +08:00
jaime	c9d0c157ec	Move some modules from internal to public package (#22572 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2023-04-06 19:14:32 +08:00
yah01	75737c65ac	Refine error handle of QueryCoord (#23068 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-31 10:54:29 +08:00
wei liu	74da53c027	fix update load percentage (#23054 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-03-30 10:48:23 +08:00
yihao.dai	1f718118e9	Dynamic load/release partitions (#22655 ) Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2023-03-20 14:55:57 +08:00
yah01	3d8f0156c7	Refine scheduler & executor of QueryCoord (#22761 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-16 17:43:55 +08:00
yah01	1a4732bb19	Use new errors to handle load failures cache (#22672 ) Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-03-10 17:15:54 +08:00
wei liu	11f1f4226a	support replica observer assign node (#22604 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-03-08 18:57:51 +08:00
wei liu	c162c6ecc0	fix assign node err (#22479 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-03-01 11:11:47 +08:00
wei liu	a9a263d5a8	fix assign node to replica in nodeUp (#22323 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-02-23 14:15:45 +08:00
wei liu	c3e8ad3629	fix balance generate reduce task (#22236 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-02-21 19:06:27 +08:00
wei liu	87a4ddc7e2	fix rg e2e (#22187 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-02-16 10:48:34 +08:00
wei liu	7b4511b8f4	fix transfer node (#22120 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-02-14 16:16:34 +08:00
wei liu	73c44d4b29	resource group impl (#21609 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2023-01-30 10:19:48 +08:00
Ten Thousand Leaves	defb7660c8	Add refresh option for LoadCollection/LoadPartitions interfaces (#21776 ) /kind improvement Signed-off-by: Yuchen Gao <yuchen.gao@zilliz.com>	2023-01-18 16:41:44 +08:00


73 Commits