milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-05 05:18:52 +08:00

Author	SHA1	Message	Date
wei liu	3cd0b26285	enhance: Enable dynamic update loaded collection's replica (#35822 ) issue: #35821 After a collection is loaded, increasing or decreasing its replica number requires releasing and loading it again. Milvus offers 4 solutions to update a loaded collection's replica number; this PR aims to change the replica number dynamically, without a release. After the replica number changes, Milvus loads or releases replicas asynchronously, and the replica load status can be checked with the getReplicas API. Note that if more replicas are set than the querynodes can afford, the new replicas won't load successfully until enough querynodes join. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-25 10:13:18 +08:00
congqixia	f985173da0	fix: Fill load field list from old version load info (#35993 ) See also #35959 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-09-05 16:57:05 +08:00
wei liu	c84ea5465c	fix: Fix some replicas don't participate in the query after the failure recovery (#35850 ) issue: #35846 querycoord notifies the proxy to update its shard leader cache after a delegator's location changes. But during a querynode's failure recovery, some delegators may become unserviceable for lack of segments and return to serviceable once the segments are loaded, so the proxy also needs to be notified to invalidate its shard leader cache when a delegator's serviceable state changes. This PR maintains the querynode's serviceable state during heartbeat and notifies the proxy to invalidate the shard leader cache if the serviceable state changes. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-03 15:39:03 +08:00
Xiaofan	0dc5e89007	enhance: reduce the log level of frequent log (#35652 ) fix #35651 Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-08-25 16:20:57 +08:00
congqixia	2fbc628994	feat: Support field partial load collection (#35416 ) Related to #35415 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-20 16:49:02 +08:00
wei liu	c0200eec39	enhance: limit getSegmentInfo batch size to avoid exceeding the grpc message limit (#35394 ) issue: #35395 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-15 19:17:00 +08:00
SimFG	b2cc4b0776	feat: add the rbac msg and send them to the replicate channel (#35392 ) - issue: #35391 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-08-15 12:06:52 +08:00
wei liu	344dc6a9f8	enhance: enable to set load config in cluster level (#35169 ) issue: #35170 This PR enables setting load configs, such as replicas and resource groups, at the cluster level. Collections then use that load config when they are loaded. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-07 12:38:21 +08:00
Chun Han	3faef63a25	enhance: add log for partition stats( #30376 ) (#35219 ) related: #30376 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-08-02 19:34:22 +08:00
jaime	fcec4c21b9	fix: check collection health(queryable) fail for releasing collection (#34947 ) issue: #34946 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-08-02 17:20:15 +08:00
wei liu	3dd3749f0b	enhance: Avoid unnecessary syncTargetVersion func call after querycoord recover (#34954 ) Before querycoord stops gracefully, it saves the current target to the meta store and recovers it after querycoord starts up, to shorten querycoord's recovery time. But the target version isn't recovered as expected: the latest timestamp is used as the current target's version, which has no effect on querycoord other than an unnecessary syncTargetVersion func call. This PR recovers the correct target version to avoid that unnecessary call. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-02 14:46:13 +08:00
wei liu	27b6d58981	fix: Set legacy level to l0 segment after qc restart (#35197 ) issue: #35087 After qc restarts and before the target is ready, if dist_handler tries to update the segment dist, it sets the legacy level on l0 segments, which may cause an l0 segment to be moved to another node and make search/query fail. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-08-02 10:18:13 +08:00
wei liu	03912a8788	enhance: Avoid balance stuck after segment list become stable (#34728 ) issue: #34715 If a collection's segment list no longer changes, the next target is empty most of the time. Segment balancing checks whether a segment exists in both the current and the next target, so balancing could be blocked because the next target is empty. This PR permits a segment to be moved when the next target is empty, to avoid a stuck balance. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-31 18:09:48 +08:00
wei liu	c45f38aa61	enhance: Update protobuf-go to protobuf-go v2 (#34394 ) issue: #34252 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-29 11:31:51 +08:00
wei liu	9b37d3f517	enhance: Enable setting the replica number and resource group during collection creation (#34403 ) issue: #30040 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-10 10:20:13 +08:00
congqixia	b284b81a47	fix: Check partition in current target when observing partition load status (#34282 ) See also #34234 `LoadPartitions` does not guarantee that the current target contains the loading partitions if some partitions were already loaded before. This PR checks that the current target contains the partition to load when advancing the loading percentage to 100. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-01 17:40:07 +08:00
jaime	9630974fbb	enhance: move rocksmq from internal to pkg module (#33881 ) issue: #33956 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-06-25 21:18:15 +08:00
congqixia	07c25a19d9	fix: Make querycoord panic when rg metastore sync fail (#34106 ) See also #34047 This covers the case where `unassignNode` fails to sync the resource group with the node removed. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-24 21:38:02 +08:00
wei liu	02945959d9	enhance: Avoid to iterate whole segment list for each task's process (#33943 ) When querycoord processes a segment task, it iterates the whole segment list to check whether the segment is loaded, which costs too much CPU when there are thousands of segments. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-19 10:19:58 +08:00
wayblink	a1232fafda	feat: Major compaction (#33620 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com>	2024-06-10 21:34:08 +08:00
wei liu	b13932bb55	enhance: Enable database level replica num and resource groups for loading collection (#33052 ) issue: #30040 This PR introduces two database-level props: 1. database.replica.number 2. database.resource_groups Users can set these two props with the AlterDatabase API and then load a collection without specifying replica_num or resource groups; the load then uses the database-level load params. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-29 10:59:43 +08:00
jaime	0d99db23b8	fix: metrics leak on the coord nodes (#33075 ) issue: #32980 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-05-20 22:03:39 +08:00
wei liu	a7f6193bfc	fix: query node may stuck at stopping progress (#33104 ) issue: #33103 To do a stopping balance for a stopping query node, the balancer gets the node list from replica.GetNodes and checks whether a node is stopping; if so, a stopping balance is triggered for that replica. After the replica refactor, replica.GetNodes returns only rwNodes, and stopping nodes are kept in roNodes, so the balancer can't find the replica that contains a stopping node and the stopping balance for that replica is never triggered. The query node then gets stuck forever because its segments/channels are never moved out. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-20 10:21:38 +08:00
wei liu	f1c9986974	enhance: Skip return data distribution if no change happen (#32814 ) issue: #32813 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-17 10:11:37 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
wei liu	cba2c7a3be	enhance: clean channel node info in meta store (#32988 ) issue: #32910 see also: #32911 When channel exclusive mode is enabled, the replica records channel node info in the meta store. If the balance policy changes so that channel exclusive mode is disabled, we should clean up the channel node info in the meta store and stop balancing nodes between channels. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-14 10:05:40 +08:00
chyezh	293f14a8b9	fix: remove redundant replica recover (#32985 ) issue: #22288 - replica recover should be only triggered by replica recover Signed-off-by: chyezh <chyezh@outlook.com>	2024-05-13 15:25:32 +08:00
wei liu	e2332bdc17	enhance: Enable channel exclusive balance policy (#32911 ) issue: #32910 * split the replica's node list into channels when creating replicas * balance nodes among channels when a node change happens * implement channel-level balance, so that balancing happens at the channel level Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-10 17:27:31 +08:00
wei liu	04a8ec69f6	fix: Segment on stopping query node can't be release successfully (#32929 ) issue: #32901 A release-segment request needs to be sent to the delegator, and finding the segment's delegator requires replica info. But a stopping query node is marked as read-only in the replica, and `replica.Contains()` returns true only for an rwNode in the replica. So the replica info can't be found through the stopping query node, and the segment release is blocked. This PR makes `replica.Contains()` return true for both roNode and rwNode. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-10 14:33:30 +08:00
congqixia	efa58ae423	enhance: Utilize coll2replica mapping when getting rg by collection (#32892 ) See also #32165 The old `GetResourceGroupByCollection` implementation iterates all replicas to match the collection id, which is slow and CPU-intensive. This PR makes it use the coll2Replicas mapping by calling `GetByCollection` and mapping the replicas to resource groups. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-09 19:37:30 +08:00
wei liu	ba02d54a30	enhance: update shard leader cache when leader location changed (#32470 ) issue: #32466 This PR updates the proxy's shard leader cache when the shard location changes, so that in a query node failover the proxy can find the recovered replica. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-08 10:05:29 +08:00
yihao.dai	9db3aa18bc	enhance: Remove deprecated EnableIndex (#32704 ) /kind improvement Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-07 17:11:30 +08:00
wei liu	c0555d4b45	fix: Remove read only node from replica immediately after node down (#32666 ) issue: #32665 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-04-28 20:25:25 +08:00
congqixia	a239e9110e	enhance: Apply node-indexing and cache optimization for channel dist (#32595 ) See also #32165 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-28 16:19:24 +08:00
Xiaofan	02ace25c68	enhance: reduce the cpu usage when collection number is high (#32245 ) related to #32165 1. for all the manager, support collection level index 2. remove collection level filter to avoid extra cpu usage when collection number increases Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-04-26 11:49:25 +08:00
chyezh	f06509bf97	fix: get replica should not report error when no querynode serve (#32536 ) issue: #30647 - Remove the error report when no query node is serving. The error makes it hard for programmers to do resource management. - Change the resource group `transferNode` logic to stay compatible with old SDK versions. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 19:25:24 +08:00
congqixia	f30c22626e	enhance: Pre-cache result for frequent filters (#32580 ) See also #32165 Add segment dist and leader view filter criterion struct to store frequent filter conditions. Add collection/channel filter results for these two meta --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-25 11:13:25 +08:00
congqixia	37ca32dbba	enhance: Make SegmentDistManager filter use node index (#32533 ) See also #32165 Change `SegmentDistFilter` to an interface in order to provide the node index when filtering segment dist. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-24 16:53:24 +08:00
congqixia	bfebdecf3e	enhance: Make LeaderView Manager filter use map index (#32505 ) See also #32165 Change `LeaderViewFilter` to an interface that provides a map key, to avoid iterating all key-values in LeaderViewManager. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-23 11:07:24 +08:00
congqixia	01c16fe6e3	enhance: Manual release pool after save targets (#32358 ) See also #31632 Release conc.Pool after usage to clean worker and stop background purge and ticktock. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-19 13:51:21 +08:00
chyezh	a8c8a6bb0f	fix: parameter check of TransferReplica and TransferNode (#32297 ) issue: #30647 - Same dst and src resource group should not be allowed in `TransferReplica` and `TransferNode`. - Remove redundant parameter check. Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-17 15:27:19 +08:00
yiwangdr	7deda4d5e9	enhance: speed up GetByCollectionAndNode (#32232 ) Related to https://github.com/milvus-io/milvus/issues/32165 Avoid iterating through all replicas/collections if possible. Iteration is expensive when there are large number of replicas/collections. Signed-off-by: yiwangdr <yiwangdr@gmail.com>	2024-04-17 10:23:25 +08:00
congqixia	dc11cbd123	enhance: Maintain collection-partitions mapping in qc meta (#32227 ) Related to #32165 Add a collection to partitionIDs mapping to avoid iterating over all loaded partitions when getting all partitions by collection id --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-15 10:05:19 +08:00
chyezh	48fe977a9d	enhance: declarative resource group api (#31930 ) issue: #30647 - Add declarative resource group api - Add config for resource group management - Resource group recovery enhancement --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-15 08:13:19 +08:00
congqixia	b9a487608a	fix: Make `ResourceGroup.nodes` concurrent safe (#32159 ) See also #32158 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-11 17:53:18 +08:00
chyezh	a3d6110957	fix: ut failure (#32120 ) issue: #30647 Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-10 17:30:48 +08:00
chyezh	0be67e7f99	fix: ut failure (#32119 ) issue: #30647 Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-10 17:23:27 +08:00
wei liu	c4806b69c4	enhance: Refactor leader view manager interface (#31133 ) issue: #31091 This PR adds a GetByFilter interface to the leader view manager, replacing the various get funcs. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-04-10 15:13:36 +08:00
chyezh	a2502bde75	enhance: replica manager enhancement (#31496 ) issue: #30647 - ReplicaManager now manages read-only nodes and always persists the node distribution of a replica. - All segment/channel checkers use ReplicaManager, not ResourceManager, to get read-only or read-write nodes. - ReplicaManager now guarantees that a querynode is assigned to only one replica within the same collection (replicas in the same collection never hold the same querynode at the same time). - ReplicaManager guarantees a fair node count assignment policy when multiple replicas of a collection are assigned to one resource group. - Move some parameter checks into ReplicaManager to avoid data races. - Allow transferring a replica to a resource group that already loads a replica of the same collection - Allow transferring nodes between resource groups that load replicas of the same collection --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-05 04:57:16 +08:00
congqixia	56e371c478	fix: Check replica exists before get latest leader (#31848 ) See also #31847 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-03 10:05:22 +08:00
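
Several of the commits listed above (for example #32892 and #32227, both tied to #32165) describe replacing a scan over every replica or partition with a mapping keyed by collection. The following is a minimal sketch of that general idea only, written in Python rather than Milvus's Go. Every name in it (`Replica`, `ReplicaIndex`, `get_by_collection`, `resource_groups_of`) is hypothetical and is not taken from the Milvus codebase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical record; field names are illustrative assumptions.
    replica_id: int
    collection_id: int
    resource_group: str


class ReplicaIndex:
    """Hypothetical replica store with a collection -> replica IDs index."""

    def __init__(self):
        self._replicas = {}                     # replica_id -> Replica
        self._by_collection = defaultdict(set)  # collection_id -> {replica_id}

    def put(self, replica):
        # Drop any stale index entry first so the index never disagrees
        # with the primary map.
        self.remove(replica.replica_id)
        self._replicas[replica.replica_id] = replica
        self._by_collection[replica.collection_id].add(replica.replica_id)

    def remove(self, replica_id):
        replica = self._replicas.pop(replica_id, None)
        if replica is None:
            return
        ids = self._by_collection[replica.collection_id]
        ids.discard(replica_id)
        if not ids:
            del self._by_collection[replica.collection_id]

    def get_by_collection_scan(self, collection_id):
        # Baseline: visits every replica, so cost grows with the total count.
        return [r for r in self._replicas.values()
                if r.collection_id == collection_id]

    def get_by_collection(self, collection_id):
        # Indexed lookup: touches only the replicas of this collection.
        ids = self._by_collection.get(collection_id, ())
        return [self._replicas[i] for i in sorted(ids)]

    def resource_groups_of(self, collection_id):
        # Map the collection's replicas to their resource groups.
        return sorted({r.resource_group
                       for r in self.get_by_collection(collection_id)})
```

The trade-off in this sketch is that every write has to keep the index in step with the primary map, in exchange for lookups that no longer depend on the total number of replicas.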


188 Commits