milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-02 11:59:00 +08:00

Author	SHA1	Message	Date
congqixia	b111f3b110	enhance: Use RWMutex and change WLock to RLock (#30557 ) Related to #27675 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-06 17:13:56 +08:00
congqixia	d4100d5442	enhance: Change update channel cp magic number to param item (#30555 ) See also #28817 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-06 16:02:00 +08:00
congqixia	a68b32134a	fix: Verify sync task target segment and retry if not match (#30500 ) See also #27675 #30469 For a sync task, the segment could be compacted while the sync task is running. In the previous implementation, the sync task held only the old segment ID as its KeyLock, in which case compaction on the compacted-to segment could run in parallel with the delta sync of this sync task. This PR introduces sync target segment verification logic. The task checks that it holds the lock of the target segment before actually syncing. If this check fails, the sync task returns an `errTargetSegementNotMatch` error and makes the manager re-fetch the current target segment ID. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-05 11:33:43 +08:00
yihao.dai	18b979d9b4	enhance: Extend support for varchar autoID to BulkInsertV2 (#30477 ) issue: https://github.com/milvus-io/milvus/issues/30476 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-02-04 16:57:05 +08:00
XuanYang-cn	e6eb6f2c78	enhance: Speed up L0 compaction (#30410 ) This PR changes the following to speed up L0 compaction and prevent OOM: 1. Lower deltabuf limit to 16MB by default, so that each L0 segment would be 4X smaller than before. 2. Add BatchProcess, use it if memory is sufficient 3. Iterator deserializes when HasNext is called, to avoid a massive memory peak 4. Add tracing in spiltDelta See also: #30191 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-02-04 10:49:05 +08:00
yihao.dai	7ce876a072	fix: Decoupling importing segment from flush process (#30402 ) This PR decouples the importing segment from the flush process by: 1. Excluding the importing segment from the flush policy; this approach avoids notifying the datanode to flush the importing segment, which may not exist. 2. When RootCoord calls Flush, DataCoord directly sets the importing segment state to `Flushed`. issue: https://github.com/milvus-io/milvus/issues/30359 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-02-03 13:01:12 +08:00
congqixia	1ab851d73f	enhance: Remove useless frequent log in Mintimestamp (#30471 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-02 20:39:05 +08:00
XuanYang-cn	d744962aa1	fix: Correct Size calculation of DeleteData (#30397 ) This PR corrects the calculation of the actual deltalog size See also: #30191 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-02-02 10:47:04 +08:00
XuanYang-cn	fb5e09d94d	fix: call injectDone after compaction failed (#30277 ) syncMgr.Block() locks the segments while compaction executes. The previous implementation was unable to Unblock those segments when compaction failed. If the next compaction of the same segments arrived, it would be stuck forever and block all later compaction tasks. This PR makes sure the compaction executor Unblocks these segments after a failed compaction. Apart from that, this PR also refines some logs and cleans up some compaction and compactor code: 1. Log segment count instead of segmentIDs to avoid logging too many segments 2. Flush RPC returns L1 segments only, skip L0 and L2 3. CompactionType is checked in `Compaction`, no need to check again inside compactor 4. Use a lighter method to replace `getSegmentMeta` 5. Log information for L0 compaction when it encounters an error See also: #30213 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-02-01 14:25:04 +08:00
congqixia	be8831b311	enhance: Reduce get segments scan during l0 compaction (#30408 ) See also #27606 Previously, l0 linear compaction scanned all target segment IDs from the metacache for each delta entry, which is not needed since compaction target segments shall all be immutable. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-01 10:59:03 +08:00
yihao.dai	c5918290e6	feat: Add import executor and manager for datanode (#29438 ) This PR introduces new importv2 roles for datanode: 1. Executor: To execute tasks, an import task is divided into the following steps: read data -> hash data -> sync data; 2. Manager: To manage all the tasks; issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-31 20:45:04 +08:00
congqixia	fc0d007bd1	enhance: Add `MemoryHighSyncPolicy` back to write buffer manager (#29997 ) See also #27675 This PR adds back MemoryHighSyncPolicy implementation. Also change MinSegmentSize & CheckInterval to configurable param item. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-31 19:03:04 +08:00
congqixia	b5e078c4d3	enhance: Remove current stats after RollStats action (#30391 ) See also #27675 BloomFilterSet.current shall be reset after RollStats, otherwise it will keep tracking whole segment data causing the false positive ratio larger than expected. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-31 18:55:04 +08:00
chyezh	6d63fb5d3f	fix: panic with datanode negative wait group counter (#30135 ) issue: #29170 Signed-off-by: chyezh <chyezh@outlook.com>	2024-01-30 18:15:04 +08:00
congqixia	0c7a96b48d	enhance: Make compaction log has traceID (#30338 ) See also #30167 Now that open telemetry tracing is supported, we want compaction logs to have the traceID as well. This PR adds util functions to set traceID with span & propagate traceID between different contexts. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-30 10:09:03 +08:00
congqixia	743bdf1434	enhance: Make l0 compactor download files in parallel (#30309 ) See also #27606 `MultiRead` actually downloads files in sequence, which may lead to large time consumption during the l0 compaction download phase. This PR makes the l0 compactor download deltalogs in parallel using the conc package & io pool. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-30 10:07:09 +08:00
congqixia	6445880753	fix: prevent segments from being flushed multiple times (#30240 ) See also #30111 By design, segments can become "Flushed" only through the `FlushSegments` grpc call from datacoord. There are two possible reasons for one segment to be flushed multiple times: - Segment is in flushing state during multiple epochs in flowgraph - Segment is flushed by both flushTs & Flush segments So this PR fixes this by: - Removing state change logic from FlushTs policy - Changing Flush segment into a three-stage process: Sealed->Flushing->Flushed, preventing multiple Flushed=true operations. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-24 14:19:00 +08:00
congqixia	6a73860815	enhance: Add open telemetry tracing for compaction (#30168 ) Resolves #30167 This PR adds tracing for all compactions, from the task start in datacoord through the execution procedures in datanode. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-23 10:37:00 +08:00
congqixia	8a6de3d2b1	fix: decompress deltalog path for level zero compaction (#30164 ) Resolves: #30161 See also: #28873 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-22 14:44:55 +08:00
yihao.dai	8780d65b66	fix: Use channel cp as the dml&start position for import segments (#30107 ) This PR discontinues the subscription to the mq and instead employs the channel checkpoint as the DML and starting position for the import segments. issue: https://github.com/milvus-io/milvus/issues/30106 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-22 14:36:55 +08:00
XuanYang-cn	3d46096f86	fix: Set segment level for compact-to segment (#30129 ) See also: #29204 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-19 18:52:53 +08:00
congqixia	0d356b0545	enhance: only buffer a delete if it matches an insert with a smaller timestamp (#30122 ) See also: #30121 #27675 This PR changes the delete buffering logic: - Write buffer shall buffer inserts first - Then the delete messages shall be evaluated - Whether PK matches the previous Bloom filter, whose ts is always smaller - Whether PK matches insert data which has a smaller timestamp - Then the segment bloom filter is updated by the newly buffered pk rows --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-19 17:28:53 +08:00
XuanYang-cn	86f48861c1	fix: Add more throughput in related metrics (#30038 ) This PR also fixes bugs in l0 compactor where l0 results would never be removed from datanode See also: #30099 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-19 11:34:54 +08:00
smellthemoon	e52ce370b6	enhance: don't store logPath in meta to reduce memory (#28873 ) Don't store logPath in meta, to reduce memory; when a service gets segmentinfo, generate the logpath from the logid. #28885 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-18 22:06:31 +08:00
XuanYang-cn	ad7a0b4091	fix: Change finish log level to info (#30031 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-17 10:12:55 +08:00
SimFG	d9edd50f97	fix: the delete msg disorder issue (#29915 ) /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-01-14 10:26:52 +08:00
congqixia	ed89c6a2ee	enhance: make compactor use actual buffer size to decide when to sync (#29945 ) See also: #29657 The Datanode compactor uses the estimated row number from the schema to decide when to sync a batch of data when executing compaction. This estimated value can drift far from the actual size when the schema contains variable-length fields (say VarChar, JSON, etc.). This PR makes the compactor check the actual buffer data size so that it can sync when the buffer is actually beyond the max binlog size. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-13 01:32:52 +08:00
Bingyi Sun	e1258b8cad	feat: integrate storagev2 into loading segment (#29336 ) issue: #29335 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-12 18:10:51 +08:00
Xu Tong	e429965f32	Add float16 approve for multi-type part (#28427 ) issue: https://github.com/milvus-io/milvus/issues/22837 Add bfloat16 vector, add the index part of float16 vector. Signed-off-by: Writer-X <1256866856@qq.com>	2024-01-11 15:48:51 +08:00
XuanYang-cn	9c8fd5e51d	fix: Save lite WatchInfo into etcd in DataNode (#29687 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-10 21:18:49 +08:00
congqixia	a040692129	enhance: Use estimated batch size to initialize BF (#29842 ) See also: #27675 The bloom filter set initialized each new BF with a fixed configured `n`. This value is always larger than the actual batch size and causes the generated BF to use more memory. This PR makes the write buffer initialize the BF with an estimated batch size derived from schema & configuration value. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-10 20:36:50 +08:00
Buqian Zheng	d506d33a8d	fix: meta cache in datanode incorrectly tracking row nums (#29817 ) ... of compacted segments issue: #29816 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-01-10 13:22:48 +08:00
congqixia	f18a7191f2	enhance: make `ColumnBasedInsertMsgToInsertData` check field missing (#29758 ) fix: #29757 In previous code, `ColumnBasedInsertMsgToInsertData` adds empty field if the insertMsg parameter does not have the column schema defined. This may lead to unexpected behavior of caller functions. This PR: - Add column missing check - Add column length check - Generate BlobInfo for ColumnBasedInsertMsgToInsertData result --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-09 11:50:48 +08:00
smellthemoon	1c1f2a1371	enhance: change some logs (#29579 ) related #29588 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-01-05 16:12:48 +08:00
congqixia	dc6a6a50fa	enhance: reduce SyncTask AllocID call and refine code (#29701 ) See also #27675 `Allocator.Alloc` and `Allocator.AllocOne` might be invoked multiple times if there were multiple blobs set in one sync task. This PR adds pre-fetch logic for all blobs and caches logIDs in the sync task so that at most one call to the allocator is needed. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 10:04:46 +08:00
XuanYang-cn	a3aff37f73	fix: Correct flush buffer size metrics (#29571 ) See also: #29204 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-04 17:22:46 +08:00
congqixia	79c06c5e73	fix: serializer shall bypass L0 segment merge stats step (#29636 ) See also #27675 Fix a logic problem introduced by #29413: the serializer tries to merge the statslog list while level zero segments do not have statslogs, which results in an error being returned. `writeBufferBase` ignores this error, but it shall only ignore `ErrSegmentNotFound`. This PR adds logic to check the segment level before merging the statslog list, and adds an error type check for getSyncTask failure. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:52:45 +08:00
congqixia	55af8f611f	fix: always sync level zero segments as flushed (#29569 ) See also #27675 For now, level zero segments shall always be synced as `Flushed` ones. This PR fixes the case where level zero segments selected by policies other than the flush ts policy were synced in growing state. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-29 10:34:47 +08:00
MrPresent-Han	ed644983e2	enhance: add param for bloomfilter(#29388 ) (#29490 ) related: #29388 Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2023-12-28 18:10:46 +08:00
yah01	a8a0aa9357	fix: missing to support compact for Array type (#29505 ) The array type can't be compacted. The system can continue with the inserted segments, but these segments can never be compacted. fix #29503 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-28 15:42:48 +08:00
XuanYang-cn	632d8b3743	enhance: Change DN channel manager into interface (#29307 ) See also: #28854 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 16:00:48 +08:00
XuanYang-cn	fe04598900	enhance: Add compaction type label to metrics (#29485 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 15:56:48 +08:00
congqixia	f6cff25712	enhance: fix serialization record span & flushed buffer size metrics (#29482 ) See also #27675 #29413 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-27 10:20:48 +08:00
congqixia	277849a915	enhance: separate serializer logic from sync task (#29413 ) See also #27675 Since serializing the segment buffer is not related to the sync manager, it can be done before submitting to the sync manager, so that the pk statistic file can be more accurate and the logic inside the sync manager is less complex. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 10:40:47 +08:00
congqixia	a937e4c232	fix: segment may never get flushed if sealed before watch (#29436 ) See also #29092 `FlushSegments` transfers only `Growing` segments to flushing. If a segment is in `Sealed` state before the Datanode watches the channel, the condition for selecting that segment to be flushed is never satisfied. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-23 21:32:43 +08:00
SimFG	dd9c61831d	enhance: Support to get the param value in the runtime (#29297 ) /kind improvement issue: #29299 Signed-off-by: SimFG <bang.fu@zilliz.com>	2023-12-22 18:36:44 +08:00
XuanYang-cn	7a6aa8552a	fix: add back existing datanode metrics (#29360 ) See also: #29204 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-22 14:20:43 +08:00
Xiaofan	77b291c5dc	fix: Add jitter in GetSyncStaleBufferPolicy (#28626 ) related to #28427 Add a jitter to the sync stale buffer policy so that not all segments flush at the same time Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2023-12-20 15:00:52 +08:00
congqixia	1ee016709d	fix: Unstable `TestDataSyncService/TestStartStop` unit test (#29291 ) fix #29290 Change EXPECT `NotifyCheckpointUpdated` call to `Maybe` expectation Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-19 23:16:42 +08:00
XuanYang-cn	5164377e68	fix: Skip updating checkpoint after dropcollection (#29220 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-15 16:04:45 +08:00
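
Commit 6445880753 in the list above describes turning segment flushing into a three-stage transition (Sealed->Flushing->Flushed) so that the Flushed step cannot be applied more than once. The following is only a minimal Go sketch of that idea, not the Milvus implementation: the names `SegmentState`, `segment`, `transit`, `startFlush` and `finishFlush` are hypothetical, and guarding the state with a mutex is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// SegmentState is a hypothetical enum for this sketch, not the Milvus type.
type SegmentState int

const (
	Growing SegmentState = iota
	Sealed
	Flushing
	Flushed
)

// segment guards its state with a mutex so each transition is atomic.
type segment struct {
	mu    sync.Mutex
	state SegmentState
}

// transit moves the segment from `from` to `to` and reports whether the
// transition happened. It changes nothing if the current state is not `from`.
func (s *segment) transit(from, to SegmentState) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.state != from {
		return false
	}
	s.state = to
	return true
}

// startFlush claims the segment for flushing (Sealed -> Flushing).
func (s *segment) startFlush() bool { return s.transit(Sealed, Flushing) }

// finishFlush marks the flush as done (Flushing -> Flushed).
func (s *segment) finishFlush() bool { return s.transit(Flushing, Flushed) }

func main() {
	seg := &segment{state: Sealed}
	fmt.Println(seg.startFlush())  // first caller wins the transition
	fmt.Println(seg.startFlush())  // a second attempt is rejected
	fmt.Println(seg.finishFlush()) // Flushing -> Flushed succeeds once
	fmt.Println(seg.finishFlush()) // repeating it is rejected
}
```

Because every transition names the state it expects to leave, a caller that arrives late (for example a second flush request for the same segment) gets `false` back instead of setting the flushed state again.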

924 Commits