milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-10 16:32:06 +08:00

Author	SHA1	Message	Date
XuanYang-cn	68c9e7db8c	fix: Sync dropped segment for dropped partition (#33331 ) See also: #33330 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-06 10:25:52 +08:00
aoiasd	387b7cd7f4	enhance:avoid maintain checkpoint info in sync manager (#33413 ) relate: https://github.com/milvus-io/milvus/issues/32915 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-06-05 10:05:50 +08:00
wei liu	c6a1c49e02	enhance: Use Blocked Bloom Filter instead of basic bloom filter impl. (#33405 ) issue: #32995 To speed up the construction and querying of Bloom filters, we chose a blocked Bloom filter instead of a basic Bloom filter implementation. WARN: This PR is compatible with the old version of the bf impl, but falling back to an old milvus version may cause Bloom filter deserialization to fail. In single Bloom filter test cases with a capacity of 1,000,000 and a false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times faster than the basic Bloom filter in both querying and construction, at the cost of a 30% increase in memory usage. - Block BF construct time {"time": "54.128131ms"} - Block BF size {"size": 3021578} - Block BF Test cost {"time": "55.407352ms"} - Basic BF construct time {"time": "210.262183ms"} - Basic BF size {"size": 2396308} - Basic BF Test cost {"time": "192.596229ms"} In multi Bloom filter test cases with a capacity of 100,000, an FPR of 0.001, and 100 Bloom filters, we reuse the primary key locations for all Bloom filters to avoid repeated hash computations. As a result, the blocked Bloom filter is also 5 times faster than the basic Bloom filter in querying. - Block BF TestLocation cost {"time": "529.97183ms"} - Basic BF TestLocation cost {"time": "3.197430181s"} --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-31 17:49:45 +08:00
cai.zhang	77637180fa	enhance: Periodically synchronize segments to datanode watcher (#33420 ) issue: #32809 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-05-30 13:37:44 +08:00
congqixia	e71b7c7cc9	enhance: Reduce datanode metacache frequent scan range (#33400 ) See also #32165 There were some frequent scans in metacache: - List all segments whose start positions are not synced - List compacted segments Those scans cost lots of CPU time when the number of flushed segments is large, while `Flushed` segments can be skipped in those two scenarios This PR: - Adds a segment state shortcut in metacache - Lists start positions only for states before `Flushed` - Sets the state of compacted segments to `Dropped` and uses the `Dropped` state while scanning them --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-28 14:19:42 +08:00
yihao.dai	7730b910b9	enhance: Decouple compaction from shard (#33138 ) Decouple compaction from shard, remove dependencies on shards (e.g. SyncSegments, injection). issue: https://github.com/milvus-io/milvus/issues/32809 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-24 09:07:41 +08:00
congqixia	5452376e90	fix: Remove task from syncmgr after task done (#33302 ) See also #33247 Introduced in PR #32865 Remove task after task done to keep checkpoint sound and safe Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-23 14:33:40 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
congqixia	77fa615772	fix: Make SyncManager callback func ignore nil error (#32891 ) introduced by #32865 sync manager callback handler panicked when error is nil Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-09 18:03:31 +08:00
congqixia	a06f601c6e	fix: Make syncmgr lock key before returning future (#32865 ) See also #32860 SyncMgr did not ensure the task key is locked before `SyncData` returns, which may cause concurrency problems during sync with multiple policies. This PR changes the sync mgr implementation to make sure the key is locked before returning the task result `*conc.Future` --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-09 10:09:30 +08:00
congqixia	2c1e8f4774	enhance: Use `struct{}` for sync task future result (#32673 ) Related to #27675 Use `struct{}` instead of `error` for the sync task future result type to reduce result size and prevent logic errors. Also change some unused parameters to `_` to suppress lint warnings Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-29 10:59:26 +08:00
yihao.dai	d6cdcf74db	fix: Return err for conc.Future in sync manager (#31790 ) Should not return `err, nil` when using conc.Future, as the error will be lost/ignored when using `AwaitAll` to wait for the future. issue: https://github.com/milvus-io/milvus/issues/31788 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-06 11:36:57 -07:00
yihao.dai	78fbb87b3a	enhance: Release blobs in sync task once sync is completed (#31661 ) Once the synchronization of the sync task is completed, it's necessary to release the blob within the sync task, as the caller may continue to reference it. issue: https://github.com/milvus-io/milvus/issues/31545 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-28 10:23:11 +08:00
congqixia	937f2440ab	fix: TestBlock case use different segment id in testcase (#31173 ) Resolves: #31172 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-11 17:51:03 +08:00
yiwangdr	c6665c2a4c	test: support multiple data/querynodes in integration test (#30618 ) issue: https://github.com/milvus-io/milvus/issues/29507 Signed-off-by: yiwangdr <yiwangdr@gmail.com>	2024-02-21 11:54:53 +08:00
congqixia	a68b32134a	fix: Verify sync task target segment and retry if not match (#30500 ) See also #27675 #30469 For a sync task, the segment could be compacted while the sync task is running. In the previous implementation, the sync task holds only the old segment id as KeyLock, in which case compaction on the compacted-to segment may run in parallel with the delta sync of this sync task. This PR introduces sync target segment verification logic. It checks the target segment lock it is holding before the actual syncing logic. If this check fails, the sync task returns an `errTargetSegementNotMatch` error and makes the manager re-fetch the current target segment id. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-05 11:33:43 +08:00
yihao.dai	c5918290e6	feat: Add import executor and manager for datanode (#29438 ) This PR introduces novel importv2 roles for datanode: 1. Executor: To execute tasks, a import task will be divided into the following steps: read data -> hash data -> sync data; 2. Manager: To manage all the tasks; issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-31 20:45:04 +08:00
XuanYang-cn	3d46096f86	fix: Set segment level for compact to segment (#30129 ) See also: #29204 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-01-19 18:52:53 +08:00
Bingyi Sun	e1258b8cad	feat: integrate storagev2 into loading segment (#29336 ) issue: #29335 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-12 18:10:51 +08:00
congqixia	dc6a6a50fa	enhance: reduce SyncTask AllocID call and refine code (#29701 ) See also #27675 `Allocator.Alloc` and `Allocator.AllocOne` might be invoked multiple times if there were multiple blobs set in one sync task. This PR add pre-fetch logic for all blobs and cache logIDs in sync task so that at most only one call of the allocator is needed. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 10:04:46 +08:00
congqixia	79c06c5e73	fix: serializer shall bypass L0 segment merge stats step (#29636 ) See also #27675 Fix a logic problem introduced by #29413: the serializer tries to merge the statslog list while level segments do not have statslog, which results in an error being returned. `writeBufferBase` ignores this error, but it should only ignore `ErrSegmentNotFound`. This PR adds logic that checks the segment level before merging the statslog list, and adds an error type check for getSyncTask failure. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-04 16:52:45 +08:00
congqixia	f6cff25712	enhance: fix serialization record span & flushed buffer size metrics (#29482 ) See also #27675 #29413 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-27 10:20:48 +08:00
congqixia	277849a915	enhance: separate serializer logic from sync task (#29413 ) See also #27675 Since serializing the segment buffer is not related to the sync manager, it can be done before submitting to the sync manager, so that the pk statistic file can be more accurate and the complex logic inside the sync manager is reduced. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-26 10:40:47 +08:00
XuanYang-cn	7a6aa8552a	fix: add back existing datanode metrics (#29360 ) See also: #29204 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-22 14:20:43 +08:00
congqixia	4731c1b0d5	enhance: make SyncManager pool size refreshable (#29224 ) See also #29223 This PR make `conc.Pool` resizable by adding `Resize` method for it. Also make newly added datanode `MaxParallelSyncMgrTasks` config refreshable --------- Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>	2023-12-15 09:58:43 +08:00
Bingyi Sun	ad866d2889	feat: integrate storagev2 into index build process (#28995 ) issue: https://github.com/milvus-io/milvus/issues/28994 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-13 17:24:38 +08:00
congqixia	cb43647b9e	enhance: Log channel checkpoint source info in writebuffer (#28993 ) See also #27675 Print channel checkpoint source with rated log will help debugging system behavior Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-07 11:50:36 +08:00
congqixia	cb31016640	enhance: Write buffer time range when syncing logs (#28970 ) Related to #27675 The timestamp from and to fields are not filled in the new implementation of writebuffer & sync manager This PR fills these fields for better log information Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-05 17:36:36 +08:00
Bingyi Sun	36f69ea031	feat: integrate storagev2 in building index of segcore (#28768 ) issue: https://github.com/milvus-io/milvus/issues/28655 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-12-05 16:48:54 +08:00
XuanYang-cn	5d0a9f9344	fix: Forget to set EntriesNum for deltalogs (#28858 ) See also: #28520 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-30 16:52:31 +08:00
congqixia	2cd8daaf0b	fix: compacted segment still buffers delta data (#28816 ) Related to #28628 Compacted segment syncing counter is not set correctly in sync task and the bf write buffer shall not use compacted segment as candidate when buffering delta data --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-30 10:20:28 +08:00
congqixia	8a9ab69369	fix: Skip statslog generation flushing empty L0 segment (#28733 ) See also #27675 When an L0 segment contains only delta data, the merged statslog shall be skipped when performing the sync task --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-25 15:10:25 +08:00
congqixia	39be35804c	enhance: Add back clean compacted segment info logic (#28646 ) See also #27675 Compacted segment info shall be removed after all buffers belonging to it are synced. This PR adds the cleanup function after the triggerSyncTask logic: - The buffer is stable and protected by mutex - Cleanup fetches compacted & non-sync segments - Remove segment info only when no buffer for it is maintained in the manager --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-24 15:38:25 +08:00
smellthemoon	79c0edb1d8	enhance:Remove msgbase unnecessary assignments (#28511 ) remove some unnecessary assignments, for the reason that commonpbutil.NewMsgBase has default value. Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2023-11-24 15:02:39 +08:00
XuanYang-cn	b1f15fa0e8	fix: Ease the log level when sync task done (#28678 ) Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-11-23 19:18:28 +08:00
Bingyi Sun	4fedff6d47	feat: integrate storage v2 into the write path (#28440 ) #28378 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2023-11-23 17:26:24 +08:00
congqixia	2fc743992a	fix: syncmgr unstable TestCompact unittest logic (#28630 ) fix #28629 The original unit test closes the channel before setting the segment id, so there is a chance that the test reads the segment id before it is set. Change the unit test to wait for the future to return instead Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-22 00:52:23 +08:00
congqixia	2b3fa8f67b	fix: Add length check for `storage.NewPrimaryKeyStats` (#28576 ) See also #28575 Add zero-length check for `storage.NewPrimaryKeyStats`. This function shall return error when non-positive rowNum passed. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-21 10:28:21 +08:00
congqixia	18dc6b61ce	enhance: fix LevelZero segment sync logic (#28482 ) See also #27675 - Fix LevelZero segment cannot be flushed - Add level option for syncTask - Invoke `AddSegment` when new LevelZero segment is allocated Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-17 21:46:20 +08:00
congqixia	a3cd0bc9c3	fix: Refine sync task field binlog compose logic (#28494 ) See also #27675 Since `MetaWriter` need `*datapb.FieldBinlog` struct, sync task now generate FieldBinlog directly Also fix merged statslog not generated if last task has no insert Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-17 14:40:26 +08:00
congqixia	bed7467f20	enhance: Remove commented code and fix naming issue (#28450 ) This PR removes all the commented code and files from PR #28320 For naming issue: - Renaming `MinCheckpoint` to `EarliestPosition`, see #28320 comment - Renaming `writebuffer.Mananger` to `BufferMananger`, see #27874 comment Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-16 00:22:20 +08:00
congqixia	0b905078e7	Use writebuffer, sync manager refactor in datanode (#28320 ) See also #27675 This PR brings the previously merged datanode refactor online - Use write node to replace insert/delete node - Use write buffer manager to control all buffers - Use sync manager to control sync tasks instead of flush manager Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-15 15:24:18 +08:00
congqixia	af1c2044b9	Fix atomic.Int64 not found in go 1.18 (#28216 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-07 15:16:21 +08:00
congqixia	bf2f62c1e7	Add `WriteBuffer` to provide abstraction for delta policy (#27874 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-04 12:10:17 +08:00
congqixia	1e51255c15	Implement `Injection` for SyncManager with block and meta transition (#28093 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-11-03 04:48:15 +08:00
congqixia	233bf90c55	Add SyncManager to replace flush manager (#27873 ) Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-10-31 02:30:16 +08:00

46 Commits