Commit Graph

96 Commits

Author SHA1 Message Date
wayblink
a26e965e6a
enhance:[cherry-pick] Add compaction task slot usage logic (#34625)
issue: #34544
pr: #34581

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-18 09:55:43 +08:00
cai.zhang
0c01ace0d2
fix: [cherry-pick] Only load or release Flushed segment in datanode meta (#34393)
issue: #34376 ,  #34375,  #34379

master pr: #34390

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-03 17:44:11 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction (#34326)
This PR cherry-picks the following commits:
fix: speed up segment lookup via channel name in datacoord (#33530)
needed by the next commit
  feat: Major compaction (#33620)

issue: #30633
pr: #33620

---------

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
cai.zhang
f11e421839
enhance: [cherry-pick] Remove compaction plans on the datanode (#33548) (#34312)
issue: #33546

master pr: #33548

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 17:39:08 +08:00
yihao.dai
1b95ee7ae8
enhance: [cherry-pick] Batch pick PRs related to compaction (#34315)
This PR cherry-picks the following commits related to compaction:

- Use a pool for CompactionExecutor.
https://github.com/milvus-io/milvus/pull/33558
- Move compaction executor to compaction pacakge.
https://github.com/milvus-io/milvus/pull/33778
- Ensure the idempotency of compaction tasks.
https://github.com/milvus-io/milvus/pull/33872
- Add comment for channel cp updater.
https://github.com/milvus-io/milvus/pull/33759

issue: https://github.com/milvus-io/milvus/issues/33182,
https://github.com/milvus-io/milvus/issues/32451

pr: https://github.com/milvus-io/milvus/pull/33558,
https://github.com/milvus-io/milvus/pull/33778,
https://github.com/milvus-io/milvus/pull/33872,
https://github.com/milvus-io/milvus/pull/33759

---------

Signed-off-by: coldWater <254244460@qq.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: coldWater <254244460@qq.com>
2024-07-02 09:54:07 +08:00
zhenshan.cao
14a11e379c
enhance: Refactor Compaction to enable persistence(#33265) (#34268)
pr : #33265 

issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-07-01 19:32:07 +08:00
cai.zhang
1c6e850f73
enhance: [cherry-pick] Periodically synchronize segments to datanode watcher (#33420) (#34186)
This PR primary picks up the SyncSegments functionality, including the
following commits:
- main functionality: https://github.com/milvus-io/milvus/pull/33420
- related fixes:
  - https://github.com/milvus-io/milvus/pull/33664
  - https://github.com/milvus-io/milvus/pull/33829
  - https://github.com/milvus-io/milvus/pull/34056
  - https://github.com/milvus-io/milvus/pull/34156

issue: #32809 
master pr: #33420, #33664, #33829, #34056, #34156

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-27 11:24:05 +08:00
yihao.dai
b1e74dc7cb
enhance: [cherry-pick] Decouple compaction from shard (#34157)
This PR cherry-picks the following commits:

- Implement task limit control logic in datanode.
https://github.com/milvus-io/milvus/pull/32881
- Load bf from storage instead of memory during L0 compaction.
https://github.com/milvus-io/milvus/pull/32913
- Remove dependencies on shards (e.g. SyncSegments, injection).
https://github.com/milvus-io/milvus/pull/33138
- Rename Compaction interface to CompactionV2.
https://github.com/milvus-io/milvus/pull/33858
- Remove the unused residual compaction logic.
https://github.com/milvus-io/milvus/pull/33932

issue: https://github.com/milvus-io/milvus/issues/32809

pr: https://github.com/milvus-io/milvus/pull/32881,
https://github.com/milvus-io/milvus/pull/32913,
https://github.com/milvus-io/milvus/pull/33138,
https://github.com/milvus-io/milvus/pull/33858,
https://github.com/milvus-io/milvus/pull/33932

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-25 20:22:03 +08:00
XuanYang-cn
a33b68678d
enhance: [cherry-pick] Move compactor into sub package (#34098)
This PR consists of the following commits:

- enhance: Tidy compactor and remove dup codes (#32198)
- fix: Fix l0 compactor may cause DN from OOM (#33554)
- enhance: Add deltaRowCount in l0 compaction (#33997)
- enhance: enable stream writer in compactions (#32612)
- fix: turn on compression on stream writers (#34067)
- fix: adding blob memory size in binlog serde (#33324)

See also: #32451, #33547, #33998, #31679
pr: #32198, #33554, #33997, #32612

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
2024-06-25 11:16:02 +08:00
yihao.dai
ed1dee9e38
enhance: Support L0 import (#33514) (#33712)
issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33514

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-08 11:17:52 +08:00
yihao.dai
e282e1408e
enhance: Abstract Execute interface for import/preimport task (#33234) (#33607)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33234

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
yiwangdr
b1eacb2ae8
feat: datacoord/node watch based on rpc (#32036)
issue: https://github.com/milvus-io/milvus/issues/25309

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-07 15:49:30 +08:00
XuanYang-cn
15b989bb80
fix: Zero flushReq metric for all sealed segs (#32404)
See also: #32399

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-24 14:27:23 +08:00
XuanYang-cn
aad3ed3835
fix: [cherry-pick]Skip changing meta if nodeID not match with channel (#31672)
See also: #31648
pr: #31665, #31694

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-10 15:09:18 +08:00
yihao.dai
4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
XuanYang-cn
39337e09b8
fix: Using zero serverID for metrics (#31518)
Fixes: #31516

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-01 16:55:19 +08:00
yihao.dai
0fe5e90e8b
enhance: Remove import v1 (#31403)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 15:29:09 +08:00
Xiaofan
a63b4cedcf
fix: remove some unnecessary unrecoverable errors (#31327)
use retry.handle when request is not able to service but don't throw
unrecoverable erros
fix #31323

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-03-20 11:35:07 +08:00
yihao.dai
a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
XuanYang-cn
2867f50fcc
fix: Clear DN unkown compaction tasks (#30850)
If DC restarted,  those unkonwn compaction tasks
will never get call back in DN, so that the segments in the compaction
task will be locked, unable to sync and compaction again, blocking cp
advance and compaction executing.

See also: #30137

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-01 11:31:00 +08:00
yihao.dai
a5a4ca8459
enhance: Remove debug log (#30955)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 10:02:59 +08:00
yiwangdr
c6665c2a4c
test: support multiple data/querynodes in integration test (#30618)
issue: https://github.com/milvus-io/milvus/issues/29507

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-21 11:54:53 +08:00
wayblink
f976385421
enhance: replace binlogIO with io.BinlogIO in datanode (#29725)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-20 14:38:51 +08:00
yihao.dai
7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This pr decoups importing segment from flush process by:
1. Exclude the importing segment from the flush policy, this approch
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord call Flush, DataCoord directly set the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
XuanYang-cn
fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

Previous implementation was unable to Unblock thoese segments when
compaction failed. If next compaction of the same segments arrives,
it'll stuck forever and block all later compation tasks.

This PR makes sure compaction executor would Unblock these segments
after a failure compaction.

Apart form that, this PR also refines some logs and clean some codes of
compaction, compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use ligter method to replace `getSegmentMeta`
5. Log information for L0 compaction when encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
yihao.dai
c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
congqixia
fc0d007bd1
enhance: Add MemoryHighSyncPolicy back to write buffer manager (#29997)
See also #27675

This PR adds back MemoryHighSyncPolicy implementation. Also change
MinSegmentSize & CheckInterval to configurable param item.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 19:03:04 +08:00
congqixia
0c7a96b48d
enhance: Make compaction log has traceID (#30338)
See also #30167

After support open telemetry tracing, we want to have traceID as well,
this PR adds util functions to set traceID with span & propagate traceID
between different context.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-30 10:09:03 +08:00
congqixia
6a73860815
enhance: Add open telemetry tracing for compaction (#30168)
Resolves #30167

This PR add tracing for all compaction from the task start in datacoord
and execution procedures in datanode.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-23 10:37:00 +08:00
yihao.dai
8780d65b66
fix: Use channel cp as the dml&start position for import segments (#30107)
This PR discontinuing the subscription to the mq and, instead, employing
the channel checkpoint as the DML and starting position for the import
segments.

issue: https://github.com/milvus-io/milvus/issues/30106

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-22 14:36:55 +08:00
XuanYang-cn
86f48861c1
fix: Add more throughput in related metrics (#30038)
This PR also fixes bugs in l0 compactor where
l0 results would never be removed from datanode

See also: #30099

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-19 11:34:54 +08:00
smellthemoon
e52ce370b6
enhance:don't store logPath in meta to reduce memory (#28873)
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-18 22:06:31 +08:00
XuanYang-cn
632d8b3743
enhance: Change DN channelmanger into interface (#29307)
See also: #28854

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-27 16:00:48 +08:00
yihao.dai
d26b563a8b
feat: Define import API and metadata (#28731)
Define the new rpc and metadata for ImportV2.

see also: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-12-04 19:56:35 +08:00
congqixia
393b1f943c
fix: Reject compaction task with growing segments (#28925)
See also #28924
The compaction task generated before datanode finish `SaveBinlogPath`
grpc call contains segments which are still in Growing state DataNode
shall verify each non-levelzero segments before submit compaction task
to executor

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-04 19:06:40 +08:00
XuanYang-cn
e62edb991a
enhance: Add FlowgraphManager interface (#28852)
- Change flowgraphManager to fgManagerImpl
- Change close to stop
- change execute to controlMemWaterLevel
- Change method name of fgManager for readability
- Add mockery for fgmanager

Issue: #28853

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-30 18:42:32 +08:00
XuanYang-cn
aae7e62729
feat: Add levelzero compaction in DN (#28470)
See also: #27606

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-30 14:30:28 +08:00
XuanYang-cn
321c5c32e3
fix: Separate schedule and check results loop (#28692)
This PR:

- Separates compaction scheduler and check results loop So that slow in
check-loop doesn't influence execution.

- Cleans compaction tasks when drop a vchannel so dropped-channel's
compaction tasks won't be checked over and over again.

  - Skips meta change when meta's already changed, avoid panic
  - Remove not inuse injectDone(bool) parameter

See also: #28628, #28209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-29 10:50:29 +08:00
smellthemoon
79c0edb1d8
enhance:Remove msgbase unnecessary assignments (#28511)
remove some unnecessary assignments, for the reason that
commonpbutil.NewMsgBase has default value.

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-11-24 15:02:39 +08:00
XuanYang-cn
55800ade84
fix: Remove logging extra segmentIDs (#28662)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-23 14:40:23 +08:00
congqixia
af18aa709b
fix: Remove not needed BlockAll call in SyncSegments (#28632)
See also #28628
Previous compaction task blocked the segment sync task and may block the
flowgraph when sync task is generated by auto sync policy This
`BlockAll` call will block forever and cause whole fg stuck

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-22 14:52:23 +08:00
smellthemoon
73f2bab454
enhance:add some log when create client and get component states (#28160)
/kind improvement

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-11-22 09:12:22 +08:00
congqixia
0b905078e7
Use writebuffer, sync manager refactory in datanode (#28320)
See also #27675
This PR make previously merged refactory of datanode go online
- Use write node to replace insert/delete node
- Use write buffer manager to control all buffers
- Use sync manager to control sync tasks instead of flush manager

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-15 15:24:18 +08:00
XuanYang-cn
40d5c902b6
Enable getting multiple segments in plan result (#28350)
Compaction plan result contained one segment for one plan. For l0
compaction would write to multiple segments, this PR expand the segments
number in plan results and refactor some names for readibility.

- Name refactory: - CompactionStateResult -> CompactionPlanResult -
CompactionResult -> CompactionSegment

See also: #27606

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-14 15:56:19 +08:00
groot
3f6b203018
Fix bulkinsert bug that segments are compacted after import (#28192)
Signed-off-by: yhmo <yihua.mo@zilliz.com>
2023-11-07 15:14:26 +08:00
congqixia
c41df18b6d
Add compaction meta in SyncSegmentsRequest (#28199)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-07 10:06:17 +08:00
Jiquan Long
a21042dde7
Fix flushManager.isFull is too slow (#28141)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-11-03 14:42:17 +08:00
smellthemoon
403c6680cc
Add error description (#27959)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-10-27 10:10:13 +08:00
jaime
e386a62fae
Remove recollect segment stats during starting datacoord (#27410)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:26:09 +08:00
congqixia
82b2edc4bd
Replace manual composed grpc call with Broker methods (#27676)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-13 09:55:34 +08:00