Commit Graph

313 Commits

Author SHA1 Message Date
cai.zhang
1c6e850f73
enhance: [cherry-pick] Periodically synchronize segments to datanode watcher (#33420) (#34186)
This PR primarily picks up the SyncSegments functionality, including the
following commits:
- main functionality: https://github.com/milvus-io/milvus/pull/33420
- related fixes:
  - https://github.com/milvus-io/milvus/pull/33664
  - https://github.com/milvus-io/milvus/pull/33829
  - https://github.com/milvus-io/milvus/pull/34056
  - https://github.com/milvus-io/milvus/pull/34156

issue: #32809 
master pr: #33420, #33664, #33829, #34056, #34156

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-27 11:24:05 +08:00
jaime
6423b6c718
enhance: move rocksmq from internal to pkg (#34165)
pr:  https://github.com/milvus-io/milvus/pull/33881
issue:  https://github.com/milvus-io/milvus/issues/33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-26 13:36:05 +08:00
XuanYang-cn
a33b68678d
enhance: [cherry-pick] Move compactor into sub package (#34098)
This PR consists of the following commits:

- enhance: Tidy compactor and remove dup codes (#32198)
- fix: Fix l0 compactor may cause DN from OOM (#33554)
- enhance: Add deltaRowCount in l0 compaction (#33997)
- enhance: enable stream writer in compactions (#32612)
- fix: turn on compression on stream writers (#34067)
- fix: adding blob memory size in binlog serde (#33324)

See also: #32451, #33547, #33998, #31679
pr: #32198, #33554, #33997, #32612

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
2024-06-25 11:16:02 +08:00
yihao.dai
e282e1408e
enhance: Abstract Execute interface for import/preimport task (#33234) (#33607)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

pr: https://github.com/milvus-io/milvus/pull/33234

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 11:17:56 +08:00
yihao.dai
a984e46a29
enhance: Remove rootcoord from datanode broker (#32818)
issue: https://github.com/milvus-io/milvus/issues/32827

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-14 10:03:32 +08:00
yiwangdr
d6e537c91c
fix: allow datanode's server id to be updated (#31597)
issue: #31516

background: the server id field in data node is redundant. session id
already provides the source of truth.

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-08 14:03:29 +08:00
yiwangdr
b1eacb2ae8
feat: datacoord/node watch based on rpc (#32036)
issue: https://github.com/milvus-io/milvus/issues/25309

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-07 15:49:30 +08:00
SimFG
1af084ea6b
enhance: Make datanode exit and case TestProxy faster (#32218)
/kind improvement
issue: #32219

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-16 10:49:20 +08:00
XuanYang-cn
aad3ed3835
fix: [cherry-pick] Skip changing meta if nodeID does not match the channel (#31672)
See also: #31648
pr: #31665, #31694

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-10 15:09:18 +08:00
yihao.dai
4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
2. Enhanced input file length check for binlog import.
3. Removal of duplicated manager in datanode.
4. Renaming of executor to scheduler in datanode.
5. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
XuanYang-cn
39337e09b8
fix: Using zero serverID for metrics (#31518)
Fixes: #31516

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-01 16:55:19 +08:00
congqixia
d9efea2fea
fix: Cleanup write buffer when flowgraph released (#31376)
See also #30137

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 01:33:05 +08:00
jaime
db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
XuanYang-cn
a52a52064d
fix: Use lock and map instead of concurrentMap (#31212)
See also: #31209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-14 18:39:04 +08:00
yihao.dai
c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
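
The first two adjustments above (one retained task per vchannel, then batched updates) can be sketched roughly as follows. This is only an illustration under stated assumptions: `checkpoint`, `cpUpdater`, `AddTask`, and `Drain` are hypothetical names, not the identifiers used in the Milvus code, and the RPC itself is left out. A caller would pass the batch size from the message (128 by default) to `Drain`.

```go
package main

import (
	"sort"
	"sync"
)

// checkpoint is a hypothetical stand-in for a vchannel checkpoint position.
type checkpoint struct {
	VChannel  string
	Timestamp uint64
}

// cpUpdater keeps at most one pending checkpoint per vchannel (assumed design).
type cpUpdater struct {
	mu      sync.Mutex
	pending map[string]checkpoint
}

func newCPUpdater() *cpUpdater {
	return &cpUpdater{pending: make(map[string]checkpoint)}
}

// AddTask records a checkpoint. A newer checkpoint replaces the pending one
// for the same vchannel, so tasks for one channel cannot pile up.
func (u *cpUpdater) AddTask(cp checkpoint) {
	u.mu.Lock()
	defer u.mu.Unlock()
	if old, ok := u.pending[cp.VChannel]; ok && old.Timestamp >= cp.Timestamp {
		return // keep the pending checkpoint, it is at least as new
	}
	u.pending[cp.VChannel] = cp
}

// Drain removes all pending checkpoints and splits them into batches of at
// most batchSize entries. Each batch would be sent in one update call.
func (u *cpUpdater) Drain(batchSize int) [][]checkpoint {
	if batchSize <= 0 {
		batchSize = 1
	}
	u.mu.Lock()
	all := make([]checkpoint, 0, len(u.pending))
	for _, cp := range u.pending {
		all = append(all, cp)
	}
	u.pending = make(map[string]checkpoint)
	u.mu.Unlock()

	// Sort by channel name so that batch contents are deterministic.
	sort.Slice(all, func(i, j int) bool { return all[i].VChannel < all[j].VChannel })

	var batches [][]checkpoint
	for start := 0; start < len(all); start += batchSize {
		end := start + batchSize
		if end > len(all) {
			end = len(all)
		}
		batches = append(batches, all[start:end])
	}
	return batches
}
```

Keeping a map keyed by vchannel bounds the backlog by the number of channels rather than by the number of submitted updates, which is the effect the message describes.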

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
chyezh
0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. set the coordinator graceful stop timeout to 5s
2. change the stop order of the datacoord components
3. change the querynode graceful stop timeout to 900s; this could
potentially be lowered to 600s once graceful stop is smooth
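
A graceful stop bounded by a timeout can be sketched as below. This is a generic Go pattern, not the code from the PR; `stopWithTimeout` and `errStopTimeout` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"time"
)

// errStopTimeout is a hypothetical sentinel error for an expired graceful stop.
var errStopTimeout = errors.New("graceful stop timed out")

// stopWithTimeout runs stop in its own goroutine and waits at most timeout
// for it to finish. On timeout the caller can continue shutting down instead
// of hanging; the stop goroutine is left to finish by itself.
func stopWithTimeout(stop func(), timeout time.Duration) error {
	done := make(chan struct{})
	go func() {
		defer close(done)
		stop()
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return nil
	case <-timer.C:
		return errStopTimeout
	}
}
```

With the values from the message, a coordinator would call something like `stopWithTimeout(stopFn, 5*time.Second)` and a querynode `stopWithTimeout(stopFn, 900*time.Second)`, where `stopFn` is whatever the component uses to stop gracefully.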

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
yiwangdr
c6665c2a4c
test: support multiple data/querynodes in integration test (#30618)
issue: https://github.com/milvus-io/milvus/issues/29507

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-21 11:54:53 +08:00
yihao.dai
c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces new importv2 roles for datanode:
1. Executor: executes tasks. An import task is divided into the
following steps: read data -> hash data -> sync data;
2. Manager: manages all the tasks;
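
The three executor steps can be pictured as a small pipeline. The sketch below is illustrative only: `row`, `readData`, `hashData`, `syncData`, and `executeImport` are hypothetical names, and it assumes that "hash data" means distributing rows into shards by hashing a primary key, which the message does not spell out.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// row is a hypothetical imported record.
type row struct {
	PK    string
	Value string
}

// readData stands in for reading rows from the import files; here it only
// copies an in-memory slice.
func readData(src []row) []row {
	return append([]row(nil), src...)
}

// hashData distributes rows into numShards buckets by hashing the primary key,
// so rows with the same key always land in the same bucket.
func hashData(rows []row, numShards int) [][]row {
	shards := make([][]row, numShards)
	for _, r := range rows {
		h := fnv.New32a()
		h.Write([]byte(r.PK))
		idx := int(h.Sum32() % uint32(numShards))
		shards[idx] = append(shards[idx], r)
	}
	return shards
}

// syncData hands every non-empty bucket to sink, a stand-in for persisting it.
func syncData(shards [][]row, sink func(shard int, rows []row) error) error {
	for i, rows := range shards {
		if len(rows) == 0 {
			continue
		}
		if err := sink(i, rows); err != nil {
			return fmt.Errorf("sync shard %d: %w", i, err)
		}
	}
	return nil
}

// executeImport chains the steps: read data -> hash data -> sync data.
func executeImport(src []row, numShards int, sink func(shard int, rows []row) error) error {
	if numShards <= 0 {
		return fmt.Errorf("numShards must be positive, got %d", numShards)
	}
	return syncData(hashData(readData(src), numShards), sink)
}
```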

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
congqixia
fc0d007bd1
enhance: Add MemoryHighSyncPolicy back to write buffer manager (#29997)
See also #27675

This PR adds back the MemoryHighSyncPolicy implementation. It also
changes MinSegmentSize & CheckInterval to configurable param items.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 19:03:04 +08:00
chyezh
6d63fb5d3f
fix: panic with datanode negative wait group counter (#30135)
issue: #29170

Signed-off-by: chyezh <chyezh@outlook.com>
2024-01-30 18:15:04 +08:00
XuanYang-cn
632d8b3743
enhance: Change DN channelmanager into interface (#29307)
See also: #28854

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-27 16:00:48 +08:00
SimFG
dd9c61831d
enhance: Support to get the param value in the runtime (#29297)
/kind improvement
issue: #29299

Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-22 18:36:44 +08:00
congqixia
4731c1b0d5
enhance: make SyncManager pool size refreshable (#29224)
See also #29223

This PR makes `conc.Pool` resizable by adding a `Resize` method to it.
It also makes the newly added datanode `MaxParallelSyncMgrTasks` config
refreshable.

---------

Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>
2023-12-15 09:58:43 +08:00
congqixia
25a4525297
enhance: Change sync manager parallel config item (#29216)
Since the sync manager is now global in datanode, the old
`maxParallelSyncTaskNum` no longer fits the current implementation.

This PR adds a new param item for sync mgr parallel control and enlarges
the default value.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-14 20:46:41 +08:00
wayblink
51f870da7e
feat: Introduce channelCheckpointUpdater to reduce goroutine use in ttNode (#28570)
/kind enhancement

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-12-12 13:48:42 +08:00
XuanYang-cn
e62edb991a
enhance: Add FlowgraphManager interface (#28852)
- Change flowgraphManager to fgManagerImpl
- Change close to stop
- Change execute to controlMemWaterLevel
- Change method name of fgManager for readability
- Add mockery for fgmanager

Issue: #28853

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-11-30 18:42:32 +08:00
jaime
b1e0a27f31
enhance: Add logs for each step during service initialization (#28624)
/kind improvement

Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-11-27 16:30:26 +08:00
congqixia
a2fe9dad49
enhance: Make etcd kv request timeout configurable (#28661)
See also #28660
This PR adds a config item for the etcd kv request timeout and
syncs the default timeout to the same value for the etcdKV & tikv configs.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-23 19:34:23 +08:00
congqixia
bed7467f20
enhance: Remove commented code and fix naming issue (#28450)
This PR removes all the commented code and files from PR #28320

For naming issue:
- Renaming `MinCheckpoint` to `EarliestPosition`, see #28320 comment
- Renaming `writebuffer.Mananger` to `BufferMananger`, see #27874
comment

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-16 00:22:20 +08:00
congqixia
0b905078e7
Use writebuffer, sync manager refactoring in datanode (#28320)
See also #27675
This PR brings the previously merged datanode refactoring online:
- Use write node to replace insert/delete node
- Use write buffer manager to control all buffers
- Use sync manager to control sync tasks instead of flush manager

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-15 15:24:18 +08:00
congqixia
b1eb1ea506
Refine datanode Timetick Sender (#28393)
- Use explicit lifetime control methods: `Start` and `Stop`
- Allow controlling the retry option
- Make sure the tt sender worker exits after `Stop` returns
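
The explicit `Start`/`Stop` lifetime with a worker that is guaranteed to have exited once `Stop` returns is a common Go pattern. A minimal sketch follows, assuming a ticker-driven worker. `sender`, `newSender`, and the `send` callback are hypothetical and do not mirror the actual Timetick Sender, and the retry option is omitted.

```go
package main

import (
	"sync"
	"time"
)

// sender is a hypothetical periodic reporter with explicit lifetime control.
type sender struct {
	interval  time.Duration
	send      func()
	stopCh    chan struct{}
	wg        sync.WaitGroup
	startOnce sync.Once
	stopOnce  sync.Once
}

// newSender creates a sender that calls send every interval once started.
// interval must be positive, otherwise time.NewTicker panics.
func newSender(interval time.Duration, send func()) *sender {
	return &sender{
		interval: interval,
		send:     send,
		stopCh:   make(chan struct{}),
	}
}

// Start launches the worker goroutine. Repeated calls have no effect.
func (s *sender) Start() {
	s.startOnce.Do(func() {
		s.wg.Add(1)
		go s.work()
	})
}

// work calls send on every tick until the stop channel is closed.
func (s *sender) work() {
	defer s.wg.Done()
	ticker := time.NewTicker(s.interval)
	defer ticker.Stop()
	for {
		select {
		case <-s.stopCh:
			return
		case <-ticker.C:
			s.send()
		}
	}
}

// Stop signals the worker and blocks until it has exited, so no send call
// can happen after Stop returns. Repeated calls are safe.
func (s *sender) Stop() {
	s.stopOnce.Do(func() { close(s.stopCh) })
	s.wg.Wait()
}
```

Waiting on the `sync.WaitGroup` inside `Stop` is what provides the "worker exits after `Stop` returns" guarantee; closing the channel alone would only request the exit.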

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-13 21:46:20 +08:00
SimFG
e3b7fdac61
Delay the cancellation of ctx when stopping the node (#28247)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-11-08 03:20:17 +08:00
Filip Haltmayer
6b1a106a31
Moving etcd client into session (#27069)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-10-27 07:36:12 +08:00
jaime
ec1fe3549e
Add a stop hook to clean session (#27564)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:24:10 +08:00
congqixia
82b2edc4bd
Replace manual composed grpc call with Broker methods (#27676)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-13 09:55:34 +08:00
XuanYang-cn
56c94cdfa7
Add channel manager in DataNode (#27308)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-10-08 21:37:33 +08:00
XuanYang-cn
5c5f9aa05e
Enhance newDataSyncService (#27277)
- Add flowgraph.Assemble to assemble nodes in flowgraph.go
- Remove fgCtx in newDataSyncService
- Add newServiceWithEtcdTickler func, reducing the number of params to 3
- Remove unnecessary params
  - config.maxQueueLength, config.maxParallelish

See also: #27207

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-09-27 11:07:25 +08:00
jaime
7f7c71ea7d
Decoupling client and server API in types interface (#27186)
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>

Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-09-26 09:57:25 +08:00
SimFG
26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
XuanYang-cn
09505ea78e
Move etcd watch related code into eventmanager (#27192)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-09-19 15:27:38 +08:00
yihao.dai
4b2802033d
Fix datanode panic due to concurrent compaction and delete processing (#27167)
Co-authored-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-09-18 19:35:29 +08:00
yah01
00c65fa0d7
Refine QueryNode errors (#27013)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-12 16:07:18 +08:00
XuanYang-cn
84253f255e
Fix datanode graceful stop panic (#25932)
See also: #25925

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-07-28 10:11:08 +08:00
SimFG
f9e2d00f91
Prevent exclusive consumer exception in pulsar (#25376)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-07-12 17:26:30 +08:00
wayblink
fc12d3997c
Rename newTimeTickManager to newTimeTickSender (#25415)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-07-12 14:56:28 +08:00
groot
96c987ed62
Bulkinsert supports partition keys (#25284)
Signed-off-by: yhmo <yihua.mo@zilliz.com>
2023-07-11 15:18:28 +08:00
xige-16
33c2012675
Add more metrics (#25081)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-26 17:52:44 +08:00
yiwangdr
c7b851f870
add interface for non-watch metakv (#25092)
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2023-06-26 09:20:44 +08:00
wayblink
bfae6b49af
Remove datanode timetick mq, use rpc to report instead (#23156)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-14 14:16:38 +08:00
congqixia
41af0a98fa
Use go-api/v2 for milvus-proto (#24770)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-09 01:28:37 +08:00