20168 Commits

Author SHA1 Message Date
Gao
f6cd84161c
enhance: ensure autoindex default metric type compatibility (#34479)
issue: #34304 
pr: #34261

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-07-08 19:48:14 +08:00
Chun Han
2f38483418
fix: lose partitionIDs when scalar pruning and refine segment prune ratio metrics(#30376) (#34475)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34477

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-08 19:44:13 +08:00
wayblink
56a74e72f7
fix: [cherry-pick]fix can't enqueue when compaction queue is full(#34445) (#34469)
issue: #30633
pr: #34445

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-08 19:20:13 +08:00
yihao.dai
0d7ba810b3
enhance: Check segment existence when FlushSegments and add some key logs (#34438) (#34472)
Check if the segment exists during FlushSegments and add some key logs
in write path.

issue: https://github.com/milvus-io/milvus/issues/34255

pr: https://github.com/milvus-io/milvus/pull/34438

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 19:00:13 +08:00
yihao.dai
0732167c87
fix: Fix incorrect segment num rows (#34441) (#34474)
Repeated calls to UpdateStatistics produced an incorrect segment row count; this PR corrects it.

issue: https://github.com/milvus-io/milvus/issues/34440

pr: https://github.com/milvus-io/milvus/pull/34441

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-08 17:30:12 +08:00
XuanYang-cn
26ac76944b
fix: Accidentally exit the check loop (#34480)
See also: #34460
pr: #34481

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-08 17:14:15 +08:00
congqixia
7c3c9c2ed4
fix: [2.4] Add nbits parameter check for IVF_PQ (#34451) (#34473)
Cherry-pick from master
pr: #34451
See also #34426

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-08 17:10:12 +08:00
zhuwenxing
f2d0517f96
test: [cherry-pick]add testcase for count query (#34471)
pr: https://github.com/milvus-io/milvus/pull/34453

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-07-08 12:54:12 +08:00
Gao
a60e2a65ff
enhance: change autoindex default metric type (#34277)
issue: #34304 
pr: #34261

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-07-08 10:52:14 +08:00
congqixia
014820e9d2
fix: [2.4] Write padding into mmap file in case of SIGBUS (#34443) (#34455)
Cherry-pick from master
pr: #34443
See also #34442

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-08 10:06:19 +08:00
jaime
326370c1be
enhance: add disk quota and max collections into db properties (#34386)
issue: https://github.com/milvus-io/milvus/issues/34385
pr: #34368

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-05 18:22:17 +08:00
wei liu
d3e94f9861
enhance: Use Blocked Bloom Filter instead of basic bloom filter impl (#34377)
issue: #32995
pr: #33405
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with the old-version bf impl, but falling back
to an old milvus version may cause Bloom filter deserialization to fail.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is roughly
3.5 to 4 times faster than the basic Bloom filter in both querying and
construction, at the cost of about 26% more memory (per the figures below).

Block BF construct time {"time": "54.128131ms"}
Block BF size {"size": 3021578}
Block BF Test cost {"time": "55.407352ms"}
Basic BF construct time {"time": "210.262183ms"}
Basic BF size {"size": 2396308}
Basic BF Test cost {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is about 6 times faster than the basic Bloom filter
in querying.

Block BF TestLocation cost {"time": "529.97183ms"}
Basic BF TestLocation cost {"time": "3.197430181s"}
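
As a rough illustration of the idea described above (not the implementation this PR ships), the following Go sketch shows a blocked Bloom filter in which each key touches a single 512-bit block, and a key's hash location is computed once and reused across several filters. The names `BlockedBloom`, `Location`, `Locate`, `AddLocation`, and `TestLocation`, the FNV-1a hash, the mixing function, and the block size are all assumptions made for this example.

```go
package main

import "hash/fnv"

const (
	wordsPerBlock = 8                  // 8 x 64 bits, roughly one cache line
	blockBits     = wordsPerBlock * 64 // 512 bits per block
)

// BlockedBloom is a toy blocked Bloom filter: every key maps to exactly
// one block, so a lookup touches a single small region of memory.
type BlockedBloom struct {
	blocks [][wordsPerBlock]uint64
	k      int // number of bits set per key inside its block
}

// NewBlockedBloom creates an empty filter. numBlocks must be > 0.
func NewBlockedBloom(numBlocks, k int) *BlockedBloom {
	return &BlockedBloom{blocks: make([][wordsPerBlock]uint64, numBlocks), k: k}
}

// Location holds the hashes of one key. It does not depend on any
// particular filter, so it can be computed once and reused for many.
type Location struct{ h1, h2 uint64 }

// mix is the splitmix64 finalizer, used here to derive a second hash.
func mix(x uint64) uint64 {
	x ^= x >> 30
	x *= 0xbf58476d1ce4e5b9
	x ^= x >> 27
	x *= 0x94d049bb133111eb
	x ^= x >> 31
	return x
}

// Locate hashes the key once (FNV-1a is an arbitrary stdlib choice).
func Locate(key []byte) Location {
	h := fnv.New64a()
	h.Write(key)
	h1 := h.Sum64()
	return Location{h1: h1, h2: mix(h1)}
}

// bitAt returns the i-th probed bit (0..511) inside a block.
func bitAt(loc Location, i int) uint64 {
	step := loc.h2>>32 | 1 // odd step: the first 512 probes never repeat
	return (loc.h2 + uint64(i)*step) % blockBits
}

// AddLocation sets k bits in the single block selected by loc.h1.
func (f *BlockedBloom) AddLocation(loc Location) {
	blk := &f.blocks[loc.h1%uint64(len(f.blocks))]
	for i := 0; i < f.k; i++ {
		bit := bitAt(loc, i)
		blk[bit/64] |= 1 << (bit % 64)
	}
}

// TestLocation reports whether all k bits are set (may be a false positive).
func (f *BlockedBloom) TestLocation(loc Location) bool {
	blk := &f.blocks[loc.h1%uint64(len(f.blocks))]
	for i := 0; i < f.k; i++ {
		bit := bitAt(loc, i)
		if blk[bit/64]&(1<<(bit%64)) == 0 {
			return false
		}
	}
	return true
}
```

To probe many filters for one primary key, a caller would compute `loc := Locate(pk)` once and then call `TestLocation(loc)` on each filter, which is the hash-reuse pattern the multi-filter test above refers to.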

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-05 17:04:10 +08:00
zhagnlu
173c02902e
enhance: refactor variable column to reduce memory cost (#33875) (#34367)
cherry-pick commit from master:
pr: #33875

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-05 15:52:09 +08:00
congqixia
f5a0353fd1
enhance: [2.4] Continue loop when reassign channel fails (#34331) (#34425)
Cherry-pick from master
pr: #34331
Logs are confusing when the `Reassign` channel operation fails, because both
the success and failure logs are printed in a row. This PR continues the loop
to avoid this output.
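
The control flow described here can be sketched in Go as follows. This is a minimal illustration only: `reassignAll`, the `reassign` callback, `ErrNoNode`, and the log strings are hypothetical names, not the actual datacoord code.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoNode is a hypothetical error used to simulate a failed reassignment.
var ErrNoNode = errors.New("no node available")

// reassignAll tries to reassign every channel and returns the log lines it
// would emit. On failure it records the failure and continues with the next
// channel, so a failed channel never also produces a success line.
func reassignAll(channels []string, reassign func(string) error) []string {
	var logs []string
	for _, ch := range channels {
		if err := reassign(ch); err != nil {
			logs = append(logs, fmt.Sprintf("reassign failed: channel=%s err=%v", ch, err))
			continue // without this, the success line below would follow the failure
		}
		logs = append(logs, fmt.Sprintf("reassign succeeded: channel=%s", ch))
	}
	return logs
}
```

With the `continue` in place, each channel yields exactly one log line, either the failure or the success.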

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-05 11:14:08 +08:00
Gao
261b61e875
fix: centroids file not removed when data skew in major compaction (#34359)
issue: https://github.com/milvus-io/milvus/issues/30633
pr: #34050

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-07-05 10:42:10 +08:00
zhagnlu
74da97796b
enhance: Enhance and correct exception module (#34366)
cherry-pick commit from master:
pr: #33705

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-05 10:06:10 +08:00
PowderLi
ba2c232331
fix: [cherry-pick] [restful v2] count(*) & hook (#34433)
issue: #31224 #34374
pr: #34369

for query api:

1. param filter is not required
2. param limit is useless while outputFields = [count(*)]

add hook about grpc call

---------

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-07-05 09:52:10 +08:00
zhagnlu
190a2ca7b8
enhance: reduce cpp ut test cost time (#34414)
#34413
cherry-pick part from master commit: pr: #33358

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-04 20:40:10 +08:00
congqixia
b402485292
fix: [2.4] Skip l0 segments when syncing segments to datanodes (#34389)
Cherry-pick from master
pr: #34388
See also #34387

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-04 18:02:14 +08:00
yihao.dai
a57c9e61fc
enhance: [cherry-pick] optimize datanode cpu usage and correct the update logic of ttchecker (#34383)
This PR cherry-picks the following commits:
- Try to improve cpu usage by refactoring the ttchecker logic and
caching string. https://github.com/milvus-io/milvus/pull/33267
- Correct the update logic of timerecorder in the flowgraph to avoid
false failure: "some node(s) haven't received input".
https://github.com/milvus-io/milvus/pull/34339

issue: https://github.com/milvus-io/milvus/issues/33266,
https://github.com/milvus-io/milvus/issues/34337

pr: https://github.com/milvus-io/milvus/pull/33267,
https://github.com/milvus-io/milvus/pull/34339

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
2024-07-04 16:34:17 +08:00
Chun Han
5831908aa2
enhance: reconstruct scalar part's code for segment-pruner(#30376) (#34365)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34346
1. support more complex expr
2. add more ut test for unrelated fields

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 16:30:10 +08:00
XuanYang-cn
0f1915ef24
fix: DataNode might OOM by estimating based on MemorySize (#34203)
See also: #34136
pr: #34201

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-04 15:24:10 +08:00
shaoting-huang
dd4dfbcd8d
enhance: [cherry-pick] Batch pick PRs related to data codec (#34345)
This PR cherry-picks the following commits related to data codec
- Fix data codec writer close. #33818
- Legacy code clean up. #33838

issue: #33813 #33839 

pr: #33818 #33838

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-07-04 15:08:11 +08:00
foxspy
b33243f236
enhance: Update Knowhere version (#34405)
1. cherry-pick pr #34223 (update the parameter dataset from reference to
shared_ptr)
2. update knowhere version from v2.3.5 to v2.3.6
(https://github.com/zilliztech/knowhere/releases/tag/v2.3.6)

/kind branch-feature

---------

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
2024-07-04 14:46:11 +08:00
wayblink
b3aec4c8e1
fix: [cherry-pick] minor fixes for major compaction (#34402)
This PR cherry-picks the following commits:

- fix: Avoid datarace in clustering compaction #34288
- fix: remove isFull check in compaction.enqueue #34338

issue: #30633 
pr: #34288 #34338

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-04 14:44:10 +08:00
chyezh
a1a0a56f86
enhance: async search and retrieve in cgo (#34200)
issue: #33132
pr: #33133
other pr: #33228, #34084, #33946

- implement future-based cgo utility
- async search and retrieve in cgo
- modify gc configuration document

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-04 13:02:09 +08:00
congqixia
3efb78e154
enhance: [2.4] Tag gotestsum version when install deps (#34308) (#34401)
Cherry-pick from master
pr: #34308
Tag gotestsum via ldflags to avoid reinstalling the gotestsum binary on each
local run

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-04 10:30:09 +08:00
Chun Han
e12b701c03
enhance: add metrics for segment prune latency(#30376) (#34364)
related: #30376
pr: https://github.com/milvus-io/milvus/pull/34094

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 10:14:09 +08:00
aoiasd
9087b6f42e
enhance: [Cherry-Pick] support mark error as user error (#33498) (#34396)
relate: https://github.com/milvus-io/milvus/issues/33492
pr: https://github.com/milvus-io/milvus/pull/33498
---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-04 10:08:10 +08:00
Chun Han
014cb7b071
enhance: use configured max topk for iterator when input topk exceeds it(#34292) (#34293)
related: #34292
pr: https://github.com/milvus-io/milvus/pull/34290

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-04 09:58:09 +08:00
cai.zhang
bc1746f96c
enhance: [cherry-pick] Optimize clustering compaction (#34313) (#34398)
issue: #30633

master pr: #34313

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-04 09:52:09 +08:00
jaime
3e0034bea2
enhance: cherry pick some improved PRs from the master branch (#34391)
issue:
https://github.com/milvus-io/milvus/issues/33205,https://github.com/milvus-io/milvus/issues/33342
pr: https://github.com/milvus-io/milvus/pull/33530
pr: #33343
pr: #33206

---------

Signed-off-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: xiaofanluan <xiaofan.luan@zilliz.com>
Co-authored-by: Xiaofan <83447078+xiaofan-luan@users.noreply.github.com>
2024-07-03 19:40:11 +08:00
aoiasd
07daa8f12b
enhance: [Cherry-pick] avoid maintaining checkpoint info in sync manager (#33413) (#34285)
relate: https://github.com/milvus-io/milvus/issues/32915
pr: https://github.com/milvus-io/milvus/pull/33413

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 19:02:09 +08:00
wayblink
ec91e556fe
enhance: [cherry-pick] Refine clustering_compaction_task retry mechanism (#34384)
issue: #30633 
pr: #34194

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-07-03 18:20:17 +08:00
aoiasd
7000cec365
enhance: [Cherry-pick] Merge query stream result for reduce delete task (#32855) (#34281)
relate: https://github.com/milvus-io/milvus/issues/32854
pr:  https://github.com/milvus-io/milvus/pull/32855

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 18:08:09 +08:00
cai.zhang
0c01ace0d2
fix: [cherry-pick] Only load or release Flushed segment in datanode meta (#34393)
issue: #34376, #34375, #34379

master pr: #34390

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-03 17:44:11 +08:00
congqixia
945f0106f6
fix: [2.4] Use raw parameter value to perform CAS (#34373)
Cherry-pick from master
pr: #34343
See also #34342

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-03 17:20:10 +08:00
cai.zhang
e944c308c7
enhance: [cherry-pick] Skip pick worker when task doesn't need to execute actually (#34382)
issue: #34347 

master pr: #34348

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-03 15:54:08 +08:00
aoiasd
18668aaace
enhance:[Cherry-Pick] change access log write cache default config (#34352)
pr: https://github.com/milvus-io/milvus/pull/34351

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-07-03 15:10:09 +08:00
Patrick Weizhi Xu
88550081c3
enhance: [skip e2e][2.4] update the version of MV (#34380)
issue: #29892 
master PR: https://github.com/milvus-io/milvus/pull/34378

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
(cherry picked from commit 92f3f4521a67fcc72074a4eb5a0bb644960293d9)
2024-07-03 14:28:16 +08:00
SimFG
b3c5eb29ed
enhance: [2.4] the proxy metric in the query request (#34356)
/kind improvement
- issue: #33306
- pr: #33307

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-07-03 10:00:09 +08:00
wayblink
c62bf8a0b0
fix: [Cherry-pick] Pick major compaction fixes and optimizations (#34360)
This PR cherry-picks the following commits:

- fix: sync partition stats blocking balance task #33742
- fix: Fix meta prefix overlap bug #33830
- fix: Small fixes of major compaction #33929
- fix: Fix memory buffer error & some renaming #33850
- fix: sync part stats task cannot be finished #34027 
- Add an option to enable/disable vector field clustering key #34097
- fix: fix error ignore in compactor #34169
- fix: load major compaction partial result #34052
- Use new stream segment reader in clustering compaction #34232

issue: #30633
pr: #33742 #33830 #33929 #33850 #34027 #34097 #34169 #34052 #34232

---------

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: Chun Han <116052805+MrPresent-Han@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2024-07-03 09:53:37 +08:00
zhenshan.cao
760b3fafd5
enhance: Refine compaction (#33982) (#34363)
This PR cherry-picks the following commits related to data compaction:
- enhance: Refine compaction.
[#33982](https://github.com/milvus-io/milvus/pull/33982)
- fix l0 compaction may miss some sealed segments.
[#33980](https://github.com/milvus-io/milvus/pull/33980)

issue : https://github.com/milvus-io/milvus/issues/32939
https://github.com/milvus-io/milvus/issues/33955

pr : https://github.com/milvus-io/milvus/pull/33982

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-07-03 09:52:49 +08:00
elstic
fb88267855
test: [cherry-pick] update test case (#34109)
pr: https://github.com/milvus-io/milvus/pull/34108

Signed-off-by: elstic <hao.wang@zilliz.com>
2024-07-02 22:04:08 +08:00
cai.zhang
6cb0f1ff74
fix: [cherry-pick] Sync the sealed and flushed segments to datanode (#34301) (#34318)
issue: #33696

master pr: #34301

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 19:36:09 +08:00
wei liu
ad545b6fa6
enhance: refine misleading param name for bloom filter parallel factor (#34335)
pr: #34334

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-07-02 19:24:09 +08:00
congqixia
6b348e4e91
enhance: [2.4] Add go-deadlock as unittest only dependency (#33063) (#34322)
Cherry-pick from master
pr: #33063
See also #33062

This PR:

- Add lock.RWMutex & lock.Mutex alias to switch implementation based on
build flags
- When the build flags include test, use go-deadlock to detect possible
deadlocks
- Replace all sync.RWMutex & sync.Mutex in datacoord pkg
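
The alias approach in the list above might look roughly like the Go sketch below. It is an illustration under stated assumptions, not the PR's code: it is written as `package main` so it stands alone (the real aliases would live in a `lock` package), `segmentCounter` is a hypothetical caller, and the go-deadlock library is assumed to be `github.com/sasha-s/go-deadlock`, whose `deadlock.Mutex` and `deadlock.RWMutex` types mirror the `sync` ones.

```go
//go:build !test

package main

import "sync"

// Default build: the aliases resolve to the standard library types.
//
// A sibling file guarded by `//go:build test` would instead declare
//
//	type Mutex = deadlock.Mutex
//	type RWMutex = deadlock.RWMutex
//
// (assumed import: github.com/sasha-s/go-deadlock), so building with
// `-tags test` switches every user of these aliases to the
// deadlock-detecting implementation without touching call sites.
type (
	Mutex   = sync.Mutex
	RWMutex = sync.RWMutex
)

// segmentCounter is a hypothetical caller. It only names the alias, so it
// compiles unchanged against either implementation.
type segmentCounter struct {
	mu    RWMutex
	count int
}

// Inc increments the counter under the write lock.
func (c *segmentCounter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.count++
}

// Get reads the counter under the read lock.
func (c *segmentCounter) Get() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.count
}
```

Because these are type aliases rather than new named types, existing code that passes a `*sync.RWMutex` around keeps compiling in the default build.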

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-02 18:48:10 +08:00
wayblink
99586066f5
feat: [cherry-pick] Major compaction (#34326)
This PR cherry-picks the following commits:
- fix: speed up segment lookup via channel name in datacoord (#33530),
  needed by the next commit
- feat: Major compaction (#33620)

issue: #30633
pr: #33620

---------

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: yiwangdr <80064917+yiwangdr@users.noreply.github.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-07-02 18:29:01 +08:00
cai.zhang
f11e421839
enhance: [cherry-pick] Remove compaction plans on the datanode (#33548) (#34312)
issue: #33546

master pr: #33548

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-02 17:39:08 +08:00
congqixia
975f3bbeab
enhance: [2.4] Refine max length exceeded error message (#34300) (#34323)
Cherry-pick from master
pr: #34300
This PR makes the max-length-exceeded error message for varchar & string array
fields clearer. It also fixes a minor issue where the error string format and
the number of arguments did not match.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-02 15:52:09 +08:00