zhagnlu
9248a6a149
fix: remove sve flags ( #32270 )
...
#32129
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-04-16 15:21:19 +08:00
chyezh
e19d17076f
fix: delete may be lost when lru cache is enabled, some fields should be reset when ReleaseData ( #32012 )
...
issue: #30361
- Deletes may be lost when a segment is not in the data-loaded state in the lru
cache. Skip filtering to fix it.
- `stats_` and `variable_fields_avg_size_` should be reset on
`ReleaseData`.
- Remove the repeated delta log load operation in lru.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-16 11:17:20 +08:00
wei liu
0d849a6c0a
fix: fix collectionInfo leak in datacoord ( #32175 )
...
issue: #32029
datacoord's meta lacked logic to clean up collection info. This PR
cleans up collection info after a channel is dropped, to avoid a
collection info leak in datacoord.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-15 16:33:19 +08:00
Chun Han
337cc0756d
fix: lack of good results for insufficient ef ( #29883 ) ( #32080 )
...
related: #29883
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-04-13 22:13:23 +08:00
Jiquan Long
4fb85be525
fix: put inverted index into local storage ( #32209 )
...
issue: https://github.com/milvus-io/milvus/issues/32154
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-04-13 21:57:19 +08:00
sre-ci-robot
454984aa4e
[automated] Update Knowhere Commit ( #32181 )
...
Update Knowhere Commit
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-04-12 01:25:19 +08:00
Alexander Guzhva
b5455d176e
fix: dynamically resolve whether SVE is available for bitset ( #32137 )
...
Issue: #32129
This PR adds dynamic SVE detection for ARM CPU families to the bitset
code.
It also allows the code to be compiled when the compiler does not support
NEON (arm-v7).
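Runtime detection of this kind can be sketched as follows. This is a minimal illustration and not the code from the PR: it assumes Linux on AArch64, where `getauxval(AT_HWCAP)` exposes the `HWCAP_SVE` bit, and the function names `CpuSupportsSve` and `SelectBitsetBackend` are hypothetical. On any other platform the check compiles to a constant `false`, so a generic implementation is selected.

```cpp
#include <cassert>
#include <cstring>

#if defined(__linux__) && defined(__aarch64__)
#include <asm/hwcap.h>  // HWCAP_SVE
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#endif

// Hypothetical helper: returns true only if the running CPU reports SVE.
// On non-Linux or non-AArch64 builds this is a constant false.
inline bool CpuSupportsSve() {
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;
#endif
}

// Hypothetical dispatch: the decision is made once and cached, so the
// hot path never repeats the hardware query.
inline const char* SelectBitsetBackend() {
    static const bool has_sve = CpuSupportsSve();
    return has_sve ? "sve" : "generic";
}
```

The point of resolving this at run time rather than through compile flags is that one binary can run on ARM machines both with and without SVE.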
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-04-11 19:27:18 +08:00
Patrick Weizhi Xu
52ae47c850
enhance: gather materialized view search info once per request ( #31996 )
...
issue: #29892
This PR:
1. Move the gathering of materialized view search info to the point where
the search plan is created, before it goes to each segment, to avoid
repeated work and concurrent access to the plan node from multiple
threads.
2. Restrict the supported MV type to `VARCHAR`.
3. Add an integration test.
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-04-11 15:21:19 +08:00
Chun Han
f3f2a5a7e9
fix: evicted segments in the serverless mode ( #31959 ) ( #31961 )
...
related: #31959
1. reset segment index status after evicting to lazyload=true
2. reset num_rows to null_opt
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-04-10 15:15:19 +08:00
Cai Yudong
a0a4ec8b67
enhance: make range search param check message more meaningful ( #32006 )
...
Issue: #31970
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-09 16:17:26 +08:00
cai.zhang
1b767669a4
enhance: Throw error instead of crash when index cannot be built ( #31844 )
...
issue: #27589
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-04-09 11:51:18 +08:00
chyezh
7b400252ff
fix: add configurable disk capacity config for lru and fix some bugs ( #31977 )
...
issue: #30361
- Add configurable disk capacity limit
- fix bitset reset logic
- make insert record reinsert after clear
Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-08 15:55:16 +08:00
cqy123456
aba4993c6c
fix: fix some missing fp16/bf16 code in segcore. ( #31771 )
...
issue: https://github.com/milvus-io/milvus/issues/22837
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-04-07 14:13:16 +08:00
Alexander Guzhva
cae5722229
enhance: performance improvements for the bitset ( #31753 )
...
Issue: #31752
This PR improves the performance of the bitset utilities (introduced in PR
#30454 ), including varchar filtering.
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-04-06 05:19:22 +08:00
zhagnlu
b2669e26dc
fix: reduce thread pool test time ( #31893 )
...
#31877
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-04-05 10:05:12 +08:00
zhagnlu
d6d3b01a04
fix: remove thread pool timeout test because of high CPU load ( #31879 )
...
#31877
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-04-03 15:55:38 +08:00
Jiquan Long
03e0db109e
fix: update Cargo.lock ( #31859 )
...
issue: #31681
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-04-03 14:18:23 +08:00
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core ( #31800 )
...
Issue: #22837
Move and rename the following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
PowderLi
d299fa502e
fix: use milvus-io/vcpkg ( #31770 )
...
issue: #31769
GitHub disabled the XZ repository because of CVE-2024-3094
Signed-off-by: PowderLi <min.li@zilliz.com>
2024-04-01 15:01:13 +08:00
chyezh
5655ec4fc0
enhance: add mmap usage metrics ( #31708 )
...
issue: #31707
Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-01 11:35:12 +08:00
congqixia
3ffe126dc7
enhance: Refine error message when search vector type not matched ( #31725 )
...
Previously, the error message only reported that the mismatch happened,
without the field name or vector type.
This PR adds the field name and vector type information to the error
message.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-01 10:11:12 +08:00
Cai Yudong
675a5dc822
fix: Save traceID and spanID as std::vector into search config ( #31278 )
...
Issue: #30961
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-03-29 14:29:11 +08:00
Jiquan Long
9750e78f1d
enhance: lock tantivy dependencies ( #31688 )
...
issue: https://github.com/milvus-io/milvus/issues/31681
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-29 10:15:17 +08:00
Chun Han
b99c46246c
enhance: ban groupby on binary vector( #31134 ) ( #31659 )
...
related: #31134
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-03-28 15:19:10 +08:00
Jiquan Long
e33dba8afe
fix: [skip-e2e] use zstd-sys 2.0.9 ( #31682 )
...
fix: #31681
/kind improvement
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-28 15:14:10 +08:00
SimFG
b1a1cca10b
feat: add more operation detail info for better allocation ( #30438 )
...
issue: #30436
---------
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00
Jiquan Long
4eb4df1e81
fix: predict inverted index resource usage more reasonably ( #31615 )
...
/kind improvement
issue: #31617
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-27 19:33:09 +08:00
congqixia
655097f171
fix: Verify PlaceHolderValue type before search ( #31626 )
...
See also #31625
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-27 17:49:10 +08:00
groot
5be395354c
fix: minio ssl compatible issue ( #31607 )
...
issue: https://github.com/milvus-io/milvus/issues/30709
Signed-off-by: yhmo <yihua.mo@zilliz.com>
2024-03-27 14:41:20 +08:00
sre-ci-robot
678cb187e8
[automated] Update Knowhere Commit ( #31630 )
...
Update Knowhere Commit
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-27 01:15:10 +08:00
zhagnlu
659ad81ab7
fix: remove deprecated ut test ( #31499 )
...
#31498
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-26 14:01:07 +08:00
Alexander Guzhva
c4b37fb285
enhance: Custom bitset and bitsetview prototypes ( #30454 )
...
Issue: #31285
Basically, I've replaced `FixedVector<bool>` and `boost::dynamic_bitset`
with custom bitset and bitsetview in order to reduce the memory
bandwidth & increase performance for the filtering.
This PR is for internal use only.
Current progress (numbers are for GCC 9.5.0 on Ubuntu 22.04 LTS;
clang-17 produces better performance numbers):
Baseline:
```
[ RUN ] CApiTest.AssembeChunkPerfTest
start test
cost: 17903us
[ OK ] CApiTest.AssembeChunkPerfTest (183 ms)
[ RUN ] Expr.TestMultiLogicalExprsOptimization
cost: 1391us
cost: 5us
cost: 4us
cost: 4us
cost: 6us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
cost: 4us
143
cost: 10us
cost: 8us
cost: 10us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 8us
cost: 9us
8
/home/ubuntu/zilliz/milvus4/milvus/internal/core/unittest/test_expr.cpp:1561: Failure
Expected: (cost_op) < (cost_no_op), actual: 143 vs 8
[ FAILED ] Expr.TestMultiLogicalExprsOptimization (7 ms)
[ RUN ] Expr.TestExprs
start test
3cost: 889us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 3us
start test
100cost: 7us
start test
200cost: 16us
[ OK ] Expr.TestExprs (9 ms)
[ RUN ] Expr.TestUnaryBenchTest
start test type:2
cost: 124.8us
start test type:3
cost: 163.1us
start test type:4
cost: 275.9us
start test type:5
cost: 590.9us
start test type:10
cost: 62.7us
start test type:11
cost: 65.9us
[ OK ] Expr.TestUnaryBenchTest (1153 ms)
[ RUN ] Expr.TestBinaryRangeBenchTest
start test type:2
cost: 151.4us
start test type:3
cost: 198.4us
start test type:4
cost: 361.9us
start test type:5
cost: 753.9us
start test type:10
cost: 64.6us
start test type:11
cost: 62.2us
[ OK ] Expr.TestBinaryRangeBenchTest (1151 ms)
[ RUN ] Expr.TestLogicalUnaryBenchTest
start test type:2
cost: 121.14us
start test type:3
cost: 156.84us
start test type:4
cost: 249.76us
start test type:5
cost: 534.44us
start test type:10
cost: 82.2us
start test type:11
cost: 83.52us
[ OK ] Expr.TestLogicalUnaryBenchTest (1202 ms)
[ RUN ] Expr.TestBinaryLogicalBenchTest
start test type:2
cost: 80.64us
start test type:3
cost: 78.22us
start test type:4
cost: 255.76us
start test type:5
cost: 532.04us
start test type:10
cost: 89.26us
start test type:11
cost: 90us
[ OK ] Expr.TestBinaryLogicalBenchTest (1198 ms)
[ RUN ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
cost: 401.7us
start test type:3
cost: 420.96us
start test type:4
cost: 418.04us
start test type:5
cost: 470.54us
start test type:10
cost: 250.32us
start test type:11
cost: 850.08us
[ OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1273 ms)
[ RUN ] Expr.TestCompareExprBenchTest
start test type:2
cost: 162us
start test type:3
cost: 142us
start test type:4
cost: 374us
start test type:5
cost: 674us
start test type:10
cost: 366us
start test type:11
cost: 645us
[ OK ] Expr.TestCompareExprBenchTest (1214 ms)
[ RUN ] Expr.TestRefactorExprs
start test
3cost: 1253us
start test
10cost: 1060us
start test
20cost: 681us
start test
30cost: 522us
start test
50cost: 511us
start test
100cost: 506us
start test
200cost: 497us
[ OK ] Expr.TestRefactorExprs (1142 ms)
```
Candidate:
```
[ RUN ] CApiTest.AssembeChunkPerfTest
start test
cost: 6099us
[ OK ] CApiTest.AssembeChunkPerfTest (153 ms)
[ RUN ] Expr.TestMultiLogicalExprsOptimization
cost: 42us
cost: 15us
cost: 15us
cost: 14us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
cost: 15us
17
cost: 41us
cost: 39us
cost: 33us
cost: 33us
cost: 33us
cost: 33us
cost: 34us
cost: 41us
cost: 34us
cost: 34us
35
[ OK ] Expr.TestMultiLogicalExprsOptimization (6 ms)
[ RUN ] Expr.TestExprs
start test
3cost: 20us
start test
10cost: 2us
start test
20cost: 2us
start test
30cost: 2us
start test
50cost: 4us
start test
100cost: 8us
start test
200cost: 15us
[ OK ] Expr.TestExprs (8 ms)
[ RUN ] Expr.TestUnaryBenchTest
start test type:2
cost: 55.7us
start test type:3
cost: 79.8us
start test type:4
cost: 177.6us
start test type:5
cost: 337.2us
start test type:10
cost: 16.9us
start test type:11
cost: 15.7us
[ OK ] Expr.TestUnaryBenchTest (1140 ms)
[ RUN ] Expr.TestBinaryRangeBenchTest
start test type:2
cost: 57.1us
start test type:3
cost: 87us
start test type:4
cost: 177.5us
start test type:5
cost: 342.7us
start test type:10
cost: 17.9us
start test type:11
cost: 16.7us
[ OK ] Expr.TestBinaryRangeBenchTest (1152 ms)
[ RUN ] Expr.TestLogicalUnaryBenchTest
start test type:2
cost: 34.58us
start test type:3
cost: 68.86us
start test type:4
cost: 151.38us
start test type:5
cost: 286.8us
start test type:10
cost: 16.54us
start test type:11
cost: 16.7us
[ OK ] Expr.TestLogicalUnaryBenchTest (1165 ms)
[ RUN ] Expr.TestBinaryLogicalBenchTest
start test type:2
cost: 20us
start test type:3
cost: 17.1us
start test type:4
cost: 154.12us
start test type:5
cost: 286.1us
start test type:10
cost: 19.6us
start test type:11
cost: 19.24us
[ OK ] Expr.TestBinaryLogicalBenchTest (1188 ms)
[ RUN ] Expr.TestBinaryArithOpEvalRangeBenchExpr
start test type:2
cost: 125.7us
start test type:3
cost: 111.34us
start test type:4
cost: 148.02us
start test type:5
cost: 306.7us
start test type:10
cost: 149.3us
start test type:11
cost: 282.94us
[ OK ] Expr.TestBinaryArithOpEvalRangeBenchExpr (1221 ms)
[ RUN ] Expr.TestCompareExprBenchTest
start test type:2
cost: 89us
start test type:3
cost: 79us
start test type:4
cost: 323us
start test type:5
cost: 629us
start test type:10
cost: 313us
start test type:11
cost: 591us
[ OK ] Expr.TestCompareExprBenchTest (1228 ms)
[ RUN ] Expr.TestRefactorExprs
start test
3cost: 874us
start test
10cost: 611us
start test
20cost: 290us
start test
30cost: 294us
start test
50cost: 272us
start test
100cost: 278us
start test
200cost: 279us
[ OK ] Expr.TestRefactorExprs (1149 ms)
```
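The reason a packed bitset reduces memory bandwidth compared with one `bool` per element can be illustrated with a minimal sketch. The class below is hypothetical and is not the bitset introduced by this PR: it only shows that storing 64 flags per `uint64_t` word lets a logical AND over two filter results touch one word per 64 elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed bitset: 1 bit per element instead of 1 byte per bool.
class PackedBitset {
 public:
    explicit PackedBitset(size_t size)
        : size_(size), words_((size + 63) / 64, 0) {}

    size_t size() const { return size_; }

    void set(size_t i, bool value = true) {
        assert(i < size_);
        const uint64_t mask = uint64_t{1} << (i % 64);
        if (value) {
            words_[i / 64] |= mask;
        } else {
            words_[i / 64] &= ~mask;
        }
    }

    bool test(size_t i) const {
        assert(i < size_);
        return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
    }

    // In-place AND: each iteration combines 64 elements at once.
    void inplace_and(const PackedBitset& other) {
        assert(size_ == other.size_);
        for (size_t w = 0; w < words_.size(); ++w) {
            words_[w] &= other.words_[w];
        }
    }

    // Number of set bits; unused tail bits are never set, so no masking.
    size_t count() const {
        size_t n = 0;
        for (uint64_t w : words_) {
            while (w != 0) {
                w &= w - 1;  // clear the lowest set bit
                ++n;
            }
        }
        return n;
    }

 private:
    size_t size_;
    std::vector<uint64_t> words_;
};
```

A production implementation would typically add vectorized paths and a non-owning view type on top of this layout; those are outside the scope of this sketch.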
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2024-03-24 21:49:07 +08:00
Patrick Weizhi Xu
982dd2834b
enhance: add materialized view search info ( #30888 )
...
issue: #29892
This PR:
1. Passes Materialized View (MV) search information obtained from the
expression parsing planning procedure to Knowhere. This is only performed
when MV is enabled and the partition key is involved in the expression.
The search information includes:
   1. The touched field_id and the count of related categories in the
   expression. E.g., `color == red && color == blue` yields `field_id ->
   2`.
   2. Whether the expression only includes the AND (&&) logical operator,
   default `true`.
   3. Whether the expression has a NOT (!) operator, default `false`.
2. Stores whether MV is turned on in the proxy, to eliminate reading from
paramtable for every search request.
3. Renames to MV.
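The three pieces of search information listed above can be gathered in one walk over the expression tree. The sketch below is an illustration under stated assumptions, not the planner's code: `Expr`, `MvSearchInfo`, `Eq`, `Logical` and `GatherMvSearchInfo` are hypothetical names, the count is taken per equality term (so duplicates are not merged), and a NOT node is assumed to also clear the "AND only" flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression node; not the planner's real types.
struct Expr {
    enum class Kind { Equal, And, Or, Not };
    Kind kind = Kind::Equal;
    int64_t field_id = 0;  // used by Equal only
    std::string value;     // used by Equal only
    std::vector<std::shared_ptr<Expr>> children;
};

// Hypothetical container for the information described in the list above.
struct MvSearchInfo {
    std::map<int64_t, size_t> field_id_to_category_count;
    bool is_pure_and = true;  // default true
    bool has_not = false;     // default false
};

std::shared_ptr<Expr> Eq(int64_t field_id, std::string value) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Kind::Equal;
    e->field_id = field_id;
    e->value = std::move(value);
    return e;
}

std::shared_ptr<Expr> Logical(Expr::Kind kind,
                              std::vector<std::shared_ptr<Expr>> children) {
    auto e = std::make_shared<Expr>();
    e->kind = kind;
    e->children = std::move(children);
    return e;
}

void Gather(const Expr& e, MvSearchInfo& info) {
    switch (e.kind) {
        case Expr::Kind::Equal:
            ++info.field_id_to_category_count[e.field_id];
            break;
        case Expr::Kind::And:
            break;
        case Expr::Kind::Or:
            info.is_pure_and = false;
            break;
        case Expr::Kind::Not:
            info.has_not = true;
            info.is_pure_and = false;  // assumption: NOT is not "AND only"
            break;
    }
    for (const auto& child : e.children) {
        Gather(*child, info);
    }
}

// Runs once per request, before the plan is handed to any segment.
MvSearchInfo GatherMvSearchInfo(const Expr& root) {
    MvSearchInfo info;
    Gather(root, info);
    return info;
}
```

For `color == red && color == blue` with a hypothetical field_id of 100, the walk records two equality terms on field 100, leaves `is_pure_and` at `true`, and leaves `has_not` at `false`.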
## Rebuttals
1. Did not write this in `ExtractInfoPlanNodeVisitor`, since the new scalar
framework was introduced and this part might be removed in the future.
2. Currently only interested in `==` and `in` expressions and the `string`
data type; anything else is a bonus.
3. Leave handling of expressions like `F == A || F == A` for future work
on the optimizer.
## Detailed MV Info
![image](https://github.com/milvus-io/milvus/assets/6563846/b27c08a0-9fd3-4474-8897-30a3d6d6b36f )
Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-03-21 11:19:07 +08:00
groot
c81909bfab
enhance: Support MinIO TLS connection ( #31311 )
...
issue: https://github.com/milvus-io/milvus/issues/30709
pr: #31292
Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
zhagnlu
cf5109ec17
fix: fix mmap failed when string field all value is empty ( #31406 )
...
#31162
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-21 10:41:07 +08:00
Bingyi Sun
66d679ecbb
fix: clear binlog files in CleanData ( #31039 )
...
issue: https://github.com/milvus-io/milvus/issues/31042
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-20 11:11:07 +08:00
gcmutator
6edd06083f
chore: remove repetitive words ( #31153 )
...
Signed-off-by: gcmutator <329964069@qq.com>
2024-03-20 10:17:07 +08:00
foxspy
b35ecebcc3
enhance: Update Knowhere version ( #31392 )
...
/kind branch-feature
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-03-19 15:11:07 +08:00
sammy.huang
d7727dd087
enhance: fetch simdjson directly in the format of targz ( #31369 )
...
Signed-off-by: Liang Huang <sammy.huang@zilliz.com>
2024-03-18 18:55:11 +08:00
foxspy
1c930e560c
enhance: Update Knowhere version ( #31312 )
...
/kind branch-feature
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2024-03-18 11:29:04 +08:00
Gao
038c570ef3
enhance: upgrade folly to run on arm ( #31284 )
...
Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-03-15 15:39:03 +08:00
Chun Han
6939ad15f2
fix: possible out-of-bound due to groupby when reducing ( #30711 ) ( #31200 )
...
related: #30711
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-03-14 13:07:03 +08:00
Buqian Zheng
7fc3094a42
fix: fix growing index data race and properly handle build error ( #31170 )
...
issue: https://github.com/milvus-io/milvus/issues/31169
Also properly handles index build errors by re-creating a new index, so
that nothing is left over from the previous failed index build attempt.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 20:19:04 +08:00
Buqian Zheng
96cfae55a5
feat: [Sparse Float Vector] segcore to support sparse vector search and get raw vector by id ( #30629 )
...
This PR adds the ability to search/get sparse float vectors in segcore,
and adds unit tests by converting many existing tests into
parameterized ones.
https://github.com/milvus-io/milvus/issues/29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-12 09:16:30 -07:00
zhagnlu
c8b54f321a
fix: restrict pk in [...] optimization situations ( #31184 )
...
#31154
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-12 14:49:03 +08:00
cai.zhang
6a83f16871
feat: Support for multiple forms of JSON ( #31052 )
...
issue: #31051
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-11 19:55:02 +08:00
Buqian Zheng
070dfc77bf
feat: [Sparse Float Vector] segcore basics and index building ( #30357 )
...
This commit adds sparse float vector support to segcore with the
following:
1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors
in various scenarios, including:
* FieldData as a bridge between the binlog and the in memory data
structures
* mmap::Column as the in memory representation of a sparse float vector
column of a sealed segment;
* ConcurrentVector as the in memory representation of a sparse float
vector of a growing segment which supports inserts.
3. Adds logic in payload reader/writer to serialize/deserialize to/from
binlog
4. Adds the ability to allow the index node to build sparse float vector
index
5. Adds the ability to allow the query node to build growing index for
growing segment and temp index for sealed segment without index built
This commit also includes some code cleanup, comment improvements, and
some unit tests for sparse vector.
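For readers unfamiliar with the data type, a sparse float vector stores only its non-zero dimensions. The sketch below is a generic illustration and not the representation used by segcore or Knowhere: `SparseRow` and `InnerProduct` are hypothetical names, and each row is assumed to be a list of (dimension index, value) pairs sorted by index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical row type: (dimension index, value) pairs sorted by index.
using SparseRow = std::vector<std::pair<uint32_t, float>>;

// Inner product of two sorted sparse rows using a two-pointer merge.
// Only dimensions present in both rows contribute to the sum.
float InnerProduct(const SparseRow& a, const SparseRow& b) {
    float sum = 0.0f;
    size_t i = 0;
    size_t j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first == b[j].first) {
            sum += a[i].second * b[j].second;
            ++i;
            ++j;
        } else if (a[i].first < b[j].first) {
            ++i;
        } else {
            ++j;
        }
    }
    return sum;
}
```

Because rows have different numbers of non-zero entries, a column of such rows is variable-length, which is why dedicated handling is needed in the binlog, mmap, and growing-segment structures listed above.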
https://github.com/milvus-io/milvus/issues/29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-11 14:45:02 +08:00
Cai Yudong
a99143dd52
fix: Save traceID and spanID as hex string into search config ( #31071 )
...
Issue: #30961
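Hex encoding of raw ID bytes, as named in the commit title, can be sketched as below. This is an assumption-laden illustration rather than the code from the PR: the helper name `BytesToHex` is hypothetical, and the only outside fact relied on is that OpenTelemetry-style trace IDs are 16 bytes and span IDs are 8 bytes, so the resulting strings are 32 and 16 characters long.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode raw ID bytes as a lowercase hex string.
std::string BytesToHex(const std::vector<uint8_t>& bytes) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (uint8_t b : bytes) {
        out.push_back(kDigits[b >> 4]);    // high nibble
        out.push_back(kDigits[b & 0x0f]);  // low nibble
    }
    return out;
}
```

A hex string contains no embedded zero bytes, which makes it safe to carry through text-based configuration, whereas raw ID bytes may contain `0x00`.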
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-03-11 14:21:01 +08:00
sre-ci-robot
53af6d8c59
[automated] Update Knowhere Commit ( #31151 )
...
Update Knowhere Commit
Signed-off-by: sre-ci-robot <sre-ci-robot@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-09 01:55:02 +08:00
Cai Yudong
122981aeb9
fix: Disable knowhere trace as a quick fix ( #31055 )
...
Issue: #30961
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-03-08 15:27:01 +08:00