Commit Graph

1357 Commits

Author SHA1 Message Date
sre-ci-robot
ebbe32df9a
[automated] Update Knowhere Commit (#30515)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-02-05 01:32:44 +08:00
Jiquan Long
a587450e56
enhance: [skip-e2e] disable asan (#30498)
fix: #30511 
/kind improvement

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-02-04 21:25:05 +08:00
sre-ci-robot
20c9cfc587
[automated] Update Knowhere Commit (#30487)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-02-04 01:23:04 +08:00
Jiquan Long
e549148a19
enhance: full-support for wildcard pattern matching (#30288)
issue: #29988 
This pr adds full-support for wildcard pattern matching from end to end.
Before this pr, the users can only use prefix match in their expression,
for example, "like 'prefix%'". With this pr, more flexible syntax can be
combined.

To do so, this pr makes these changes:
- 1. support regex query both on index and raw data;
- 2. translate the pattern matching to regex query, so that it can be
handled by the regex query logic;
- 3. loose the limit of the expression parsing, which allows general
pattern matching syntax;

With the support of regex query in segcore backend, we can also add
mysql-like `REGEXP` syntax later easily.

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-02-01 12:37:04 +08:00
PowderLi
5cf9bb236e
enhance: restful support import jobs (#30343)
issue: #28521 #29732

include
1. list collection's import jobs
2. create a new import job
3. get the progress of an import job

fix:
1. mix the order of dbName & collectionName #29728
2. trace log keep the same as v1
3. support traceID
4. azure precheck, blob name cannot end with / #29703

---------

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-31 17:57:04 +08:00
yah01
878c4c9463
enhance: limit the max pool size to 16 (#30371)
according to our benchmark, concurrency level 16 is enough to fully
utilize the object storage network bandwidth

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-31 14:13:06 +08:00
cqy123456
74cfba0249
enhance:limit binlog index rows num (#30173)
issue: https://github.com/milvus-io/milvus/issues/27678
also relate issue: https://github.com/milvus-io/milvus/issues/30065

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-01-29 19:49:02 +08:00
sre-ci-robot
0542a0e7dc
[automated] Update Knowhere Commit (#30332)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-01-29 01:05:01 +08:00
zhagnlu
aeb1e36f00
enhance: change plan desc log from info to debug (#30304)
#30172

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-28 16:04:38 +08:00
xige-16
e9fdd2475d
fix: fix searchPlan metricType modified concurrently (#30227)
issue: #30225
/kind bug
Signed-off-by: xige-16 <xi.ge@zilliz.com>

---------

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-26 14:03:09 +08:00
MrPresent-Han
116d0f20b8
fix: groupby bug for ut (#30272)
related: #29965

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-25 20:57:00 +08:00
yihao.dai
c02fb64ad6
enhance: Allows proactive warming up of chunk cache (#30182)
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.

issue: https://github.com/milvus-io/milvus/issues/30181

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-25 19:55:39 +08:00
yah01
a27c0e86fd
enhance: reduce many I/O operations while loading disk index (#30189)
before this, every time writting the index chunk data into the disk,
there are 4 I/O operations:
- open the file
- seek to the offset
- write the data
- close the file

this optimized this to open only once and continiously write all data.

This also makes it concurrent to load the files from object storage

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-25 15:23:02 +08:00
zhagnlu
8c58d9af67
enhance: optimize marisa trie range search for performance (#30079)
#30078
#29986

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-25 10:07:00 +08:00
Patrick Weizhi Xu
0907d76253
enhance: pass partition key scalar info if enabled when build vector index (#29931)
issue: #29892 

Pass optional scalar IVF offsets to Cardinal

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-01-24 00:04:55 +08:00
cqy123456
42bb4e37e5
fix:diskann search crash when search list = 9999999999 (#30185)
issue: https://github.com/milvus-io/milvus/issues/29020
Json can't not pass a max_int32 value to int32_t, so let knowhere check
value range by itself.
After fix this, pymilvus will report:
pymilvus.exceptions.MilvusException: <MilvusException: (code=65535,
message=fail to search on QueryNode 6: worker(6) query failed: => failed
to search: arithmetic overflow: param search_list_size should be at most
2147483647)>

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-01-23 14:46:55 +08:00
cai.zhang
6cf2f09b60
feat: Support tencent cloud object storage for milvus (#30163)
issue: #30162

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-23 11:28:56 +08:00
yah01
a77693aa19
enhance: convert the GetObject util to async (#30166)
This makes it much easier to use

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-22 19:20:57 +08:00
sre-ci-robot
e967949cc5
[automated] Update Knowhere Commit (#30120)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-01-22 18:40:54 +08:00
MrPresent-Han
4436effdc3
enhance: support groupby based on scalar-index(#29965) (#30091)
related: #29965

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-22 10:50:54 +08:00
xige-16
aee19dcd6b
enhance: Opt vector dimension mismatch error message (#29928)
issue: https://github.com/milvus-io/milvus/issues/29791
/kind improvement

Signed-off-by: xige-16 <xi.ge@zilliz.com>

---------

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-19 17:52:54 +08:00
yah01
f542bdbf3c
enhance: calc the accurate mem size of segment (#30093)
this stats the real memory size of segment, also reduces the memory
usage in mmap mode
resolve #30095

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-19 12:32:53 +08:00
xige-16
fa7cf587b0
enhance: Opt metric type does not match error message (#29927)
issue: #29791 
/kind improvement
Signed-off-by: xige-16 <xi.ge@zilliz.com>

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-17 20:25:03 +08:00
yah01
1185e4dcd5
fix: written file size is over the int32 range and raises error (#30057)
we sum the total data size in int32, which could lead to an overflow
error
related #30056

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-17 16:42:54 +08:00
Bingyi Sun
8030b90891
fix: correct file name when loading index (#29985)
issue: #29973

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-16 10:24:52 +08:00
MrPresent-Han
c31e68446e
enhance: refine groupby-performance (#29933)
related: #29844

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-15 14:12:52 +08:00
chyezh
def717af55
fix: SealedIndexingEntry in SealedIndexingRecord may leak without smart pointer protect. (#29932)
may related issue: #29828

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-14 10:28:51 +08:00
Bingyi Sun
e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
yah01
f2e36db488
enhance: optimize the loading index performance (#29894)
this utilizes concurrent loading

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-12 17:44:51 +08:00
yah01
6c477ce3a7
enhance: optimize the loading strategy (#29910)
as we have the pool size limit so we don't need to limit the concurrency
manually

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-12 14:26:50 +08:00
yah01
aba2656e68
fix: missing field data after appending scalar index to loaded segment (#29912)
related #29843

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-12 14:04:54 +08:00
sre-ci-robot
4d11525f55
[automated] Update Knowhere Commit (#29904)
Update Knowhere Commit
Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-01-12 14:00:50 +08:00
Xu Tong
e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
Jiquan Long
67ab5be15a
enhance: optimize search performance of inverted index (#29794)
issue: #29793 
Use `DocSetCollector` instead of `TopDocsCollector`, which will avoid
scoring and sorting.

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-01-11 11:12:49 +08:00
zhagnlu
5164d30287
fix: increase expr recursion depth to avoid parse failed (#29860)
#29759

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-11 10:26:50 +08:00
yah01
031243fee7
feat: support mmap for marisa trie (#29613)
this supports mmap for marisa trie index
related https://github.com/milvus-io/milvus/issues/21866

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-11 10:22:50 +08:00
congqixia
d6429933a7
enhance: make Load process traceable in querynode & segcore (#29858)
See also #29803

This PR:
- Add trace span for `LoadIndex` & `LoadFieldData` in segment loader
- Add `TraceCtx` parameter for `Index.Load` in segcore
- Add span for ReadFiles & Engine Load for Memory/Disk Vector index

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 21:58:51 +08:00
Cai Yudong
cb9d9ec0f0
enhance: Correct sampleFraction's type to float (#29810)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:18:50 +08:00
Cai Yudong
600f6eff06
enhance: Upgrade gtest to 1.13.0 (#29805)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-01-10 13:16:57 +08:00
zhagnlu
601a8b801b
fix: add move cursor function to physical expr (#29603)
#29570

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-09 17:08:48 +08:00
zhenshan.cao
60e88fb833
fix: Restore the MVCC functionality. (#29749)
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:

1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.

issue: #29656

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-09 11:38:48 +08:00
xige-16
9702cef2b5
feat: Support multiple vector search (#29433)
issue #25639 

Signed-off-by: xige-16 <xi.ge@zilliz.com>

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-08 15:34:48 +08:00
Jiquan Long
e9f3df3626
fix: inverted index file not found (#29695)
issue: https://github.com/milvus-io/milvus/issues/29654

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-01-07 20:26:49 +08:00
zhagnlu
d07197ab1a
enhance: add compare simd function (#29432)
#26137

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-01-07 20:20:57 +08:00
foxspy
271edc6669
fix: throw exception when upload file failed for DiskIndex (#29627)
related to : #29417 

cardinal indexes upload index files in `Serialize` interface, and throw
exception when the `Serialize` failed.

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2024-01-07 20:03:13 +08:00
cai.zhang
5dc300c4a9
fix: Fix bug for pk index doesn't have raw data (#29711)
issue: #29697

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-01-07 19:36:48 +08:00
MrPresent-Han
9e2e7157e9
feat: support search_group_by for milvus(#25324) (#28983)
related: #25324

Search GroupBy function, used to aggregate result entities based on a
specific scalar column.
several points to mention:

1. Temporarliy, the whole groupby is implemented separated from
iterative expr framework **for the first period**
2. In the long term, the groupBy operation will be incorporated into the
iterative expr framework:https://github.com/milvus-io/milvus/pull/28166
3. This pr includes some unrelated mocked interface regarding alterIndex
due to some unworth-to-mention reasons. All these un-associated content
will be removed before the final pr is merged. This version of pr is
only for review
4. All other related details were commented in the files comparison

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-05 15:50:47 +08:00
cqy123456
22bb84fa9d
feat:add new gpu index:GPU_BRUTE_FORCE and limit gpu index metric type (#29590)
issue: https://github.com/milvus-io/milvus/issues/29230
this pr do these things:
1. add gpu brute force;
2. limit gpu index only support l2 / ip;

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-01-05 15:24:48 +08:00
PowderLi
c8db36a63a
enhance: get a blob to check object storage config (#29703)
issue: #29672
the storage account need privileges of actions
`Microsoft.Storage/storageAccounts/blobServices/containers/blobs/*` at
least

Signed-off-by: PowderLi <min.li@zilliz.com>
2024-01-05 14:50:46 +08:00
yah01
0ae90443ba
enhance: fill missed info for segcore error (#29610)
- fill missed error info
- format the error message directly

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-04 17:54:46 +08:00