Commit Graph

94 Commits

Author SHA1 Message Date
zhagnlu
804dd5409a
enhance: mark duplicated pk as deleted (#34586)
fix #34247

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-16 14:25:39 +08:00
zhagnlu
3030e4625e
enhance: refactor variable column to reduce memory cost (#33875)
#33874

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:16:06 +08:00
cqy123456
32f685ff12
enhance: growing segment support mmap (#32633)
issue: https://github.com/milvus-io/milvus/issues/32984

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
Buqian Zheng
8cb350598c
enhance: Improve GetVectorById of Sparse Float Vector (#33209)
issue: #29419

* sparse float vector to support raw data mmap

For get vector from chunk cache, I added a unit test but marking it as
skipped due to a known issue. I have tested it locally.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-06-12 10:09:55 +08:00
chyezh
e19d17076f
fix: delete may lost when enable lru cache, some field should be reset when ReleaseData (#32012)
issue: #30361

- Delete may be lost when segment is not data-loaded status in lru
cache. skip filtering to fix it.

- `stats_` and `variable_fields_avg_size_` should be reset when
`ReleaseData`

- Remove repeat load delta log operation in lru.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-16 11:17:20 +08:00
Buqian Zheng
96cfae55a5
feat: [Sparse Float Vector] segcore to support sparse vector search and get raw vector by id (#30629)
This PR adds the ability to search/get sparse float vectors in segcore,
and added unit tests by modifying lots of existing tests into
parameterized ones.

https://github.com/milvus-io/milvus/issues/29419

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-12 09:16:30 -07:00
Cai Yudong
8a219e0102
feat: Support knowhere trace using OpenTelemetry (#30750)
Issue: #21508

Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2024-02-28 12:29:00 +08:00
yah01
57397b1307
enhance: add new LRU cache impl (#30360)
- remove  the unused LRU cache
- add new LRU cache impl which wraps github.com/karlseguin/ccache

related #30361

---------

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-02-27 20:58:40 +08:00
MrPresent-Han
77eb6defb1
feat: support groupby on growing and non-indexed sealed egment(#30307) (#30644)
related: #30308

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-02-21 14:02:53 +08:00
zhagnlu
e8a6f1ea2b
fix: erase pk empty check when pk index replace raw data (#30432)
#30350

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-02-07 14:56:47 +08:00
yihao.dai
c02fb64ad6
enhance: Allows proactive warming up of chunk cache (#30182)
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.

issue: https://github.com/milvus-io/milvus/issues/30181

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-25 19:55:39 +08:00
Xu Tong
e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
zhenshan.cao
60e88fb833
fix: Restore the MVCC functionality. (#29749)
When the TimeTravel functionality was previously removed, it
inadvertently affected the MVCC functionality within the system. This PR
aims to reintroduce the internal MVCC functionality as follows:

1. Add MvccTimestamp to the requests of Search/Query and the results of
Search internally.
2. When the delegator receives a Query/Search request and there is no
MVCC timestamp set in the request, set the delegator's current tsafe as
the MVCC timestamp of the request. If the request already has an MVCC
timestamp, do not modify it.
3. When the Proxy handles Search and triggers the second phase ReQuery,
divide the ReQuery into different shards and pass the MVCC timestamp to
the corresponding Query requests.

issue: #29656

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-01-09 11:38:48 +08:00
MrPresent-Han
9e2e7157e9
feat: support search_group_by for milvus(#25324) (#28983)
related: #25324

Search GroupBy function, used to aggregate result entities based on a
specific scalar column.
several points to mention:

1. Temporarliy, the whole groupby is implemented separated from
iterative expr framework **for the first period**
2. In the long term, the groupBy operation will be incorporated into the
iterative expr framework:https://github.com/milvus-io/milvus/pull/28166
3. This pr includes some unrelated mocked interface regarding alterIndex
due to some unworth-to-mention reasons. All these un-associated content
will be removed before the final pr is merged. This version of pr is
only for review
4. All other related details were commented in the files comparison

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-05 15:50:47 +08:00
Gao
9b52cb6417
enhance: improve reducing results when many segments are filtered (#29073)
Do not fill the invalid ids for the empty results, it will incur useless
memory overhead and reduce overhead when nq and topk is large.

---------

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2023-12-20 12:56:42 +08:00
zhagnlu
a602171d06
enhance: Refactor runtime and expr framework (#28166)
#28165

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-12-18 12:04:42 +08:00
yah01
f7d2ab6677
enhance: reduce 1x copy for variable length field while retrieving (#28345)
- Reduce 1x copy for varchar/string/JSON/array types while retrieving
- Reduce 1x copy for int8/int16 while retrieving

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-15 18:08:20 +08:00
MrPresent-Han
836f300536
support skip-index based on chunk-metrics to accelerate expr filter(#27925) (#28297)
related: #27925

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-11-15 11:20:19 +08:00
Xu Tong
8ec85f5f4c
Add template for VectorMemIndex (#28324)
Signed-off-by: Writer-X <1256866856@qq.com>
2023-11-11 13:20:22 +08:00
yah01
dc89730a50
Support collection-level mmap control (#26901)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-02 23:52:16 +08:00
Enwei Jiao
8ae9c947ae
Use OpenDAL to access object store (#25642)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-11-01 09:00:14 +08:00
yihao.dai
106c17f304
Make read ahead policy in ChunkCache configurable (#27291)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-09-28 15:47:27 +08:00
foxspy
5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
foxspy
370b6fde58
milvus support multi index engine (#27178)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-22 09:59:26 +08:00
yah01
93e2eb78c9
Delete only if primary keys exist (#25292)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-09-20 19:03:25 +08:00
cai.zhang
a362bb1457
Support array datatype (#26369)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-09-19 14:23:23 +08:00
yihao.dai
bb6711f28c
Add ChunkCache: support get vector from storage (#26142)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-09-15 10:21:20 +08:00
Enwei Jiao
ca1349708b
Remove time travel ralted testcase (#26119)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-08-10 18:53:17 +08:00
yah01
53c3bf053e
Fix unstable sealed segment bruteforce unittest (#25867)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-07-25 09:05:00 +08:00
yah01
dd5f896dc8
Load batch by batch (#25212)
This will significantly reduce the memory usage while loading
- 1x memory usage and MBs overhead for buffer (memory mode)
- only MBs overhead for buffer (mmap mode)

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-06 13:58:27 +08:00
Enwei Jiao
816158e4af
Remove outdated searchplan (#25282)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-07-04 18:30:25 +08:00
xige-16
04082b3de2
Migrate the ability to upload and download binlog to cpp (#22984)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-25 14:38:44 +08:00
PowderLi
3f4356df10
fix the spelling of field (#25008)
Signed-off-by: PowderLi <min.li@zilliz.com>
2023-06-21 14:00:42 +08:00
yah01
a413842e38
Fix deleted data is still visible (#24849)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-06-16 17:16:41 +08:00
foxspy
6f4ed517de
add growing segment index (#23615)
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-04-26 10:14:41 +08:00
yihao.dai
092d743917
Add support for getting vectors by ids (#23450)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-04-23 09:00:32 +08:00
yah01
546080dcdd
Support to retrieve json (#23563)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-04-21 11:46:32 +08:00
Cai Yudong
5f4673fd16
Optimize unittest to save runtime (#23248)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-04-07 14:20:29 +08:00
yah01
081572d31c
Refactor QueryNode (#21625)
Signed-off-by: yah01 <yang.cen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
2023-03-27 00:42:00 +08:00
Jiquan Long
8139106b51
Feat: count entities by expression (#22765)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-03-16 19:31:55 +08:00
Jiquan Long
a36fefb009
Fix cpplint (#22657)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-03-10 09:47:54 +08:00
yah01
7478e44911
Support using mmap to load data (#22052)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-01 18:07:49 +08:00
smellthemoon
9e0ec15436
Support range search (#21652)
Signed-off-by: smellthemoon <xinguo.li@zilliz.com>
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
2023-02-21 09:48:32 +08:00
presburger
9950cacd10
support knowhere 2.0 (#21857)
Signed-off-by: Yusheng.Ma <Yusheng.Ma@zilliz.com>
2023-02-10 14:24:32 +08:00
Jiquan Long
d7156812c1
Try using ASAN in ci ut (#21089)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2022-12-29 15:29:31 +08:00
xige-16
8c9c1672ae
Assign different storage config for indexes (#19517)
Signed-off-by: xige-16 <xi.ge@zilliz.com>

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-10-14 14:45:23 +08:00
xige-16
428840178c
Support diskann index for vector field (#19093)
Signed-off-by: xige-16 <xi.ge@zilliz.com>

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-09-21 20:16:51 +08:00
zhenshan.cao
a287a2b3fd
Return empty result in advance if all data filtered out (#18329) (#18438)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2022-07-28 12:36:30 +08:00
Jeng.Gwan
638f6c36e9
Support to get real row count of segment (#18115)
Signed-off-by: xaxys <zheng.guan@zilliz.com>
2022-07-18 09:58:28 +08:00
zhagnlu
257da153ce
Fix core dump when nq has no topk result (#17923) (#18051)
Signed-off-by: zhagnlu <lu.zhang@zilliz.com>

Co-authored-by: zhagnlu <lu.zhang@zilliz.com>
2022-07-05 19:48:20 +08:00