issue: https://github.com/milvus-io/milvus/issues/36182
* improved `Column.h` to make the code much more readable and
maintainable, and added detailed comments.
* fixed an issue where `ArrayColumn::NumRows()` always returns 0 when
the mmap backing storage is a file.
* removed unused `ColumnBase` constructors and unnecessary members so we
don't get confused.
* updated `test_chunk_cache.cpp` to parameterize the tests so they run with
mmap both enabled and disabled, and added a sparse field to the tests for
extra coverage.
* re-enabled test `Sealed::GetSparseVectorFromChunkCache`.
* 2 other disabled tests, `Sealed::WarmupChunkCache` and
`Sealed::GetVectorFromChunkCache`, remain disabled because they still appear
to have errors. @bigsheeper PTAL.
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
Related to #35303
This PR utilizes the pk index in the segment to exclude non-hit delete records
while loading delete records. This ability is crucial when the l0/delete
forward policy relies only on the segment itself (without BF filtering).
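
The filtering idea can be sketched as follows. This is a minimal illustration, not the code from this PR: `DeleteRecord`, `PkIndex` and `FilterHitDeletes` are hypothetical names, and a plain `std::unordered_set` stands in for the segment's pk index.

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a delete record: primary key plus timestamp.
struct DeleteRecord {
    int64_t pk;
    uint64_t ts;
};

// Hypothetical stand-in for the segment's pk index. The real index type
// is not described in this PR message.
using PkIndex = std::unordered_set<int64_t>;

// Keep only the delete records whose pk exists in the segment ("hits").
// Records for pks the segment never stored are dropped before loading.
std::vector<DeleteRecord>
FilterHitDeletes(const PkIndex& pk_index,
                 const std::vector<DeleteRecord>& deletes) {
    std::vector<DeleteRecord> hits;
    hits.reserve(deletes.size());
    for (const auto& rec : deletes) {
        if (pk_index.count(rec.pk) > 0) {
            hits.push_back(rec);
        }
    }
    return hits;
}
```

With a segment holding pks 1, 2 and 3, a delete batch targeting pks 1, 7 and 3 would be reduced to the two records for pks 1 and 3.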
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35941
Previous PR: #36034
This patch corrects the switch branching logic and makes the unit
test work for cases that do not select the whole dataset.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35941
Previous PR: #35943
This PR makes the `Trie` index use `MARISA_LABEL_ORDER`, which makes
predictive search iterate in lexicographic order.
When the trie index is built in label order, the lexicographic order can be
used to accelerate `Range` operations.
However, according to the official documentation, using `MARISA_LABEL_ORDER`
will make "exact match lookup, common prefix search, and predictive
search" slower.
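
Why ordered iteration helps `Range` can be sketched without the marisa API. In the sketch below, plain vectors of strings stand in for the keys yielded by predictive search, and both function names are hypothetical. With sorted input a "less than" scan can stop at the first key that is out of range, while unordered input forces a comparison against every key.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Keys arrive in lexicographic order: collect the offsets of keys strictly
// less than `upper` and stop at the first key that is not smaller.
// `visited` reports how many keys were compared.
std::vector<size_t>
LessThanOrdered(const std::vector<std::string>& sorted_keys,
                const std::string& upper,
                size_t* visited) {
    std::vector<size_t> hits;
    *visited = 0;
    for (size_t i = 0; i < sorted_keys.size(); ++i) {
        ++*visited;
        if (sorted_keys[i] >= upper) {
            break;  // every later key is also >= upper
        }
        hits.push_back(i);
    }
    return hits;
}

// No ordering guarantee: every key has to be compared.
std::vector<size_t>
LessThanUnordered(const std::vector<std::string>& keys,
                  const std::string& upper,
                  size_t* visited) {
    std::vector<size_t> hits;
    *visited = 0;
    for (size_t i = 0; i < keys.size(); ++i) {
        ++*visited;
        if (keys[i] < upper) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

Both functions return the same set of matching keys for the same data; only the number of comparisons differs.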
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Fix valid data not being appended when transferring to the insert record, and
add a small check on the groupBy field.
#35924
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
Related to #35941
With the default behavior of marisa trie `predictive_search`, the values it
iterates are not in lexicographic order.
This PR is a brute-force fix that makes the range operator return correct
values.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35927
This PR addresses several issues:
- Use the `ResetTraceConfig` method instead of the init one in the update event handler
- Implement a dynamic `stats.Handler` to receive tracing config update events
- Update the `enable_trace` flag when `ResetTraceConfig` is invoked
- Change `enable_trace` to `std::atomic<bool>` to avoid data races
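
The last two points can be sketched as below. This is an assumption-laden illustration, not the actual tracing code: `TraceConfigSketch`, `ResetTraceConfigSketch` and `TraceEnabled` are hypothetical names, and only the `enable_trace` flag and its `std::atomic<bool>` type come from the description above.

```cpp
#include <atomic>

// Flag read on hot paths and written when the config changes.
// std::atomic<bool> makes concurrent reads and writes well defined.
static std::atomic<bool> enable_trace{true};

// Hypothetical, reduced config: the real trace config has more fields.
struct TraceConfigSketch {
    bool enabled;
};

// Hypothetical counterpart of ResetTraceConfig: besides rebuilding the
// tracer (omitted here), it refreshes the flag so readers see the change.
void
ResetTraceConfigSketch(const TraceConfigSketch& cfg) {
    enable_trace.store(cfg.enabled, std::memory_order_release);
}

// Cheap check that callers can use before creating spans.
bool
TraceEnabled() {
    return enable_trace.load(std::memory_order_acquire);
}
```

A plain `bool` written by the config-update thread and read by worker threads would be a data race in C++; the atomic type removes that undefined behavior without requiring a lock.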
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Related to #35578
Previously the int16/int8 bitmap index could read an int32 array as int16, which
could build the index with half of the data (if the array is full) and half
zeros. This caused the BITMAP index to lose information.
This PR matches int8_t & int16_t in `get_data` when building the index.
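
The failure mode can be reproduced with a small standalone sketch. The function names are hypothetical and the code is not taken from the index builder; it only shows what happens when the bytes of an int32 buffer are read as int16 for the same number of rows, compared with narrowing element by element.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Buggy pattern: copy the raw bytes of an int32 buffer into int16 slots,
// one slot per row. Only the first half of the source bytes is consumed,
// and for small values every other slot is a zero (the unused half of
// each int32).
std::vector<int16_t>
ReadAsInt16Raw(const std::vector<int32_t>& src) {
    std::vector<int16_t> out(src.size());
    if (!src.empty()) {
        std::memcpy(out.data(), src.data(), out.size() * sizeof(int16_t));
    }
    return out;
}

// Correct pattern: convert each int32 element to int16 individually.
std::vector<int16_t>
NarrowToInt16(const std::vector<int32_t>& src) {
    std::vector<int16_t> out;
    out.reserve(src.size());
    for (int32_t v : src) {
        out.push_back(static_cast<int16_t>(v));
    }
    return out;
}
```

For the input `{1, 2, 3, 4}`, the raw read yields four values of which two are zeros and the values 3 and 4 are never seen, while the element-wise conversion preserves all four values.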
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>