test: There are too many test cases for bulkinsert + partition_key. Each
case creates 10 bulkinsert tasks, and each task imports a file with
100~200 rows. The default num_partitions for partition_key is 64, so each
task generates 64 tiny segments. With 10 cases, 10 tasks per case, and 64
tiny segments per task, 6400 tiny segments are generated in total. Each of
these segments holds fewer than 1024 rows, so no index needs to be built,
yet they all take part in compaction. This produces a large number of
compaction tasks, and processing them takes too much time. Eventually,
some cases time out after waiting 5 minutes for their segments to be ready
and fail.
Specifying a small num_partitions avoids this problem.
```
[2023-11-21T03:41:16.187Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_json_file[int_scalar-True-True] PASSED [ 54%]
[2023-11-21T03:41:42.796Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_json_file[int_scalar-False-True] PASSED [ 57%]
[2023-11-21T03:42:04.694Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_json_file[string_scalar-True-True] PASSED [ 60%]
[2023-11-21T03:42:31.205Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_json_file[string_scalar-False-True] PASSED [ 63%]
[2023-11-21T03:43:38.876Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_multi_numpy_files[10-150-13-True] XPASS [ 66%]
[2023-11-21T03:49:00.357Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_multi_numpy_files[10-150-13-False] XFAIL [ 69%]
[2023-11-21T03:53:51.811Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_csv_file[int_scalar-True] FAILED [ 72%]
[2023-11-21T03:58:58.283Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_csv_file[int_scalar-False] FAILED [ 75%]
[2023-11-21T04:02:04.696Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_csv_file[string_scalar-True] PASSED [ 78%]
[2023-11-21T04:02:26.608Z] testcases/test_bulk_insert.py::TestBulkInsert::test_partition_key_on_csv_file[string_scalar-False] PASSED [ 81%]
```
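For a rough sense of the numbers above, here is a minimal Go sketch of the arithmetic (the small num_partitions value of 4 is only an example, not necessarily the value used by the tests):
```go
package main

import "fmt"

func main() {
	cases, tasksPerCase := 10, 10

	// With the default num_partitions of 64, every bulkinsert task splits
	// its 100~200 rows across all partitions, one tiny segment each.
	defaultPartitions := 64
	fmt.Println(cases * tasksPerCase * defaultPartitions) // 6400 tiny segments

	// With a small num_partitions the same suite produces far fewer
	// segments and far fewer compaction tasks.
	smallPartitions := 4
	fmt.Println(cases * tasksPerCase * smallPartitions) // 400 tiny segments
}
```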
Signed-off-by: yhmo <yihua.mo@zilliz.com>
see also: https://github.com/milvus-io/milvus/issues/28509
Currently, Minio latency monitoring for the get operation only collects the
duration of the GetObject call (which merely returns an io.Reader and does
not actually read from Minio). This PR corrects that behavior.
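A minimal sketch of the corrected measurement, assuming the minio-go v7 client; the observe callback stands in for whatever latency metric is recorded:
```go
package storage

import (
	"context"
	"io"
	"time"

	"github.com/minio/minio-go/v7"
)

// getWithLatency times the full read, not just the GetObject call, because
// GetObject only returns a lazy io.Reader and the actual transfer happens
// when that reader is consumed.
func getWithLatency(ctx context.Context, cli *minio.Client, bucket, key string,
	observe func(ms float64)) ([]byte, error) {
	start := time.Now()
	obj, err := cli.GetObject(ctx, bucket, key, minio.GetObjectOptions{})
	if err != nil {
		return nil, err
	}
	defer obj.Close()

	data, err := io.ReadAll(obj) // the real network I/O happens here
	observe(float64(time.Since(start).Milliseconds()))
	return data, err
}
```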
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #28736 #28748
See also #27675
Previous PR: #28646
This PR fixes a `SegmentNotFound` issue that occurs when compaction happens
multiple times and the buffer of a first-generation segment is synced due
to the stale policy.
Now the `CompactSegments` API of metacache updates the compactTo field of
segmentInfo when the compactTo segment is itself compacted, to keep the
lineage clean.
Also, add the `CompactedSegment` SyncPolicy to sync compacted segments as
soon as possible to keep the metacache clean.
`SyncPolicy` is now an interface instead of a function type, so that when
it selects segments to sync we can log the reason and the target segments.
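A minimal sketch of the interface shape described above; the names and fields are illustrative rather than the exact Milvus definitions:
```go
package syncmgr

// SyncPolicy selects segments to sync. Being an interface instead of a bare
// function type lets each policy report why a segment was chosen, so the
// caller can log the reason together with the target segment IDs.
type SyncPolicy interface {
	SelectSegments(buffers []*SegmentBuffer, ts uint64) []int64
	Reason() string
}

// SegmentBuffer is an illustrative stand-in for buffered segment state.
type SegmentBuffer struct {
	SegmentID int64
	Compacted bool
}

// compactedSegmentPolicy syncs compacted segments as soon as possible so the
// metacache can drop them and keep the compactTo lineage clean.
type compactedSegmentPolicy struct{}

func (compactedSegmentPolicy) SelectSegments(buffers []*SegmentBuffer, _ uint64) []int64 {
	var ids []int64
	for _, buf := range buffers {
		if buf.Compacted {
			ids = append(ids, buf.SegmentID)
		}
	}
	return ids
}

func (compactedSegmentPolicy) Reason() string { return "segment compacted" }
```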
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #28622
A query node hosting the delegator will have more rows than other query
nodes because the delegator loads all growing rows.
This PR enables segment balancing based on the number of growing rows in
the leader view.
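A rough sketch of the scoring idea, with hypothetical names (the real logic lives in the querycoord balancer):
```go
package balance

// nodeRowCount returns the rows a balancer should attribute to a query node:
// sealed segment rows plus, for the node hosting the delegator, the growing
// rows reported in the leader view. Counting growing rows keeps the
// delegator node from receiving as many sealed segments as the other nodes.
func nodeRowCount(sealedRows, growingRowsInLeaderView int64, isDelegator bool) int64 {
	if isDelegator {
		return sealedRows + growingRowsInLeaderView
	}
	return sealedRows
}
```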
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
fix macOS compilation failure, related to #28715
1) install_deps.sh missed installing some dependencies
2) llvm cannot be found without a soft link
Signed-off-by: xiaofan luan <xiaofanluan@xiaofandeMacBook-Pro.local>
Co-authored-by: xiaofan luan <xiaofanluan@xiaofandeMacBook-Pro.local>
See also #27675
When an L0 segment contains only delta data, the merged statslog shall be
skipped when performing the sync task.
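An illustrative sketch of the check; the parameter names are hypothetical:
```go
package syncmgr

// shouldMergeStatslog reports whether a sync task needs to write a merged
// statslog. An L0 segment carrying only delta (delete) data has no insert
// data, so there are no primary-key stats to merge and the step is skipped.
func shouldMergeStatslog(isL0 bool, hasInsertData bool) bool {
	return !(isL0 && !hasInsertData)
}
```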
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #27675
Compacted segment info shall be removed after all buffers belonging to it
are synced.
This PR adds a cleanup function after the triggerSyncTask logic (sketched
after this list):
- The buffer is stable and protected by a mutex
- Cleanup fetches compacted & non-synced segments
- Segment info is removed only when no buffer for it is maintained in the
  manager
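A simplified sketch of that cleanup step; the locking and field names are illustrative:
```go
package metacache

import "sync"

type manager struct {
	mu        sync.Mutex
	compacted map[int64]struct{} // segments already compacted away
	buffers   map[int64]int      // pending (not yet synced) buffers per segment
}

// cleanupCompacted runs after triggerSyncTask: under the mutex it walks the
// compacted segments and drops the info of those with no buffered data left,
// keeping the metacache clean once every buffer has been synced.
func (m *manager) cleanupCompacted() {
	m.mu.Lock()
	defer m.mu.Unlock()
	for segID := range m.compacted {
		if m.buffers[segID] == 0 {
			delete(m.compacted, segID)
		}
	}
}
```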
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Remove some unnecessary assignments, since commonpbutil.NewMsgBase already
provides default values.
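A generic sketch of the pattern behind this (not the actual commonpbutil definition): a functional-options constructor supplies defaults, so call sites only pass the options whose values differ from them:
```go
package example

// MsgBase mirrors the idea of commonpb.MsgBase for illustration only.
type MsgBase struct {
	MsgType   int32
	MsgID     int64
	Timestamp uint64
	SourceID  int64
}

type Option func(*MsgBase)

func WithMsgType(t int32) Option   { return func(b *MsgBase) { b.MsgType = t } }
func WithSourceID(id int64) Option { return func(b *MsgBase) { b.SourceID = id } }

// NewMsgBase starts from zero-value defaults; assignments that merely repeat
// those defaults at the call site can simply be removed.
func NewMsgBase(opts ...Option) *MsgBase {
	base := &MsgBase{}
	for _, opt := range opts {
		opt(base)
	}
	return base
}
```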
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
issue: #28683
To avoid downloading installation packages in the CI workload, install
vcpkg and some packages in advance.
Signed-off-by: PowderLi <min.li@zilliz.com>
It is easy to trigger the 100ms heartbeat timeout when standalone CPU usage
reaches 100%.
This PR increases the heartbeat timeout param to 2000ms.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #28022 #28034
The load segment request may arrive before the watch dml channel request,
so the index meta may be empty as well.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #28660
This PR adds a request timeout config item for etcd kv requests.
It also syncs the default timeout value to the same value for the etcdKV &
tikv configs.
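A minimal sketch of applying such a per-request timeout with the etcd clientv3 API; how the timeout value is read from config is omitted:
```go
package etcdkv

import (
	"context"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// getWithTimeout wraps an etcd Get with a configurable request timeout so a
// slow or unreachable etcd cannot hang the caller indefinitely.
func getWithTimeout(cli *clientv3.Client, key string, requestTimeout time.Duration) (*clientv3.GetResponse, error) {
	ctx, cancel := context.WithTimeout(context.Background(), requestTimeout)
	defer cancel()
	return cli.Get(ctx, key)
}
```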
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #28365
Fix a parsing error that occurs when a string enclosed in single quotes in
an expression contains multiple double quotes, such as:
```
expr = "tag == '\"blue\"'"
```
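For illustration only (this is not the actual Milvus expression parser): after the fix, the outer single quotes delimit the string literal and the embedded double quotes are kept as ordinary characters:
```go
package main

import (
	"fmt"
	"strings"
)

// unquoteSingle strips a matching pair of outer single quotes and keeps any
// embedded double quotes as literal characters, which is the value the
// expression above should parse to.
func unquoteSingle(s string) (string, bool) {
	if len(s) >= 2 && strings.HasPrefix(s, "'") && strings.HasSuffix(s, "'") {
		return s[1 : len(s)-1], true
	}
	return s, false
}

func main() {
	val, ok := unquoteSingle(`'"blue"'`)
	fmt.Println(val, ok) // "blue" true
}
```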
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Before this, Milvus used the container/system memory info to get the memory
usage, which could be inaccurate.
Since we allocate memory via private anonymous mmap, `rss - shared` gives
the accurate memory usage.
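A minimal Linux-only sketch of that calculation, reading /proc/self/statm (whose resident and shared fields are counted in pages); this is an illustration, not the code added by the PR:
```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// usedMemory returns (rss - shared) in bytes. Because the memory is allocated
// through private anonymous mmap, subtracting shared pages from the resident
// set gives the memory actually owned by the process.
func usedMemory() (uint64, error) {
	data, err := os.ReadFile("/proc/self/statm")
	if err != nil {
		return 0, err
	}
	// Fields: size resident shared text lib data dt (all in pages).
	fields := strings.Fields(string(data))
	if len(fields) < 3 {
		return 0, fmt.Errorf("unexpected statm format")
	}
	resident, err := strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return 0, err
	}
	shared, err := strconv.ParseUint(fields[2], 10, 64)
	if err != nil {
		return 0, err
	}
	return (resident - shared) * uint64(os.Getpagesize()), nil
}

func main() {
	if used, err := usedMemory(); err == nil {
		fmt.Println("used bytes:", used)
	}
}
```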
resolve #28553
---------
Signed-off-by: yah01 <yah2er0ne@outlook.com>
See also #28628
Previously, the compaction task blocked the segment sync task and could
block the flowgraph when a sync task was generated by the auto sync policy.
This `BlockAll` call would block forever and cause the whole flowgraph to
get stuck.
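A simplified illustration of the hazard (not the actual datanode code): a write lock held for the whole compaction cannot be released while it waits on work that itself needs the read lock:
```go
package main

import "sync"

// blocker sketches the stuck pattern: the compaction path takes the write
// lock ("BlockAll") and waits for pending sync tasks, but the auto sync
// policy generates those sync tasks from the flowgraph, which needs the read
// lock to make progress, so neither side can proceed.
type blocker struct {
	mu sync.RWMutex
}

// blockAll holds the write lock for the whole compaction; if compact waits
// on a sync task, the lock is never released.
func (b *blocker) blockAll(compact func()) {
	b.mu.Lock()
	defer b.mu.Unlock()
	compact()
}

// flowgraphTick needs the read lock; it cannot run while blockAll is held,
// so the sync task the compaction waits for never completes.
func (b *blocker) flowgraphTick(produceSyncTask func()) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	produceSyncTask()
}

func main() {
	// Intentionally not wiring the two paths together: doing so would
	// reproduce the deadlock this PR removes.
	_ = &blocker{}
}
```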
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>