Knowhere introduced FAISS-based HNSQ Flat/SQ/PQ/PRQ. This PR adds param
check in Milvus side.
This PR actually do nothing but let the check pass, since:
1. All indexes param check will be moved to Knowhere recently. @foxspy
2. The index name will be changed from `FAISS_HNSW_xx` to `HNSW_xx`
after QA test. @alexanderguzhva
Signed-off-by: Li Liu <li.liu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/35853
* BM25 Function now takes no params, k1, b should be passed via index
params
* support BM25 full text search when metric type is not present in
search request
* add more strict validation with functions at collection creation time
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
This PR splits sealed segment to chunked data to avoid unnecessary
memory copy and save memory usage when loading segments so that loading
can be accelerated.
To support rollback to previous version, we add an option
`multipleChunkedEnable` which is false by default.
Signed-off-by: sunby <sunbingyi1992@gmail.com>
issue: #35922
add an enable_tokenizer param to varchar field: must be set to true so
that a varchar field can enable_match or used as input of BM25 function
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
See #36550
This PR made 2 changes:
1. Introducing a prioritization mechanism, if
`dataCoord.compaction.taskPrioritizer` is set to `level`, compaction
tasks are always executed as the priority of L0>Mix>Clustering
2. `dataCoord.compaction.maxParallelTaskNum` now controls the
parallelism of executing tasks, not the task number of queue +
executing.
---------
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
5 percent of free memory is too less for l0 compaction. This pr will
raise it to 50 percent.
See also: #36614
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.
issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Native support for Google cloud storage using the Google Cloud Storage
libraries. Authentication is performed using GCS service account
credentials JSON.
Currently, Milvus supports Google Cloud Storage using S3-compatible APIs
via the AWS SDK. This approach has the following limitations:
1. Overhead: Translating requests between S3-compatible APIs and GCS can
introduce additional overhead.
2. Compatibility Limitations: Some features of the original S3 API may
not fully translate or work as expected with GCS.
To address these limitations, This enhancement is needed.
Related Issue: #36212
issue: #35821
After collection loaded, if we need to increase/decrease collection's
replica, we need to release and load it again.
milvus offers 4 solution to update loaded collection's replica, this PR
aims to dynamic change the replica number without release, and after
replica number changed, milvus will execute load replica or release
replica in async, and the replica loaded status can be checked by
getReplicas API.
Notice that if set too much replicas than querynode can afford,the new
replica won't be loaded successfully until enough querynode joins.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #35859
This PR introduce two new param: toleranceFactor and checkRequestNum,
after every checkRequestNum request has been assigned, try to compute
querynode's workload score.
if the diff is less than the toleranceFactor, replica selection policy
will fallback to round_robin, which reduce the average cost to about
500ns.
if the diff is larger than the toleranceFactor, replica selection policy
will compute querynode's score to select the target node with smallest
score in every assigment.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>