the old version Knowhere would copy the index data while loading, we
need to consider this to avoid OOM.
Knowhere provides a util function to indicate whether it will load the
index with disk, if not, we need to double the memory usage prediction
for index data
Signed-off-by: yah01 <yang.cen@zilliz.com>
Related to #30191
When loading segment, segment loader shall check memory usage for
current loading task. Previously l0 segment was ignored but level zero
segment may actually cost lots of memory.
This PR adds back memory resource check for Level zero segment loading.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30651
Append operator of `std::filesystem::path` will replace whole path when
the param of "/" operation is an absolute path.
In "All-in-one" mode, this shall cause ChunkCache removing the original
vector data file when building chunk cache during/after load procedure.
This PR changes the ChunkCache path generation logic to a separate
function in which will check whether the file path is absolute or not.
If the file path is absolute, it removes the root path prefix and return
concatenated file path.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Refine compaction interfaces in datacoord, support compaction result
with more than one segment. Prepare for major compaction.
related: #30633
Signed-off-by: wayblink <anyang.wang@zilliz.com>
1. Increase maxCount of L0 compaction tasks to 30
This could reduce the l0 compaction task number by 30% for
high-frequently-generated-small l0 segments, with the maximum size 64MB
stay not changed. So that l0 segments would accumulate slower and
decrease the mem presure caused by L0 segment for QueryNode
2. Add force Trigger for later manual timely l0 compaction triggers.
See also: #30191, #30556
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #30553
when datacoord with version 2.2 and querycoord with version 2.3 coexist
during rolling upgrade, `DescribeIndex/GetIndexInfo` will return
`unimplemented` error
This PR add retry on `DescribeIndex/GetIndexInfo`, to prevent load
collection failed during rolling upgrade from milvus 2.2 to 2.3.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
flush rate control at collection level to avoid generate too much
segment.
0.1 qps by default.
issue: #29477
Signed-off-by: chyezh <ye.zhen@zilliz.com>
See also #30571
When `compactionExecutor` stops one compaction task, the `stop` method
will case `injectDone` called.
However in `executeTask` when `compact` method returns error, it shall
also invoke `injectDone` as well. That the reason `Unlock of unlocked
RWMutex` panicking happened.
This PR add sync.Once to make sure that `injectDone` is called only
once. We did not remove any of the `injectDone` since removal any of
those invocation may cause logic problem.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Data write through rawkv API may pollute tikv data. It should be
disallowed.
We will add this check to all repos that involves metadata access.
In the longer term, we should have a metadata service that implements
access control.
relate: #30029
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
This PR mainly improve two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete session when service is stopped. So that the new service
doesn't need to wait for the previous session to expire (~10s).
Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.
integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.
issue: #29409
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
See also #27675#30469
For a sync task, the segment could be compacted during sync task. In
previous implementation, this sync task will hold only the old segment
id as KeyLock, in which case compaction on compacted to segment may run
in parallel with delta sync of this sync task.
This PR introduces sync target segment verification logic. It shall
check target segment lock it's holding beforing actually syncing logic.
If this check failed, sync task shall return`errTargetSegementNotMatch`
error and make manager re-fetch the current target segment id.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>