When there are a large number of segments, the metrics consume a lot of
memory. This PR removes the segment-level tag from monitoring metrics.
issue: https://github.com/milvus-io/milvus/issues/37636
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #36621
- For simple types in a struct, add "string" to the JSON tag for
automatic string conversion during JSON encoding.
- For complex types in a struct, replace "int64" with "string."
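The `string` option in a JSON struct tag is a standard `encoding/json` feature. A minimal sketch of the first bullet, assuming a hypothetical `SegmentInfo` struct (the type and field names are illustrative and not taken from this PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SegmentInfo is a hypothetical struct used only for illustration.
type SegmentInfo struct {
	// The ",string" option makes encoding/json emit the int64 as a JSON string.
	SegmentID int64 `json:"segment_id,string"`
	NumRows   int64 `json:"num_rows,string"`
}

// encodeSegment marshals a SegmentInfo and returns the JSON text.
func encodeSegment(s SegmentInfo) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeSegment(SegmentInfo{SegmentID: 9007199254740993, NumRows: 42})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"segment_id":"9007199254740993","num_rows":"42"}
}
```

Emitting the value as a string keeps large int64 IDs intact for JSON consumers that parse numbers as 64-bit floats. That motivation is an assumption here; the commit message does not state it.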
Signed-off-by: jaime <yun.zhang@zilliz.com>
issue: #33285
- Modify the proto of consumer of streaming service.
- Make VChannel a required option for streaming.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #37172
- add redo interceptor to implement append context refresh. (make new
timetick)
- add create segment handler for flusher.
- make empty segment flushable and directly change it into dropped.
- add create segment message into wal when creating new growing segment.
- make the insert operation follow the sequence: createSegment -> insert
-> insert -> flushSegment.
- make manual flush follow the sequence: flushTs -> flushSegment ->
flushSegment -> manualflush.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #33744
1. Segments generated from inserts will be loaded as growing until they
are sorted by primary key.
2. This PR may increase memory pressure on the delegator, but we need to
test the performance of stats. In local testing, the speed of stats is
greater than the insert speed.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #36858
- Start channel manager on datacoord, but with empty assign policy in
streaming service.
- Make collections in the dropping state recoverable by the flusher to
make sure that
milvus consumes the dropCollection message.
- Add backoff for flusher lifetime.
- Remove the proxy watcher from timetick at rootcoord in streaming
service.
Also see the better fixup: #37176
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #37156
1. Still need to record the current stats version.
2. Set it to 0 when the current stats version is not found.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Also remove the conflict check when executing L0. Exclusivity is already
guaranteed by the scheduler.
See also: #37140
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Timeout is a bad design for long running tasks, especially using a
static timeout config. We should monitor execution progress and fail the
task if the progress has been stale for a long time.
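A minimal sketch of the progress-based alternative described above, assuming a hypothetical `progressTracker` helper. It is not part of milvus, and this PR does not implement it; the 10-minute staleness window is an arbitrary value chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// progressTracker is a hypothetical helper, not a milvus type.
type progressTracker struct {
	lastProgress int64         // highest progress value seen so far
	lastChange   time.Time     // when lastProgress last advanced
	staleAfter   time.Duration // how long progress may stand still
}

func newProgressTracker(now time.Time, staleAfter time.Duration) *progressTracker {
	return &progressTracker{lastChange: now, staleAfter: staleAfter}
}

// observe records the latest reported progress and reports whether the task
// should be failed because progress has not advanced for staleAfter.
func (t *progressTracker) observe(now time.Time, progress int64) bool {
	if progress > t.lastProgress {
		t.lastProgress = progress
		t.lastChange = now
		return false
	}
	return now.Sub(t.lastChange) >= t.staleAfter
}

// demo replays three progress reports at fixed times and returns the verdicts.
func demo() []bool {
	start := time.Unix(0, 0)
	t := newProgressTracker(start, 10*time.Minute)
	return []bool{
		t.observe(start.Add(5*time.Minute), 30),  // progress advanced
		t.observe(start.Add(14*time.Minute), 30), // unchanged for 9 minutes
		t.observe(start.Add(15*time.Minute), 30), // unchanged for 10 minutes
	}
}

func main() {
	fmt.Println(demo()) // [false false true]
}
```

Unlike a static timeout, a slow task that keeps reporting progress is never failed by this check; only a task whose progress stops advancing is.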
This PR is a small patch to stop DC from marking compaction tasks as
timed out while it is still waiting for DN to finish them; that design
is self-contradictory. After this PR, mix and L0 compaction are no longer
controlled by the DC timeout, but clustering compaction is still under
timeout control.
The compaction queue capacity has grown larger for priority calculation,
so compaction timeouts appear more often. When one task times out, the
queued tasks time out too, and no compaction succeeds afterwards.
See also: #37108, #37015
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #36621
1. Add API to access task runtime metrics, including:
- build index task
- compaction task
- import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
- sync task
2. Add a debug mode to the webpage; use debug=true or debug=false
in the URL query parameters to enable or disable it.
Signed-off-by: jaime <yun.zhang@zilliz.com>
Related to #35303
This PR adds metrics for querynode delegator delete buffer information,
which is related to the DML quota logic.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #36686
This PR removes pre-marking segments as L2 during clustering
compaction in version 2.5, and ensures compatibility with version 2.4.
The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**
feature:
- Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
that have completed clustering but have not yet built indexes as
`is_invisible` to prevent them from being loaded prematurely.
- Do not mark the input segment as `Dropped` before the clustering
compaction is completed.
- After compaction fails, only the result segment needs to be marked as
Dropped.
compatibility:
- If the upgraded task has not failed, there are no compatibility
issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
task based on whether TmpSegments is empty.
- If the failure occurs before `MetaSaved`:
  - there are no ResultSegments, and InputSegments have not been marked as
    dropped yet.
  - the level of the input segments needs to be reverted to LastLevel.
- If the failure occurs after `MetaSaved`:
  - ResultSegments have already been generated, and InputSegments have
    been marked as Dropped. At this point, simply make the ResultSegments
    visible.
  - the level of ResultSegments needs to be set to L1 (in order to
    participate in mixCompaction).
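The two failure cases in the compatibility notes can be sketched as a single decision function. This is only an illustration under stated assumptions: the `segment` type, its fields, and `recoverFailedTask` are hypothetical names and do not reflect the actual datacoord code.

```go
package main

import "fmt"

type segmentLevel int

const (
	levelL1 segmentLevel = iota + 1
	levelL2
)

// segment is a hypothetical, simplified stand-in for segment metadata.
type segment struct {
	ID        int64
	Level     segmentLevel
	LastLevel segmentLevel // level before the 2.4 task pre-marked it as L2
	Invisible bool
	Dropped   bool
}

// recoverFailedTask applies the compatibility rules for an upgraded task that
// failed, depending on whether the failure happened before or after MetaSaved.
func recoverFailedTask(metaSaved bool, inputs, results []*segment) {
	if !metaSaved {
		// Before MetaSaved: no result segments exist and the inputs are not
		// dropped, so only the input levels need to be reverted.
		for _, s := range inputs {
			s.Level = s.LastLevel
		}
		return
	}
	// After MetaSaved: the inputs are already dropped, so make the results
	// visible and set them to L1 so they can take part in mix compaction.
	for _, s := range results {
		s.Invisible = false
		s.Level = levelL1
	}
}

func main() {
	input := &segment{ID: 1, Level: levelL2, LastLevel: levelL1}
	recoverFailedTask(false, []*segment{input}, nil)
	fmt.Println(input.Level == levelL1) // true

	result := &segment{ID: 2, Level: levelL2, Invisible: true}
	recoverFailedTask(true, nil, []*segment{result})
	fmt.Println(result.Invisible, result.Level == levelL1) // false true
}
```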
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #36686
bug reason:
- The clustering compaction tasks on the datanode were never cleaned up.
- The clustering compaction task contains a mapping from clustering key
to buffer, which caused a large memory leak.
fix:
- clean up the tasks on the datanode from datacoord when clustering
compaction finishes.
- reset the mapping from clustering key to buffer on the datanode when
clustering finishes.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #34553
When rootcoord triggers graceful stop, it blocks until all in-flight
RPCs have finished. For a create collection request, rootcoord needs to
block until datacoord finishes watching all channels, but datacoord needs
to call `rootcoord.Alloc` while watching a channel, and rootcoord no
longer responds to new requests. As a result, create collection gets
stuck, and so does the graceful stop.
This PR removes the `rootcoord.Alloc` call to resolve this logical
deadlock during graceful stop.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #36868
If datacoord is stopped while it is syncing segments to a datanode,
the stop procedure gets stuck until the segment sync has finished.
This PR adds a ctx to the segment sync so that the sync fails once
datacoord begins stopping.
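A minimal sketch of the idea, assuming a hypothetical `syncSegments` loop and `syncOne` callback. These names are illustrative and are not the datacoord implementation; only the `context` calls are real standard-library API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// syncSegments is a hypothetical stand-in for the datacoord -> datanode sync
// loop; syncOne represents a single per-segment sync call.
func syncSegments(ctx context.Context, segmentIDs []int64, syncOne func(context.Context, int64) error) error {
	for _, id := range segmentIDs {
		// Stop promptly once the caller has cancelled the context.
		if err := ctx.Err(); err != nil {
			return fmt.Errorf("sync aborted before segment %d: %w", id, err)
		}
		if err := syncOne(ctx, id); err != nil {
			return err
		}
	}
	return nil
}

// demo cancels the context while segment 2 is being synced and reports how
// many segments were synced and whether the loop ended with context.Canceled.
func demo() (synced int, cancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	err := syncSegments(ctx, []int64{1, 2, 3}, func(ctx context.Context, id int64) error {
		synced++
		if id == 2 {
			cancel() // simulates the stop request arriving mid-sync
		}
		return nil
	})
	return synced, errors.Is(err, context.Canceled)
}

func main() {
	n, cancelled := demo()
	fmt.Println(n, cancelled) // 2 true
}
```

Because the loop checks the context between segments, the stop path only has to cancel it instead of waiting for every remaining segment.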
Signed-off-by: Wei Liu <wei.liu@zilliz.com>