milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-05 05:18:52 +08:00

Author	SHA1	Message	Date
yihao.dai	ca758c36cc	enhance: Pre-allocate ids for compaction (#34187 ) This PR removes the dependency of compaction on the ID allocator by pre-allocating the logID and segmentID. issue: https://github.com/milvus-io/milvus/issues/33957 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-07-17 13:23:42 +08:00
yihao.dai	4e5f1d5f75	enhance: Pre-allocate ids for import (#33958 ) The import is dependent on syncTask, which in turn relies on the allocator. This PR pre-allocates the necessary IDs for import syncTask. issue: https://github.com/milvus-io/milvus/issues/33957 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-07-07 21:26:14 +08:00
congqixia	962a5446f8	enhance: Add ctx in `SyncTask.Run` to be cancellable (#34042 ) Related to #33716 This PR adds a context param to the SyncTask.Run execution functions to make them cancellable from the caller. This makes it possible to cancel a task when the datanode/data sync service is being shut down. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-25 14:22:04 +08:00
congqixia	506a915272	fix: Deep copy ImportTask.segmentsInfo to prevent data race (#34090 ) See also #34089 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-25 10:06:02 +08:00
congqixia	512ea6be5f	enhance: Avoid merging insert data when buffering insert msgs (#33562 ) See also #33561 This PR: - Uses zero copy when buffering insert messages - Makes `storage.InsertCodec` support serializing multiple insert data chunks into the same batch of binlog files --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-13 11:15:56 +08:00
yihao.dai	eb5d4de390	fix: Check if the import job exists (#33672 ) issue: https://github.com/milvus-io/milvus/issues/33671 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-10 21:51:55 +08:00
yihao.dai	3540eee977	enhance: Support L0 import (#33514 ) issue: https://github.com/milvus-io/milvus/issues/33157 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-07 14:17:20 +08:00
yihao.dai	bbdf99a45e	fix: Fix import segment size is uneven (#33605 ) The data coordinator computed the appropriate number of import segments, thus when importing in the data node, one can randomly select a segment. issue: https://github.com/milvus-io/milvus/issues/33604 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-06-05 15:41:51 +08:00
aoiasd	387b7cd7f4	enhance: Avoid maintaining checkpoint info in sync manager (#33413 ) related: https://github.com/milvus-io/milvus/issues/32915 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-06-05 10:05:50 +08:00
yihao.dai	895799ec61	enhance: Abstract Execute interface for import/preimport task (#33234 ) Abstract Execute interface for import/preimport task, simplify import scheduler. issue: https://github.com/milvus-io/milvus/issues/33157 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-23 11:29:41 +08:00
congqixia	2c1e8f4774	enhance: Use `struct{}` for sync task future result (#32673 ) Related to #27675 Use `struct{}` instead of `error` as the sync task future result type to reduce result size and prevent logic errors. Also change some unused parameters to `_` to suppress lint warnings. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-04-29 10:59:26 +08:00
yihao.dai	558feed5ed	fix: Use pk from binlog during import (#32118 ) During binlog import, even if the primary key's autoID is set to true, the primary key from the binlog should be used instead of being reassigned. issue: https://github.com/milvus-io/milvus/discussions/31943, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-16 14:51:20 +08:00
yihao.dai	aa96843d31	fix: Fix import hanging and improve logging output (#32166 ) Fix import hanging when the previous import task failed, and improve parquet import logging output. issue: https://github.com/milvus-io/milvus/issues/31834 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-13 22:03:23 +08:00
yihao.dai	49d109de18	enhance: Use an individual buffer size parameter for imports (#31833 ) Use an individual buffer size parameter for imports and set buffer size to 64MB. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-08 21:07:18 +08:00
yihao.dai	4e264003bf	enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629 ) Feature Introduced: 1. Ensure ImportV2 waits for the index to be built Enhancements Introduced: 1. Utilization of local time for timeout ts instead of allocating ts from rootcoord. 3. Enhanced input file length check for binlog import. 4. Removal of duplicated manager in datanode. 5. Renaming of executor to scheduler in datanode. 6. Utilization of a thread pool in the scheduler in datanode. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-01 20:09:13 +08:00
yihao.dai	31cf849f68	enhance: Support retrieving file size from importutilv2.Reader (#31533 ) To reduce the overhead caused by listing the S3 objects, add an interface to importutil.Reader to retrieve file sizes. issue: https://github.com/milvus-io/milvus/issues/31532, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-25 20:29:07 +08:00
yihao.dai	f65a796d18	enhance: Add max file num limit and max file size limit for import (#31497 ) The max number of import files per request should not exceed 1024 by default (configurable). The import file size allowed for importing should not exceed 16GB by default (configurable). issue: https://github.com/milvus-io/milvus/issues/28521 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-22 18:13:06 +08:00
yihao.dai	776709e5ff	fix: Fix binlog import (#31310 ) Fix binlog import functionality by removing the existing check and refining the size retrieval process. issue: https://github.com/milvus-io/milvus/issues/31221, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-17 20:59:04 +08:00
yihao.dai	811316d2ba	fix: Fix binlog import and refine error reporting (#31241 ) 1. Fix binlog import with partition key. 2. Refine binlog import error reporting. 3. Avoid division by zero when retrieving import progress. issue: https://github.com/milvus-io/milvus/issues/31221, https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-15 10:55:05 +08:00
yihao.dai	b5c67948b7	enhance: Enhance and modify the return content of ImportV2 (#31192 ) 1. The Import APIs now provide detailed progress information for each imported file, including details such as file name, file size, progress, and more. 2. The APIs now return the collection name and the completion time. 3. Other modifications include changing jobID to jobId and other similar adjustments. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-13 19:51:03 +08:00
yihao.dai	a434d33e75	feat: Add import scheduler and manager (#29367 ) This PR introduces novel managerial roles for importv2: 1. ImportMeta: To manage all the import tasks; 2. ImportScheduler: To process tasks and modify their states; 3. ImportChecker: To ascertain the completion of all tasks and instigate relevant operations. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-03-01 18:31:02 +08:00
yihao.dai	18b979d9b4	enhance: Extend support for varchar autoID to BulkInsertV2 (#30477 ) issue: https://github.com/milvus-io/milvus/issues/30476 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-02-04 16:57:05 +08:00
yihao.dai	7ce876a072	fix: Decoupling importing segment from flush process (#30402 ) This PR decouples the importing segment from the flush process by: 1. Excluding the importing segment from the flush policy; this approach avoids notifying the datanode to flush the importing segment, which may not exist. 2. Having DataCoord directly set the importing segment state to `Flushed` when RootCoord calls Flush. issue: https://github.com/milvus-io/milvus/issues/30359 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-02-03 13:01:12 +08:00
yihao.dai	c5918290e6	feat: Add import executor and manager for datanode (#29438 ) This PR introduces novel importv2 roles for datanode: 1. Executor: To execute tasks; an import task is divided into the following steps: read data -> hash data -> sync data; 2. Manager: To manage all the tasks; issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-31 20:45:04 +08:00

24 Commits