milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-12-02 11:59:00 +08:00

Author	SHA1	Message	Date
zhagnlu	804dd5409a	enhance: mark duplicated pk as deleted (#34586 ) fix #34247 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-07-16 14:25:39 +08:00
chyezh	259a682673	enhance: async search and retrieve in cgo (#33228 ) issue: #30926, #33132 related pr: #33133 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-22 09:38:02 +08:00
wei liu	ab93d9c23d	enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611 ) issue:#33610 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-06-13 17:57:56 +08:00
chyezh	8ca5ced821	fix: async warmup will be blocked by state lock (#33686 ) issue: #33685 Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-10 21:59:53 +08:00
wei liu	c6a1c49e02	enhance: Use Blocked Bloom Filter instead of basic bloom filter impl. (#33405 ) issue: #32995 To speed up the construction and querying of Bloom filters, we chose a blocked Bloom filter instead of a basic Bloom filter implementation. WARN: This PR is compatible with the old version of the bf impl, but falling back to an old milvus version may cause bloom filter deserialization to fail. In single Bloom filter test cases with a capacity of 1,000,000 and a false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times faster than the basic Bloom filter in both querying and construction, at the cost of a 30% increase in memory usage. - Block BF construct time {"time": "54.128131ms"} - Block BF size {"size": 3021578} - Block BF Test cost {"time": "55.407352ms"} - Basic BF construct time {"time": "210.262183ms"} - Basic BF size {"size": 2396308} - Basic BF Test cost {"time": "192.596229ms"} In multi Bloom filter test cases with a capacity of 100,000, an FPR of 0.001, and 100 Bloom filters, we reuse the primary key locations for all Bloom filters to avoid repeated hash computations. As a result, the blocked Bloom filter is also 5 times faster than the basic Bloom filter in querying. - Block BF TestLocation cost {"time": "529.97183ms"} - Basic BF TestLocation cost {"time": "3.197430181s"} --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-31 17:49:45 +08:00
Jiquan Long	0c5d8660aa	feat: support inverted index for array (#33452 ) issue: https://github.com/milvus-io/milvus/issues/27704 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-05-31 09:47:47 +08:00
jaime	0d3272ed6d	enhance: refine logs of cgo pool (#33373 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-05-27 19:06:11 +08:00
jaime	58ee613fea	enhance: remove repeated stats of loaded entity (#33255 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-05-27 01:49:41 +08:00
yihao.dai	760223f80a	fix: use separate warmup pool and disable warmup by default (#33348 ) 1. use a small warmup pool to reduce the impact of warmup 2. change the warmup pool to nonblocking mode 3. disable warmup by default 4. remove the maximum size limit of 16 for the load pool issue: https://github.com/milvus-io/milvus/issues/32772 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Co-authored-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-05-27 01:25:40 +08:00
Bingyi Sun	0f8c6f49ff	enhance: mmap load raw data if scalar index does not have raw data (#33175 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-05-21 11:53:39 +08:00
cai.zhang	6ea7633bd5	enhance: Add memory size for binlog (#33025 ) issue: #33005 1. add `MemorySize` field for insert binlog. 2. `LogSize` means the file size in the storage object. 3. `MemorySize` means the size of the data in the memory. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-05-15 12:59:34 +08:00
chyezh	96489b814d	fix: remove busy log (#33042 ) issue: #32963 Signed-off-by: chyezh <chyezh@outlook.com>	2024-05-14 14:20:32 +08:00
wei liu	5038036ece	enhance: Reuse hash locations during access bloom filter (#32642 ) issue: #32530 When trying to match segment bloom filters with a pk, we can reuse the hash locations. This PR maintains the max hash Func and computes the hash locations once for all segments; reusing hash locations can speed up bf access --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-07 06:13:47 -07:00
congqixia	40728ce83d	enhance: Add `metautil.Channel` to convert string compare to int (#32749 ) See also #32748 This PR: - Add `metautil.Channel` utility which converts virtual name to physical channel name, collectionID and shard idx - Add channel mapper interface & implementation to convert limited physical channel name into int index - Apply `metautil.Channel` filter in querynode segment manager logic --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-07 19:13:35 +08:00
yihao.dai	9db3aa18bc	enhance: Remove deprecated EnableIndex (#32704 ) /kind improvement Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-07 17:11:30 +08:00
congqixia	7102403a6b	fix: Add Wrapper and Keepalive for CTraceContext ids (#32746 ) See also #32742 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-07 10:05:35 +08:00
Bingyi Sun	fecd9c21ba	feat: LRU cache implementation (#32567 ) issue: https://github.com/milvus-io/milvus/issues/32783 This pr is the implementation of lru cache on branch lru-dev. Signed-off-by: sunby <sunbingyi1992@gmail.com> Co-authored-by: chyezh <chyezh@outlook.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com> Co-authored-by: Ted Xu <ted.xu@zilliz.com> Co-authored-by: jaime <yun.zhang@zilliz.com> Co-authored-by: wayblink <anyang.wang@zilliz.com>	2024-05-06 20:29:30 +08:00
Jiquan Long	c002745902	enhance: retrieve output fields after local reduce (#32346 ) issue: #31822 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-04-25 09:49:26 +08:00
yihao.dai	281a583eda	fix: Correct the negative queryable num entities metric (#32361 ) issue: https://github.com/milvus-io/milvus/issues/32281 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-24 15:55:24 +08:00
Chun Han	f3f2a5a7e9	fix: evicted segments in the serverless mode (#31959 ) (#31961 ) related: #31959 1. reset segment index status after evicting to lazyload=true 2. reset num_rows to null_opt Signed-off-by: MrPresent-Han <chun.han@zilliz.com>	2024-04-10 15:15:19 +08:00
chyezh	73adf2a5cc	fix: use stateful lock to avoid load and release on LocalSegment concurrently (#31606 ) issue: #31605 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-08 17:09:16 +08:00
jaime	bd853be8c7	enhance: Add db label for some usual metrics (#30956 ) issue: #31782 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-04-02 14:27:13 +08:00
chyezh	1ad5ccc50f	enhance: add rg and db interface for segment and db/rg metric label (#31715 ) issue: #30931 Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-01 10:21:21 +08:00
Bingyi Sun	3d66670619	fix: add lazy load field to mark segment load type (#31591 ) issue: https://github.com/milvus-io/milvus/issues/31673 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-28 11:23:10 +08:00
SimFG	b1a1cca10b	feat: add more operation detail info for better allocation (#30438 ) issue: #30436 --------- Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-03-28 06:33:11 +08:00
Bingyi Sun	8e661f791a	fix: lazy load index data in cache (#31094 ) issue: https://github.com/milvus-io/milvus/issues/31571 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-25 15:43:07 +08:00
chyezh	8e293dc1ce	enhance: add resource usage estimate for segment interface (#31050 ) issue: #30931 - move resource estimate function outside from segment loader. - add load info and collection to base segment. - add resource usage method for sealed segment. Signed-off-by: chyezh <chyezh@outlook.com>	2024-03-19 11:53:05 +08:00
Bingyi Sun	fd17a5f050	fix: check collection lazy load prop using schema (#30992 ) issue: https://github.com/milvus-io/milvus/issues/30361 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-06 16:19:01 +08:00
congqixia	30398d4b71	enhance: Fix misleading log content & possible nil panic (#31021 ) - Change load field log from "dy pool" to "load pool" - Also defer delete when there is no error Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-05 16:33:00 +08:00
congqixia	4082315bd0	enhance: Add `ParseCTraceContext` util function for tracing (#30883 ) See also #29803 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-28 18:59:00 +08:00
Cai Yudong	8a219e0102	feat: Support knowhere trace using OpenTelemetry (#30750 ) Issue: #21508 Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>	2024-02-28 12:29:00 +08:00
yah01	57397b1307	enhance: add new LRU cache impl (#30360 ) - remove the unused LRU cache - add new LRU cache impl which wraps github.com/karlseguin/ccache related #30361 --------- Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-02-27 20:58:40 +08:00
congqixia	e5a16050ce	fix: Update disk usage metrics after segment released (#30702 ) See also #30701 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-20 22:14:53 +08:00
congqixia	405877c8cd	fix: Use correct pools for all CGO methods in segments pkg (#30274 ) See also #30273 This PR: - Rename confusing `LoadIndexInfo` to `UpdateIndexInfo` for LocalSegment - Use `DynamicPool` instead of `LoadPool` for `UpdateSealedSegmentIndex` - Fix cgo call missing pool control Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-26 10:09:00 +08:00
yihao.dai	c02fb64ad6	enhance: Allows proactive warming up of chunk cache (#30182 ) Allows proactive warming up of chunk cache. Original vector data will be asynchronously loaded into the chunk cache during the load process. It has the potential to significantly reduce query/search latency for a certain duration after the load, albeit with a concurrent increase in disk usage. issue: https://github.com/milvus-io/milvus/issues/30181 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-01-25 19:55:39 +08:00
yah01	9a3837212c	enhance: add index after load succeeded (#30015 ) this avoids a corner case: after loading an index fails, the index can never be loaded because it has already been added into the segment's index map Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-17 15:06:53 +08:00
chyezh	d300bc7bcb	fix: querynode num entity metric is broken by illegal label (#29948 ) issue: #29766 also see pr: #29825 Signed-off-by: chyezh <ye.zhen@zilliz.com>	2024-01-14 10:23:00 +08:00
Bingyi Sun	e1258b8cad	feat: integrate storagev2 into loading segment (#29336 ) issue: #29335 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-01-12 18:10:51 +08:00
yah01	26e900180e	fix: the insert count is zero after setting the pointer to nil (#29870 ) as a result, the EntitiesNum metric would never be reduced fix: #29766 Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-11 14:56:50 +08:00
yah01	44fe06f198	enhance: skip loading duplicated index (#29715 ) this protects index loading from failure and speeds up the loading process Signed-off-by: yah01 <yang.cen@zilliz.com>	2024-01-11 11:52:49 +08:00
congqixia	d6429933a7	enhance: make Load process traceable in querynode & segcore (#29858 ) See also #29803 This PR: - Add trace span for `LoadIndex` & `LoadFieldData` in segment loader - Add `TraceCtx` parameter for `Index.Load` in segcore - Add span for ReadFiles & Engine Load for Memory/Disk Vector index --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-10 21:58:51 +08:00
yah01	d357139064	fix: the entities num metric may be contributed more than once (#29767 ) the growing segments contribute to this metric while inserting and putting into the manager, but the current impl inserts data before putting the segments into manager, which leads to double contributions fix: #29766 Signed-off-by: yah01 <yah2er0ne@outlook.com>	2024-01-10 10:00:51 +08:00
zhenshan.cao	60e88fb833	fix: Restore the MVCC functionality. (#29749 ) When the TimeTravel functionality was previously removed, it inadvertently affected the MVCC functionality within the system. This PR aims to reintroduce the internal MVCC functionality as follows: 1. Add MvccTimestamp to the requests of Search/Query and the results of Search internally. 2. When the delegator receives a Query/Search request and there is no MVCC timestamp set in the request, set the delegator's current tsafe as the MVCC timestamp of the request. If the request already has an MVCC timestamp, do not modify it. 3. When the Proxy handles Search and triggers the second phase ReQuery, divide the ReQuery into different shards and pass the MVCC timestamp to the corresponding Query requests. issue: #29656 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-01-09 11:38:48 +08:00
congqixia	fe47deebf3	fix: Set & Return correct SegmentLevel in querynode segment manager (#29740 ) See also #27349 The segment level label in querynode used `Legacy` before segment level was correctly passed in Load request. Now this attribute is still using legacy so the metrics do not look right. This PR adds a parameter for `NewSegment` and passes correct values for each invocation. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-08 14:16:48 +08:00
wei liu	b45d08b47b	enhance: Add ctx for load index logs (#29686 ) This PR adds ctx for load index logs Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-01-05 14:24:49 +08:00
congqixia	b251c3a682	enhance: add ctx for HandleCStatus and callers (#29517 ) See also #29516 Make `HandleCStatus` print trace id for better logging Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-27 16:10:47 +08:00
yah01	b8318fcd7d	enhance: improve the handling for segcore error (#29471 ) - fix lost exception details in segcore - improve the logs of handling errors from segcore Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-26 14:06:46 +08:00
yah01	a0e1a1eb31	feat: support enable/disable mmap for index (#29005 ) support enable/disable mmap for index, the user could alter the index's mode by `AlterIndex` method related: https://github.com/milvus-io/milvus/issues/21866 --------- Signed-off-by: yah01 <yah2er0ne@outlook.com> Signed-off-by: yah01 <yang.cen@zilliz.com>	2023-12-21 18:07:24 +08:00
yah01	61fc822207	fix: creating growing segments may introduce many threads (#29306 ) many growing segments may be created in a short time and there is no restriction to the process, the CGO call will leave many threads related: #29282 Signed-off-by: yah01 <yah2er0ne@outlook.com>	2023-12-20 10:06:43 +08:00
congqixia	dcb662d9ed	enhance: Refine C.NewSegment response and handle exception (#28952 ) See also #28795 Original `C.NewSegment` may panic if some condition is not met; this PR changes the response struct to `CNewSegmentResult`, which contains `C.CStatus` and may return the caught exception --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2023-12-07 13:34:35 +08:00

90 Commits