milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-11-29 10:28:41 +08:00

Author	SHA1	Message	Date
Zhen Ye	c6dcef7b84	enhance: move segcore codes of segment into one package (#37722 ) issue: #33285 - move most cgo operations related to search/query into segcore package for reusing for streamingnode. - add go unittest for segcore operations. Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-29 10:22:36 +08:00
Zhen Ye	3f1614e9d9	enhance: add trace_id into segcore logs (#37656 ) issue: #37655 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-18 10:20:30 +08:00
Zhen Ye	f07aa72589	enhance: make milvus image with asan available (#37050 ) issue: #35854 Signed-off-by: chyezh <chyezh@outlook.com>	2024-10-24 10:05:29 +08:00
yellow-shine	8902e2220e	enhance: enable asan for cpp unittest (#37041 ) https://github.com/milvus-io/milvus/issues/35854 Signed-off-by: chyezh <chyezh@outlook.com> Co-authored-by: chyezh <chyezh@outlook.com>	2024-10-23 17:21:27 +08:00
wei liu	d51a808851	fix: Rootcoord stuck at graceful stop progress (#36880 ) issue: #34553 When rootcoord triggers graceful stop, it blocks until all RPCs have finished. For a create collection request, rootcoord must block until datacoord finishes watching all channels, but datacoord needs to call `rootcoord.Alloc` while watching a channel, and rootcoord no longer responds to new requests. As a result, create collection gets stuck, and so does the graceful stop. This PR removes the call to `rootcoord.Alloc` to resolve the logical deadlock during graceful stop. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-10-17 12:15:25 +08:00
Bingyi Sun	6851738fd1	fix: fix `make generate-mockery` panic with go1.22 (#36830 ) https://github.com/milvus-io/milvus/issues/36831 Fix `make generate-mockery` panic. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-17 12:11:31 +08:00
aoiasd	db34572c56	feat: support load and query with bm25 metric (#36071 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-10-11 10:23:20 +08:00
yihao.dai	a61668c77e	feat: Introduce stats task for import (#35868 ) This PR introduces a stats task for import: 1. Define new `Stats` and `IndexBuilding` states for importJob 2. Add a new stats step to the import process: trigger the stats task and wait for its completion 3. Abort the stats task if the import job fails issue: https://github.com/milvus-io/milvus/issues/33744 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-15 15:17:08 +08:00
yihao.dai	ef451f5e1f	enhance: Add build-go target (#35844 ) Add a `build-go` target to the Makefile that only compiles Go. issue: https://github.com/milvus-io/milvus/issues/35611 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-09-01 17:05:09 +08:00
zhenshan.cao	d10aa4626f	enhance: [skip e2e] add make run-test-cpp with support for filter gtest (#35829 ) Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-29 20:03:02 +08:00
Zhen Ye	99dff06391	enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406 ) issue: #33285 - using streaming service in insert/upsert/flush/delete/querynode - fixup flusher bugs and refactor the flush operation - enable streaming service for dml and ddl - pass the e2e when enabling streaming service - pass the integration test when enabling streaming service --------- Signed-off-by: chyezh <chyezh@outlook.com> Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-29 10:03:08 +08:00
Zhen Ye	75da36d1aa	enhance: enable asan for milvus (#35627 ) issue: #35626 Signed-off-by: chyezh <chyezh@outlook.com>	2024-08-23 21:06:58 +08:00
congqixia	582d2eec79	enhance: Move datanode/indexnode manager to session pkg (#35634 ) Related to #28861 Move session manager, worker manager to session package. Also renaming each manager to corresponding node name(datanode, indexnode). --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-22 16:02:56 +08:00
congqixia	b4743b4ca8	enhance: [GoSDK] Sync latest milvus proto and mockery (#35511 ) Related to #35443 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-19 16:22:55 +08:00
congqixia	2736a8b88c	enhance: Update Makefile to generate mockery (#35517 ) Some mockery commands are out of date and fail to work. This PR updates these commands to match the current packages. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-16 18:44:53 +08:00
smellthemoon	fa41c7960e	fix:use echo instead of print in Makefile (#35373 ) #35372 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-08-09 14:16:17 +08:00
congqixia	634eadd0a1	enhance: Remove duplicated `generate-mockery-flushcommon` target (#35331 ) The Makefile contains two `generate-mockery-flushcommon` targets, which causes an unwanted warning during builds. This PR removes the one with fewer commands to fix the problem. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-08-07 14:32:22 +08:00
shaoyue	1db099973f	enhance: Gin & restfulv1 handler use sonic json (#35020 ) /cc @czs007 @congqixia /kind enhancement Signed-off-by: haorenfsa <haorenfsa@gmail.com>	2024-08-05 16:44:17 +08:00
yihao.dai	a4439cc911	enhance: Implement flusher in streamingNode (#34942 ) - Implement flusher to: - Manage the pipelines (creation, deletion, etc.) - Manage the segment write buffer - Manage sync operation (including receive flushMsg and execute flush) - Add a new `GetChannelRecoveryInfo` RPC in DataCoord. - Reorganize packages: `flushcommon` and `datanode`. issue: https://github.com/milvus-io/milvus/issues/33285 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-08-02 18:30:23 +08:00
zhenshan.cao	aa247f192d	enhance: remove unused code for StorageV2 (#35132 ) issue: https://github.com/milvus-io/milvus/issues/34168 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-08-01 12:08:13 +08:00
congqixia	de8a266d8a	enhance: Enable linux code checker (#35084 ) See also #34483 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-30 15:53:51 +08:00
congqixia	d36cfe71e5	enhance: Refine protobuf dependency installation in Makefile (#35072 ) Related to #34394 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-29 21:15:48 +08:00
wei liu	c45f38aa61	enhance: Update protobuf-go to protobuf-go v2 (#34394 ) issue: #34252 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-07-29 11:31:51 +08:00
chyezh	1cff55381d	enhance: add manual alloc segment rpc for datacoord (#35002 ) issue: #33285 - segment allocation will move to streamingnode, so a manual alloc segment rpc is required Signed-off-by: chyezh <chyezh@outlook.com>	2024-07-26 10:15:46 +08:00
Ted Xu	63f0154dfb	fix: enable milvus.yaml check (#34567 ) See #32168 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-07-25 18:05:46 +08:00
chyezh	fda720b880	enhance: streaming service grpc utilities (#34436 ) issue: #33285 - add two grpc resolver (by session and by streaming coord assignment service) - add one grpc balancer (by serverID and roundrobin) - add lazy conn to avoid block by first service discovery - add some utility function for streaming service Signed-off-by: chyezh <chyezh@outlook.com>	2024-07-15 20:49:38 +08:00
congqixia	776ffee840	enhance: Tag gotestsum version when install deps (#34308 ) Tag the gotestsum version via ldflags to avoid reinstalling the gotestsum binary on every local run Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-03 14:44:13 +08:00
jaime	d6afb31b94	enhance: make subfunctions of datanode component modular (#33992 ) issue: #33994 also remove deprecated channel manager based on the etcd implementation Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-07-01 14:46:07 +08:00
chyezh	b9237280c2	enhance: wal interface definition (#33745 ) issue: #33285 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-24 10:34:12 +08:00
congqixia	35ea775c14	enhance: Add rules and fix for go_client e2e code style (#34033 ) See also #31293 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-21 10:32:02 +08:00
chyezh	2b7ee1968f	enhance: new message interface for log service (#33286 ) issue: #33285 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-11 10:38:01 +08:00
chyezh	f53ab54c5d	enhance: async cgo utility (#33133 ) issue: #30926, #33132 - implement future-based cgo utility. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-09 22:55:53 +08:00
cai.zhang	27cc9f2630	enhance: Support analyze data (#33651 ) issue: #30633 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: chasingegg <chao.gao@zilliz.com>	2024-06-06 17:37:51 +08:00
zhenshan.cao	ac4f3997ce	enhance: Reconstructing Compaction to possess persistence capability (#33265 ) issue #33586 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2024-06-05 10:17:50 +08:00
XuanYang-cn	22bddde5ff	enhance: Tidy compactor and remove dup codes (#32198 ) See also: #32451 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-05-23 09:53:40 +08:00
shaoting-huang	de7901121f	Upgrade go from 1.20 to 1.21 (#33047 ) Signed-off-by: shaoting-huang [shaoting-huang@zilliz.com] issue: https://github.com/milvus-io/milvus/issues/32982

# Background

Go 1.21 introduces several improvements and changes over Go 1.20 and is now quite stable. According to the [Go 1.21 Release Notes](https://tip.golang.org/doc/go1.21), the big difference in Go 1.21 is that Profile-Guided Optimization (PGO) is enabled by default, which can improve performance by around 2-14%. The PGO workflow, in summary:

1. Build the initial binary (without PGO)
2. Deploy it to the production environment
3. Run the program and collect performance analysis data (CPU pprof)
4. Analyze the collected data and select a performance profile for PGO
5. Place the profile in the main package directory and name it default.pgo
6. go build detects the default.pgo file and enables PGO
7. Build and release the updated binary (with PGO)
8. Iterate and repeat the above steps

<img width="657" alt="Screenshot 2024-05-14 at 15 57 01" src="https://github.com/milvus-io/milvus/assets/167743503/b08d4300-0be1-44dc-801f-ce681dabc581">

# What does this PR do

There are three experiments: a search benchmark on the Zilliz test platform, a search benchmark with the open-source [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file), and a search benchmark with PGO. We run the search benchmark on both the Zilliz test platform and VectorDBBench to reduce reliance on a single experimental result. We also validate the performance gain from PGO.

## Search Benchmark Report by Zilliz Test Platform

An upgrade to Go 1.21 was conducted on a Milvus Standalone server equipped with 16 CPUs and 64GB of memory. Search performance was evaluated using a local dataset of 1 million entries with an L2 metric type in a 768-dimensional space. The system was tested for concurrent searches with 50 concurrent tasks for 1 hour, each with a 20-second interval. One server is used for the comparison, rather than two, to guarantee the same data source and the same segment state after compaction.

Test Sequence:

1. Go 1.20 Initial Run: Insert data, build index, load index, and search.
2. Go 1.20 Rebuild: Rebuild the index with the same dataset, load index, and search.
3. Go 1.21 Load: Upgrade to Go 1.21 on the same server. Then load the index from the second run, and search.
4. Go 1.21 Rebuild: Rebuild the index with the same dataset, load index, and search.

Search Metrics:

| Metric | Go 1.20 | Go 1.20 Rebuild Index | Go 1.21 | Go 1.21 Rebuild Index |
|----------------------------|------------------|-----------------|------------------|-----------------|
| `search requests` | 10,942,683 | 16,131,726 | 16,200,887 | 16,331,052 |
| `search fails` | 0 | 0 | 0 | 0 |
| `search RT_avg` (ms) | 16.44 | 11.15 | 11.11 | 11.02 |
| `search RT_min` (ms) | 1.30 | 1.28 | 1.31 | 1.26 |
| `search RT_max` (ms) | 446.61 | 233.22 | 235.90 | 147.93 |
| `search TP50` (ms) | 11.74 | 10.46 | 10.43 | 10.35 |
| `search TP99` (ms) | 92.30 | 25.76 | 25.36 | 25.23 |
| `search RPS` | 3,039 | 4,481 | 4,500 | 4,536 |

### Key Findings

Index build time was 340.39 ms with Go 1.20 and 337.60 ms with Go 1.21, a negligible difference in index construction. However, Go 1.21 offers slightly better search performance than Go 1.20, with improvements in handling concurrent tasks and lower response times.

## Search Benchmark Report by VectorDBBench

Follow [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file) to create a VectorDBBench test for Go 1.20 and Go 1.21. We test search performance with Go 1.20 and Go 1.21 (without PGO) on the Milvus Standalone system. The tests use the Cohere dataset with 1 million entries in a 768-dimensional space and the COSINE metric type.

Search Metrics:

| Metric | Go 1.20 | Go 1.21 without PGO |
|--|--|--|
| Load Duration (seconds) | 1195.95 | 976.37 |
| Queries Per Second (QPS) | 841.62 | 875.89 |
| 99th Percentile Serial Latency (seconds) | 0.0047 | 0.0076 |
| Recall | 0.9487 | 0.9489 |

### Key Findings

Go 1.21 shows a shorter load duration and a higher search QPS.

## PGO Performance Test

Milvus has already added [net/http/pprof](https://pkg.go.dev/net/http/pprof) in the metrics, so we can fetch the CPU profile directly by running `curl -o default.pgo "http://${MILVUS_SERVER_IP}:${MILVUS_SERVER_PORT}/debug/pprof/profile?seconds=${TIME_SECOND}"` during the first search and saving the result as default.pgo. I then built Milvus with PGO and reran the search against the same index. The results are below:

Search Metrics:

| Metric | Go 1.21 Without PGO | Go 1.21 With PGO | Change (%) |
|---------------------------------------------|------------------|-----------------|------------|
| `search Requests` | 2,644,583 | 2,837,726 | +7.30% |
| `search Fails` | 0 | 0 | N/A |
| `search RT_avg` (ms) | 11.34 | 10.57 | -6.78% |
| `search RT_min` (ms) | 1.39 | 1.32 | -5.18% |
| `search RT_max` (ms) | 349.72 | 143.72 | -58.91% |
| `search TP50` (ms) | 10.57 | 9.93 | -6.05% |
| `search TP99` (ms) | 26.14 | 24.16 | -7.56% |
| `search RPS` | 4,407 | 4,729 | +7.30% |

### Key Findings

PGO led to a notable improvement in search performance, reducing the maximum response time by about 59% and increasing search RPS by 7.3%.

### Further Analysis

Generate a diff flame graph between the two CPU profiles by running `go tool pprof -http=:8000 -diff_base nopgo.pgo pgo.pgo -normalize`

<img width="1894" alt="goprofiling" src="https://github.com/milvus-io/milvus/assets/167743503/ab9e91eb-95c7-4963-acd9-d1c3c73ee010">

Further insight into HnswIndexNode and the Milvus search handler:

<img width="1906" alt="hnsw" src="https://github.com/milvus-io/milvus/assets/167743503/a04cf4a0-7c97-4451-b3cf-98afc20a0b05">
<img width="1873" alt="search_handler" src="https://github.com/milvus-io/milvus/assets/167743503/5f4d3982-18dd-4115-8e76-460f7f534c7f">

After applying PGO to the Milvus server, the CPU utilization of the faiss::fvec_L2 function decreased. This optimization significantly enhances the performance of the [HnswIndexNode::Search::searchKnn](`e0c9c41aa2/src/index/hnsw/hnsw.cc (L203)`) method, which Knowhere invokes frequently during high-concurrency searches. As the Go release notes explain, the Go compiler may inline functions more aggressively during the second build, using the CPU profile collected from the first run. As a result, the search handler efficiency within Milvus DataNode has improved, allowing the server to process a higher number of search queries per second (QPS).

# Conclusion

The combination of Go 1.21 and PGO has led to substantial improvements in search performance for the Milvus server, particularly in search QPS and response times, making it more efficient at handling high-concurrency search operations.

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-05-22 13:21:39 +08:00
congqixia	7eeb120aab	enhance: Add lint rules for client pkg and fix problems (#33180 ) See also #31293 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-20 20:47:38 +08:00
wei liu	f1c9986974	enhance: Skip return data distribution if no change happen (#32814 ) issue: #32813 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-17 10:11:37 +08:00
aoiasd	cbbfb5b6d6	enhance: Add update milvus yaml command to makefile (#32857 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-10 14:31:37 +08:00
yiwangdr	b1eacb2ae8	feat: datacoord/node watch based on rpc (#32036 ) issue: https://github.com/milvus-io/milvus/issues/25309 Signed-off-by: yiwangdr <yiwangdr@gmail.com>	2024-05-07 15:49:30 +08:00
SimFG	2944971507	enhance: fix the dev docker yaml and use `go install` to install the gotestsum (#32197 ) /kind improvement Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-04-12 19:17:18 +08:00
SimFG	c012e6786f	feat: support rate limiter based on db and partition levels (#31070 ) issue: https://github.com/milvus-io/milvus/issues/30577 co-author: @jaime0815 --------- Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com> Signed-off-by: SimFG <bang.fu@zilliz.com> Co-authored-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-04-12 16:01:19 +08:00
congqixia	f399416b92	enhance: Use gotestsum to run go unit test (#31622 ) See also #31490 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-27 15:29:10 +08:00
Bingyi Sun	0ac9bb4a9c	enhance: add mmap migration tool (#30909 ) issue: #30908 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-25 15:51:09 +08:00
congqixia	4da8b6607d	enhance: Add scripts to use `gotestsum` to execute integration test (#31490 ) See also #31489 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-03-22 10:29:06 +08:00
Jiquan Long	375190e76e	fix: cpp format check not work (#30767 ) fix: #30765 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-02-22 19:40:53 +08:00
zhagnlu	976b6fc0e4	enhance: change opendal as compile configurable (#30384 ) #30373 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-02-20 19:16:52 +08:00
congqixia	a68b32134a	fix: Verify sync task target segment and retry if not match (#30500 ) See also #27675 #30469 A segment can be compacted while its sync task is running. In the previous implementation, the sync task held only the old segment ID as its KeyLock, so compaction on the compacted-to segment could run in parallel with the delta sync of this task. This PR introduces verification of the sync target segment: the task checks that the lock it holds matches the target segment before it actually syncs. If the check fails, the sync task returns an `errTargetSegementNotMatch` error so that the manager re-fetches the current target segment ID. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-02-05 11:33:43 +08:00
congqixia	5be909982d	enhance: add MockSerializer generation command into Makefile (#29713 ) See also #27675 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-01-05 21:42:47 +08:00
XuanYang-cn	632d8b3743	enhance: Change DN channelmanager into interface (#29307 ) See also: #28854 Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2023-12-27 16:00:48 +08:00

314 Commits