milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2024-11-30 02:48:45 +08:00

Author	SHA1	Message	Date
Ted Xu	63f0154dfb	fix: enable milvus.yaml check (#34567 ) See #32168 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-07-25 18:05:46 +08:00
smellthemoon	5616b7e8d2	enhance: support null in c data_datacodec and load null value (#32183 ) 1. support read and write null in segcore will store valid_data(use uint8_t type to save memory) in fieldData. 2. support load null binlog reader read and write data into column(sealed segment), insertRecord(growing segment). In sealed segment, store valid_data directly. In growing segment, considering prior implementation and easy code reading, it converts uint8_t to fbvector<bool>, which may be optimized in the future. 3. retrieve valid_data. parse valid_data in search/query. #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-07-23 16:07:51 +08:00
wayblink	c2b8b5fe84	enhance: refine clustering compaction configs and logs (#34784 ) #30633 --------- Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-07-21 19:23:40 +08:00
yihao.dai	b22e549844	enhance: Rename config of sealing by growing segments size (#34787 ) /kind improvement --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-07-19 20:27:41 +08:00
chyezh	86eff6e589	enhance: streaming node client implementation (#34653 ) issue: #33285 - add streaming node grpc client wrapper - add unittest for streaming node grpc client side - fix binary unsafe bug for message --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-07-19 17:37:40 +08:00
wayblink	c79d1af390	enhance: Add compaction task slot usage logic (#34581 ) #34544 Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-07-18 10:27:41 +08:00
yihao.dai	4939f82d4f	enhance: Seal by total growing segments size (#34692 ) Seals the largest growing segment if the total size of growing segments of each shard exceeds the size threshold(default 4GB). Introducing this policy can help keep the size of growing segments within a suitable level, alleviating the pressure on the delegator. issue: https://github.com/milvus-io/milvus/issues/34554 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-07-17 21:45:41 +08:00
SimFG	203fb554a4	enhance: support to config root user's password (#34752 ) - issue: #33058 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-07-17 20:19:42 +08:00
congqixia	8b5754f7fe	enhance: Add segment seal proportion jitter (#34636 ) See also #34574 Add jitter for segment seal proportion to avoid a burst of seal operations in a short period of time. This PR also fixes the license header in the paramtable pkg. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-15 14:47:39 +08:00
congqixia	1a248f2668	enhance: Add param item for segmentFlushInterval (#34629 ) See also #28817 Add paramitem for segment flush interval Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-07-12 18:59:35 +08:00
aoiasd	c1e04534c3	enhance: change access log write cache default config (#34354 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-07-09 18:18:12 +08:00
jaime	21fc5f5d46	enhance: Remove datanode reporting TT based on MQ implementation (#34421 ) issue: #34420 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-07-05 15:48:09 +08:00
XuanYang-cn	8a2be8a457	fix: DataNode might OOM by estimating based on MemorySize (#34201 ) See also: #34136 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-07-04 10:16:17 +08:00
jaime	d6afb31b94	enhance: make subfunctions of datanode component modular (#33992 ) issue: #33994 also remove deprecated channel manager based on the etcd implementation Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-07-01 14:46:07 +08:00
XuanYang-cn	dda70aa81b	fix: LegacyVersionWithoutRPCWatch default value to 2.4.1 (#34184 ) See also: #31933 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-27 19:36:06 +08:00
wayblink	f9a0f7bb25	Add an option to enable/disable vector field clustering key (#34097 ) #30633 Signed-off-by: wayblink <anyang.wang@zilliz.com>	2024-06-25 18:52:04 +08:00
jaime	d08cb885ca	enhance: enable flush rate limiter of collection level (#33837 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-06-24 10:52:03 +08:00
congqixia	cc77363b66	enhance: Set maxPartitionNum default value to 1024 (#33949 ) See also #30059 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-06-20 09:11:59 +08:00
chyezh	2f6f964bc8	enhance: [skip e2e] modify gc configuration document (#33946 ) issue: #31740 Signed-off-by: chyezh <chyezh@outlook.com>	2024-06-19 09:57:59 +08:00
cqy123456	32f685ff12	enhance: growing segment support mmap (#32633 ) issue: https://github.com/milvus-io/milvus/issues/32984 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-06-18 14:42:00 +08:00
XuanYang-cn	1629833060	enhance: Add consts of MsgDispatcher to configs (#33679 ) See also: #33676 --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2024-06-07 14:21:59 +08:00
SimFG	ecee7d90d4	enhance: try to speed up the loading of small collections (#33570 ) - issue: #33569 Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-06-07 08:25:53 +08:00
cai.zhang	27cc9f2630	enhance: Support analyze data (#33651 ) issue: #30633 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com> Co-authored-by: chasingegg <chao.gao@zilliz.com>	2024-06-06 17:37:51 +08:00
cai.zhang	77637180fa	enhance: Periodically synchronize segments to datanode watcher (#33420 ) issue: #32809 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-05-30 13:37:44 +08:00
yihao.dai	bbb69980ac	enhance: Replace 'off' with 'disable' (#33433 ) YAML will automatically parse "off" as a boolean variable. We should avoid using "off" in the future. issue: https://github.com/milvus-io/milvus/issues/32772 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-29 12:17:43 +08:00
Bingyi Sun	6b3e42f8d8	fix: fix wrong default local storage path (#33389 ) issue: https://github.com/milvus-io/milvus/issues/33427 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-05-28 16:49:43 +08:00
aoiasd	59a7a46904	enhance: Merge query stream result for reduce delete task (#32855 ) relate: https://github.com/milvus-io/milvus/issues/32854 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-27 18:15:43 +08:00
yihao.dai	760223f80a	fix: use separate warmup pool and disable warmup by default (#33348 ) 1. use a small warmup pool to reduce the impact of warmup 2. change the warmup pool to nonblocking mode 3. disable warmup by default 4. remove the maximum size limit of 16 for the load pool issue: https://github.com/milvus-io/milvus/issues/32772 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com> Co-authored-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-05-27 01:25:40 +08:00
congqixia	5cdc6ae489	enhance: Sync `deleteBufBytes` config value to default config (#33320 ) The delete buffer size is set to 64MB in milvus.yaml but the default set up shall be 16MB Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-05-24 10:41:40 +08:00
aoiasd	1b4e28b97f	enhance: Check by proxy rate limiter when delete get data by query. (#30891 ) relate: https://github.com/milvus-io/milvus/issues/30927 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-23 20:03:40 +08:00
wei liu	c7be2ce33a	enhance: Decrease bloom filter fp rate to reduce delete impact (#33301 ) When Milvus processes a delete record, it uses bloom filters to find the segment the record belongs to, and a higher bloom filter false-positive (fp) rate causes delete records to be forwarded to the wrong segments. This PR decreases the bloom filter's default fp rate to 0.001. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-23 18:15:41 +08:00
shaoting-huang	de7901121f	Upgrade go from 1.20 to 1.21 (#33047 )

Signed-off-by: shaoting-huang [shaoting-huang@zilliz.com]

issue: https://github.com/milvus-io/milvus/issues/32982

# Background

Go 1.21 introduces several improvements and changes over Go 1.20, which is quite stable now. According to [Go 1.21 Release Notes](https://tip.golang.org/doc/go1.21), the big difference of Go 1.21 is enabling Profile-Guided Optimization by default, which can improve performance by around 2-14%. Here are the summary steps of PGO:

1. Build Initial Binary (Without PGO)
2. Deploying the Production Environment
3. Run the program and collect Performance Analysis Data (CPU pprof)
4. Analyze the Collected Data and Select a Performance Profile for PGO
5. Place the Performance Analysis File in the Main Package Directory and Name It default.pgo
6. go build Detects the default.pgo File and Enables PGO
7. Build and Release the Updated Binary (With PGO)
8. Iterate and Repeat the Above Steps

<img width="657" alt="Screenshot 2024-05-14 at 15 57 01" src="https://github.com/milvus-io/milvus/assets/167743503/b08d4300-0be1-44dc-801f-ce681dabc581">

# What does this PR do

There are three experiments: a search benchmark on the Zilliz test platform, a search benchmark with the open-source [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file), and a search benchmark with PGO. We run the search benchmark on both the Zilliz test platform and VectorDBBench to reduce reliance on a single experimental result. Besides, we validate the performance enhancement with PGO.

## Search Benchmark Report by Zilliz Test Platform

An upgrade to Go 1.21 was conducted on a Milvus Standalone server, equipped with 16 CPUs and 64GB of memory. The search performance was evaluated using a 1 million entry local dataset with an L2 metric type in a 768-dimensional space. The system was tested for concurrent searches with 50 concurrent tasks for 1 hour, each with a 20-second interval.

The reason for using one server rather than two servers to compare is to guarantee the same data source and same segment state after compaction.

Test Sequence:

1. Go 1.20 Initial Run: Insert data, build index, load index, and search.
2. Go 1.20 Rebuild: Rebuild the index with the same dataset, load index, and search.
3. Go 1.21 Load: Upgrade to Go 1.21 within the server. Then load the index from the second run, and search.
4. Go 1.21 Rebuild: Rebuild the index with the same dataset, load index, and search.

Search Metrics:

| Metric | Go 1.20 | Go 1.20 Rebuild Index | Go 1.21 | Go 1.21 Rebuild Index |
|---|---|---|---|---|
| `search requests` | 10,942,683 | 16,131,726 | 16,200,887 | 16,331,052 |
| `search fails` | 0 | 0 | 0 | 0 |
| `search RT_avg` (ms) | 16.44 | 11.15 | 11.11 | 11.02 |
| `search RT_min` (ms) | 1.30 | 1.28 | 1.31 | 1.26 |
| `search RT_max` (ms) | 446.61 | 233.22 | 235.90 | 147.93 |
| `search TP50` (ms) | 11.74 | 10.46 | 10.43 | 10.35 |
| `search TP99` (ms) | 92.30 | 25.76 | 25.36 | 25.23 |
| `search RPS` | 3,039 | 4,481 | 4,500 | 4,536 |

### Key Findings

The index build time was 340.39 ms with Go 1.20 and 337.60 ms with Go 1.21, a negligible difference in index construction. However, Go 1.21 offers slightly better performance in search operations compared to Go 1.20, with improvements in handling concurrent tasks and reducing response times.

## Search Benchmark Report By VectorDb Bench

Follow [VectorDBBench](https://github.com/zilliztech/VectorDBBench?tab=readme-ov-file) to create a VectorDb Bench test for Go 1.20 and Go 1.21. We test the search performance with Go 1.20 and Go 1.21 (without PGO) on the Milvus Standalone system. The tests were conducted using the Cohere dataset with 1 million entries in a 768-dimensional space, utilizing the COSINE metric type.

Search Metrics:

| Metric | Go 1.20 | Go 1.21 without PGO |
|---|---|---|
| Load Duration (seconds) | 1195.95 | 976.37 |
| Queries Per Second (QPS) | 841.62 | 875.89 |
| 99th Percentile Serial Latency (seconds) | 0.0047 | 0.0076 |
| Recall | 0.9487 | 0.9489 |

### Key Findings

Go 1.21 shows faster index loading times and higher search QPS.

## PGO Performance Test

Milvus has already added [net/http/pprof](https://pkg.go.dev/net/http/pprof) in the metrics. So we can curl the CPU profile directly by running `curl -o default.pgo "http://${MILVUS_SERVER_IP}:${MILVUS_SERVER_PORT}/debug/pprof/profile?seconds=${TIME_SECOND}"` to collect the profile as the default.pgo during the first search. Then I build Milvus with PGO and use the same index to run the search again. The result is as below:

Search Metrics:

| Metric | Go 1.21 Without PGO | Go 1.21 With PGO | Change (%) |
|---|---|---|---|
| `search Requests` | 2,644,583 | 2,837,726 | +7.30% |
| `search Fails` | 0 | 0 | N/A |
| `search RT_avg` (ms) | 11.34 | 10.57 | -6.78% |
| `search RT_min` (ms) | 1.39 | 1.32 | -5.18% |
| `search RT_max` (ms) | 349.72 | 143.72 | -58.91% |
| `search TP50` (ms) | 10.57 | 9.93 | -6.05% |
| `search TP99` (ms) | 26.14 | 24.16 | -7.56% |
| `search RPS` | 4,407 | 4,729 | +7.30% |

### Key Findings

PGO led to a notable enhancement in search performance, particularly in reducing the maximum response time by 58% and increasing the search QPS by 7.3%.

### Further Analysis

Generate a diff flame graph between the two CPU profiles by running `go tool pprof -http=:8000 -diff_base nopgo.pgo pgo.pgo -normalize`

<img width="1894" alt="goprofiling" src="https://github.com/milvus-io/milvus/assets/167743503/ab9e91eb-95c7-4963-acd9-d1c3c73ee010">

Further insight into HnswIndexNode and the Milvus Search Handler:

<img width="1906" alt="hnsw" src="https://github.com/milvus-io/milvus/assets/167743503/a04cf4a0-7c97-4451-b3cf-98afc20a0b05">

<img width="1873" alt="search_handler" src="https://github.com/milvus-io/milvus/assets/167743503/5f4d3982-18dd-4115-8e76-460f7f534c7f">

After applying PGO to the Milvus server, the CPU utilization of the faiss::fvec_L2 function has decreased. This optimization significantly enhances the performance of the [HnswIndexNode::Search::searchKnn](`e0c9c41aa2/src/index/hnsw/hnsw.cc (L203)`) method, which is frequently invoked by Knowhere during high-concurrency searches. As the Go release notes explain, the function might be inlined more aggressively by the Go compiler during the second build, using the CPU profile collected from the first run. As a result, the search handler efficiency within Milvus DataNode has improved, allowing the server to process a higher number of search queries per second (QPS).

# Conclusion

The combination of Go 1.21 and PGO has led to substantial enhancements in search performance for Milvus server, particularly in terms of search QPS and response times, making it more efficient for handling high-concurrency search operations.

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-05-22 13:21:39 +08:00
yihao.dai	32560263fa	enhance: Query slot for compaction task (#32881 ) Query slot of compaction in datanode, and transfer the control logic for limiting compaction tasks from datacoord to the datanode. issue: https://github.com/milvus-io/milvus/issues/32809 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-05-17 18:19:38 +08:00
wei liu	cba2c7a3be	enhance: clean channel node info in meta store (#32988 ) issue: #32910 see also: #32911 When channel exclusive mode is enabled, the replica records channel node info in the meta store. If the balance policy changes so that channel exclusive mode is disabled, we should clean up the channel node info in the meta store and stop balancing nodes between channels. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-05-14 10:05:40 +08:00
foxspy	f6777267e3	enhance: add score compute consistency config for knowhere (#32997 ) issue: https://github.com/milvus-io/milvus/issues/32583 related: #32584 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-05-13 14:21:31 +08:00
Bingyi Sun	4724779b3b	enhance: remove fallback keys for config generator (#32946 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-05-13 13:33:31 +08:00
yiwangdr	855192eb3d	fix: sync milvus.yaml (#32920 ) issue: https://github.com/milvus-io/milvus/issues/25309 Signed-off-by: yiwangdr <yiwangdr@gmail.com>	2024-05-10 17:29:31 +08:00
aoiasd	54a51b1236	enhance: Support dynamic config for opentelemetry trace (#32169 ) relate: https://github.com/milvus-io/milvus/issues/31940 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-05-09 17:43:30 +08:00
chyezh	641f702f64	fix: add request resource timeout for lazy load, refactor context usage in cache (#32709 ) issue: #32663 - Use new param to control request resource timeout for lazy load. - Remove the timeout parameter of `Do`, remove `DoWait`. use `context` to control the timeout. - Use `VersionedNotifier` to avoid notify event lost and broadcast, remove the redundant goroutine in cache. related dev pr: #32684 Signed-off-by: chyezh <chyezh@outlook.com>	2024-05-07 16:33:30 +08:00
Bingyi Sun	fecd9c21ba	feat: LRU cache implementation (#32567 ) issue: https://github.com/milvus-io/milvus/issues/32783 This pr is the implementation of lru cache on branch lru-dev. Signed-off-by: sunby <sunbingyi1992@gmail.com> Co-authored-by: chyezh <chyezh@outlook.com> Co-authored-by: MrPresent-Han <chun.han@zilliz.com> Co-authored-by: Ted Xu <ted.xu@zilliz.com> Co-authored-by: jaime <yun.zhang@zilliz.com> Co-authored-by: wayblink <anyang.wang@zilliz.com>	2024-05-06 20:29:30 +08:00
chyezh	2586c2f1b3	enhance: use WalkWithPrefix api for oss, enable pipelined file gc (#31740 ) issue: #19095,#29655,#31718 - Change `ListWithPrefix` of OSS to `WalkWithPrefix`, which works in a pipeline mode. - File garbage collection is performed in a separate goroutine. - Segment Index Recycle cleans index files too. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-04-25 20:41:27 +08:00
Ted Xu	744a54a534	enhance: enforce milvus.yaml assertion in UT (#32357 ) See #32168 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-04-19 16:47:20 +08:00
Ted Xu	78d32bd8b2	enhance: update milvus.yaml (#31832 ) See #32168 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-04-16 16:17:19 +08:00
edward.zeng	b7ff85638d	fix: mvcc database space exceeded for embed etcd (#32048 ) Fix #30314 Signed-off-by: Edward Zeng <jie.zeng@zilliz.com>	2024-04-12 21:39:19 +08:00
jaime	371e6d2c1a	enhance: refine sync memory watermark configuration (#32140 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-04-11 20:07:24 +08:00
yihao.dai	49d109de18	enhance: Use an individual buffer size parameter for imports (#31833 ) Use an individual buffer size parameter for imports and set buffer size to 64MB. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-08 21:07:18 +08:00
yihao.dai	4e264003bf	enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629 ) Feature Introduced: 1. Ensure ImportV2 waits for the index to be built Enhancements Introduced: 1. Utilization of local time for timeout ts instead of allocating ts from rootcoord. 3. Enhanced input file length check for binlog import. 4. Removal of duplicated manager in datanode. 5. Renaming of executor to scheduler in datanode. 6. Utilization of a thread pool in the scheduler in datanode. issue: https://github.com/milvus-io/milvus/issues/28521 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-04-01 20:09:13 +08:00
Bingyi Sun	fbff46a005	enhance: add lazyload global config (#31610 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-03-27 20:23:10 +08:00
groot	5be395354c	fix: minio ssl compatible issue (#31607 ) issue: https://github.com/milvus-io/milvus/issues/30709 Signed-off-by: yhmo <yihua.mo@zilliz.com>	2024-03-27 14:41:20 +08:00
presburger	fe1961ff14	enhance: add comments for gpu mem pool setting (#31231 ) Signed-off-by: yusheng.ma <yusheng.ma@zilliz.com>	2024-03-25 14:41:07 +08:00
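
The seal policy described in commit 4939f82d4f above (seal the largest growing segment once the total size of a shard's growing segments exceeds a threshold, default 4GB) can be sketched roughly as follows. This is only an illustration of the rule as stated in the commit message: the `growingSegment` type and the `pickSegmentToSeal` helper are hypothetical names, not taken from the Milvus code base, and the real implementation may differ.

```go
package main

import "fmt"

// growingSegment is a hypothetical stand-in for a growing segment's metadata.
type growingSegment struct {
	ID        int64
	SizeBytes int64
}

// pickSegmentToSeal inspects the growing segments of a single shard. If their
// combined size exceeds thresholdBytes, it returns the largest segment and
// true; otherwise it returns a zero value and false.
func pickSegmentToSeal(segments []growingSegment, thresholdBytes int64) (growingSegment, bool) {
	var (
		total   int64
		largest growingSegment
		found   bool
	)
	for _, s := range segments {
		total += s.SizeBytes
		if !found || s.SizeBytes > largest.SizeBytes {
			largest, found = s, true
		}
	}
	if !found || total <= thresholdBytes {
		return growingSegment{}, false
	}
	return largest, true
}

func main() {
	const gib = int64(1) << 30
	// Three growing segments on one shard: 1 GiB + 2.5 GiB + 1 GiB = 4.5 GiB.
	shard := []growingSegment{
		{ID: 1, SizeBytes: 1 * gib},
		{ID: 2, SizeBytes: 2*gib + gib/2},
		{ID: 3, SizeBytes: 1 * gib},
	}
	// 4 GiB mirrors the default threshold mentioned in the commit message.
	if seg, ok := pickSegmentToSeal(shard, 4*gib); ok {
		fmt.Printf("seal segment %d\n", seg.ID)
	}
}
```

Sealing only the single largest segment per check keeps each step small; repeated checks would continue sealing until the shard's total drops back under the threshold.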

575 Commits