Mirror of https://gitee.com/milvus-io/milvus.git (synced 2024-12-02 20:09:57 +08:00)
Commit 548c82eca5

Benchmark Milvus with https://github.com/qdrant/vector-db-benchmark, using the 'deep-image-96-angular' dataset. Perf profiling during the 'upload + index' stage of vector-db-benchmark shows the following hot spots:

    39.59%--github.com/milvus-io/milvus/internal/storage.MergeInsertData
    |
    |--21.43%--github.com/milvus-io/milvus/internal/storage.MergeFieldData
    |  |
    |  |--17.22%--runtime.memmove
    |  |
    |  |--1.53%--asm_exc_page_fault
    |  ......
    |
    |--18.16%--runtime.memmove
    |
    |--1.66%--asm_exc_page_fault
    ......

The hot code path is in storage.MergeInsertData(), which updates buffer.buffer by creating a new 'InsertData' instance and merging both the old buffer.buffer and addedBuffer into it. The hot spots appear when the Go runtime.memmove is called to move buffer.buffer, which is large (>1 MB).

To avoid this overhead, update storage.MergeInsertData() to append addedBuffer onto buffer.buffer, instead of moving both buffer.buffer and addedBuffer into a new 'InsertData'. This change removes the 'runtime.memmove' hot spots from the perf profiling output. Additionally, the 'upload + index' time, one of vector-db-benchmark's performance metrics, is reduced by around 60%.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
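The append-in-place optimization described above can be sketched as follows. This is a minimal, hypothetical illustration, not the actual Milvus storage code: the `InsertData` struct, its `map[int64][]float32` layout, and the function names `mergeByCopy`/`mergeByAppend` are all simplified stand-ins for the real types in `internal/storage`.

```go
package main

import "fmt"

// InsertData is a simplified stand-in for storage.InsertData:
// per-field columnar buffers keyed by field ID (hypothetical layout).
type InsertData struct {
	Data map[int64][]float32
}

// mergeByCopy mimics the old path: allocate a fresh InsertData and
// copy both the existing buffer and the added buffer into it. For a
// large existing buffer this re-copies all of its rows on every merge,
// which is where the runtime.memmove hot spots come from.
func mergeByCopy(buffer, added *InsertData) *InsertData {
	merged := &InsertData{Data: make(map[int64][]float32)}
	for fieldID, vals := range buffer.Data {
		merged.Data[fieldID] = append(merged.Data[fieldID], vals...)
	}
	for fieldID, vals := range added.Data {
		merged.Data[fieldID] = append(merged.Data[fieldID], vals...)
	}
	return merged
}

// mergeByAppend mimics the new path: append the added buffer onto the
// existing one in place, so only the newly added rows are moved.
func mergeByAppend(buffer, added *InsertData) {
	for fieldID, vals := range added.Data {
		buffer.Data[fieldID] = append(buffer.Data[fieldID], vals...)
	}
}

func main() {
	buffer := &InsertData{Data: map[int64][]float32{100: {1, 2}}}
	added := &InsertData{Data: map[int64][]float32{100: {3, 4}}}
	mergeByAppend(buffer, added)
	fmt.Println(buffer.Data[100]) // [1 2 3 4]
}
```

The key difference is asymptotic: with N rows already buffered and M rows added, the copy path moves N+M rows per merge while the append path moves only M (plus occasional slice growth), which matches the profile's observation that the cost was dominated by moving the large existing buffer.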
Data Node
DataNode is the component that writes insert and delete messages into persistent blob storage, such as MinIO or S3.
Dependencies

- KV store: a KV store that persists messages into blob storage.
- Message stream: receives messages and publishes information.
- Root Coordinator: provides the latest unique IDs.
- Data Coordinator: provides flush information and the message stream to subscribe to.
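To make the KV-store dependency concrete, here is a hedged sketch of what such an abstraction might look like. The `BlobKV` interface and the in-memory `memKV` below are hypothetical names for illustration only; the actual interfaces in the Milvus codebase are richer than this.

```go
package main

import "fmt"

// BlobKV is a hypothetical interface for persisting serialized
// binlogs into blob storage (e.g. MinIO or S3).
type BlobKV interface {
	Save(key string, value []byte) error
	Load(key string) ([]byte, error)
}

// memKV is an in-memory stand-in for a blob store, for illustration.
type memKV struct{ m map[string][]byte }

func newMemKV() *memKV { return &memKV{m: make(map[string][]byte)} }

func (kv *memKV) Save(key string, value []byte) error {
	// Copy the value so the store owns its data independently
	// of the caller's buffer.
	kv.m[key] = append([]byte(nil), value...)
	return nil
}

func (kv *memKV) Load(key string) ([]byte, error) {
	v, ok := kv.m[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

func main() {
	var kv BlobKV = newMemKV()
	// A made-up key layout; real binlog paths differ.
	_ = kv.Save("binlog/segment-1/field-100", []byte{0x01, 0x02})
	blob, err := kv.Load("binlog/segment-1/field-100")
	fmt.Println(len(blob), err) // 2 <nil>
}
```

Coding against a small interface like this is what lets the same write path target MinIO, S3, or an in-memory store in tests.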