The milvus_benchmark is a non-functional testing tool or service which allows users to run tests on a k8s cluster or locally. The primary use case is performance/load/stability testing, and the objective is to expose problems in the Milvus project.
- helm: install the server with helm, which manages Milvus in a k8s cluster; the test stage can be integrated into an Argo workflow or a Jenkins pipeline
The top level is the runner type; the test types include `search_performance/build_performance/insert_performance/accuracy/locust_insert/...`, and each test type corresponds to a different runner component defined in the `runners` directory
There are several kinds of data types provided in the benchmark:
1. Insert from `local`: randomly generated vectors
2. Insert from a file: other data types such as `sift/deep`. The following list shows where the source data comes from; make sure to convert it to the `.npy` file format that can be loaded by `numpy`, and update the value of `RAW_DATA_DIR` in `config.py` to your own data path
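For illustration, a minimal sketch of preparing both kinds of data in Python (the helper names are hypothetical, and the `RAW_DATA_DIR` path shown is only an example):

```python
import os

import numpy as np

# Example value; it should match RAW_DATA_DIR in config.py and point at your own data path
RAW_DATA_DIR = "/test/milvus/raw_data"

def generate_vectors(nb, dim):
    # "local" data type: randomly generated float vectors
    return np.random.random((nb, dim)).astype(np.float32)

def load_vectors_from_file(file_name):
    # file-based data types such as sift/deep: load a pre-converted .npy file
    return np.load(os.path.join(RAW_DATA_DIR, file_name))
```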
There are also many other optional datasets that can be used to test Milvus; see http://big-ann-benchmarks.com/index.html for reference
If the first few characters of the `collection_name` in the test suite yaml match one of the above types, the corresponding data will be generated while inserting entities into Milvus
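As a rough sketch of that prefix matching (the prefix set and helper below are illustrative, not the exact logic in the repository):

```python
# Illustrative prefix set; the benchmark derives the data type from the
# leading characters of collection_name in the test suite yaml
DATA_TYPE_PREFIXES = ("sift", "deep", "random", "local")

def parse_data_type(collection_name):
    # e.g. "sift_10m_128_l2" -> "sift"
    for prefix in DATA_TYPE_PREFIXES:
        if collection_name.startswith(prefix):
            return prefix
    raise ValueError("unknown data type for collection: %s" % collection_name)
```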
Also, you should provide the source data file path in the `source_file` field if running with the `ann_accuracy` runner type. The source datasets can be found at https://github.com/erikbern/ann-benchmarks/; `SIFT/Kosarak/GloVe-200` are the datasets most frequently used in regression testing for Milvus
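The ann-benchmarks datasets are distributed as HDF5 files; a minimal sketch of reading one with `h5py` (the file path is only an example, and the dataset keys follow the ann-benchmarks layout):

```python
import h5py

# Example path; `source_file` in the test suite should point at the downloaded dataset
source_file = "/test/milvus/ann_hdf5/sift-128-euclidean.hdf5"

with h5py.File(source_file, "r") as f:
    train = f["train"][:]          # base vectors to insert into Milvus
    test = f["test"][:]            # query vectors
    neighbors = f["neighbors"][:]  # ground-truth neighbor ids used to compute recall
```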
The test results can be used to analyze regressions or improvements of the Milvus system, so we upload the metrics of each test result when a test suite run finishes, and then use `redash` to make sense of our data
The `env` component defines the server environment and environment management; the instance of `env` corresponds to the run mode of the benchmark:
- `local`: Only defines the host and port for testing
- `helm/docker`: Install and uninstall the server in the benchmark stage
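A simplified sketch of what the `env` abstraction might look like (the class and method names are assumptions for illustration, not the exact interface in the codebase):

```python
class BaseEnv:
    """Manages the Milvus server environment for one benchmark run."""

    def start_up(self, **kwargs):
        raise NotImplementedError

    def tear_down(self):
        raise NotImplementedError

class LocalEnv(BaseEnv):
    # "local" run mode: only record the host/port of an already running server
    def start_up(self, host, port):
        self.host, self.port = host, port

    def tear_down(self):
        # nothing to install or uninstall in local mode
        pass
```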
- `runner`: The actual executor in the benchmark. Each test type defined in a test suite generates a corresponding runner instance, and there are three stages in `runner`:
  - `extract_cases`: Several test cases are defined in each test suite yaml; each case shares the same server environment and the same `prepare` stage, but the `metric` of each case differs, so the cases need to be extracted from the test suite before they run
  - `prepare`: Prepare the data and operations; for example, before running a search, the index needs to be created and the data needs to be loaded
  - `run_case`: Perform the core operation and set the `metric` value
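A rough sketch of how the three stages fit together (names are assumptions for illustration only):

```python
class BaseRunner:
    def extract_cases(self, suite):
        # split one test suite into individual cases, each with its own metric
        raise NotImplementedError

    def prepare(self, case):
        # e.g. create the collection, insert data, build the index, load it
        raise NotImplementedError

    def run_case(self, case):
        # execute the core operation and record the metric value
        raise NotImplementedError

def run_suite(runner, suite):
    # every case goes through the same prepare stage before it runs
    for case in runner.extract_cases(suite):
        runner.prepare(case)
        runner.run_case(case)
```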
- `suites`: There are two ways to provide the content to be tested as input parameters:
  - Test suite files under the `suites` directory
  - Test suite configmap names, including `server_config_map` and `client_config_map`, if using an Argo workflow
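For the first option, a suite is just a yaml file that the benchmark loads; a minimal sketch (the file name is an example):

```python
import yaml

# Example only: pass any file under the `suites` directory to the benchmark
with open("suites/insert_performance.yaml") as f:
    suite = yaml.safe_load(f)
```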
- `update.py`: When using an Argo workflow as the benchmark pipeline, there are two steps in the workflow template: `install-milvus` and `client-test`
  - In the `install-milvus` step, `update.py` is used to generate a new `values.yaml`, which is passed as a parameter to the `helm install` operation
  - In the `client-test` step, it runs `main.py` with run mode `local`, receiving the Milvus host and port as command-line parameters
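Conceptually, the `client-test` step invokes argument handling along these lines in `main.py` (the flag names here are assumptions; check `main.py --help` for the actual interface):

```python
import argparse

# Illustrative only: how the Milvus host/port handed over by the workflow
# could be received in local run mode
parser = argparse.ArgumentParser()
parser.add_argument("--local", action="store_true", help="use the local run mode")
parser.add_argument("--host", default="127.0.0.1", help="Milvus server host")
parser.add_argument("--port", default="19530", help="Milvus server port")
parser.add_argument("--suite", help="path to the test suite yaml")
args = parser.parse_args()
```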
### Conceptual overview
The following diagram shows the runtime execution graph of the benchmark (local mode based on argo workflow)
As the metrics are uploaded to the database (currently MongoDB), we use Redash (https://redash.io/) to visualize the test results.
For example, in order to find the most suitable insert batch size when preparing data with Milvus, a benchmark test suite of type `bp_insert_performance` runs regularly; the different `ni_per` values in this suite yaml are executed, and the average response time and TPS (number of rows inserted per second) are collected.
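As a rough illustration of how those two metrics relate to `ni_per` (a sketch of the computation, not the actual reporting code):

```python
import time

def insert_in_batches(insert_func, vectors, ni_per):
    # insert `vectors` in batches of `ni_per` rows and compute the two metrics
    # collected by bp_insert_performance: average response time and TPS
    durations = []
    for start in range(0, len(vectors), ni_per):
        t0 = time.time()
        insert_func(vectors[start:start + ni_per])
        durations.append(time.time() - t0)
    avg_response_time = sum(durations) / len(durations)  # seconds per batch of ni_per rows
    tps = len(vectors) / sum(durations)                  # rows inserted per second
    return avg_response_time, tps
```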