### Download the model and place it in the models directory
- Link 1:
- Link 2:
### Cross-modal similarity comparison and retrieval for images and text [supports 40 languages]
This example demonstrates the ability to search for images through text (the model itself also supports searching for text through images, or mixed search).
### Main Features
- Built on feature-vector similarity search at the lowest layer
- Millisecond-level search over billions of records on a single server
- Near-real-time search with support for distributed deployment
- Supports inserting, deleting, searching, and updating data at any time
### Background introduction
OpenAI has released two neural networks, CLIP and DALL·E. They combine natural language processing (NLP) with image recognition and achieve a better understanding of everyday images and language.
Previously, text was used to search for text, and images were used to search for images. Now, with the CLIP model, text can be used to search for images, and images can be used to search for text. The implementation idea is to map images and text to the same vector space. In this way, cross-modal similarity comparison and retrieval of images and text can be realized.
- Feature vector space (composed of images and text)
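
The idea of a shared vector space can be illustrated with a minimal sketch. It assumes the text and the images have already been encoded into vectors of the same length; the 3-dimensional vectors and file names below are made up for illustration and are not produced by CLIP. Ranking by cosine similarity is one common choice and is not necessarily the metric this project configures.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_images(text_vector, image_vectors):
    """Return image names sorted from most to least similar to the text vector."""
    scored = [(cosine_similarity(text_vector, vec), name)
              for name, vec in image_vectors.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored]

# Hypothetical embeddings; a real encoder produces hundreds of dimensions.
image_vectors = {
    "car.jpg": [0.9, 0.1, 0.0],
    "dogs_in_snow.jpg": [0.1, 0.8, 0.6],
    "beach.jpg": [0.0, 0.2, 0.9],
}
text_vector = [0.2, 0.9, 0.5]  # stands in for the text "two dogs on the snow"

print(rank_images(text_vector, image_vectors))
```

Because both modalities live in the same space, the same function also works with an image vector as the query and text vectors as the candidates.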
### CLIP - "Alternative" Image Recognition
Most image models today learn to recognize images from labeled examples in curated datasets. CLIP instead learns from images and their descriptions collected from the Internet, so it understands an image through a free-text description rather than a single-word label such as "cat" or "dog."
To do this, CLIP learns to associate a large number of objects with their names and descriptions, which lets it recognize objects outside the training set.
As shown in the figure above, CLIP pre-trains an image encoder and a text encoder to predict which images are paired with which texts in the dataset.
CLIP is then used as a zero-shot classifier: every class in a dataset is converted into a caption such as "a photo of a dog", and the class whose caption best matches a given image is predicted.
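
The zero-shot step can be sketched as follows. This is an illustration under stated assumptions, not the model's actual code: the embeddings are hand-written 2-dimensional unit vectors, and the `scale` factor applied before the softmax is an assumed value chosen to make the probabilities sharp.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_vector, caption_vectors, scale=100.0):
    """Pick the caption whose embedding best matches the image embedding.

    Vectors are assumed to be L2-normalized, so the dot product equals
    the cosine similarity. `scale` is an assumed sharpening factor.
    """
    captions = list(caption_vectors)
    sims = [sum(x * y for x, y in zip(image_vector, caption_vectors[c]))
            for c in captions]
    probs = softmax([scale * s for s in sims])
    best = max(range(len(captions)), key=lambda i: probs[i])
    return captions[best], dict(zip(captions, probs))

# Hypothetical unit-length embeddings for two class captions and one image.
caption_vectors = {
    "a photo of a cat": [1.0, 0.0],
    "a photo of a dog": [0.0, 1.0],
}
image_vector = [0.6, 0.8]

label, probabilities = zero_shot_classify(image_vector, caption_vectors)
print(label)
```

No classifier is trained for the new classes; adding a class only requires encoding one more caption.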
CLIP model address:
#### Supported language list:
### 1. Front-end deployment
#### 1.1 Run directly:

    npm run dev

#### 1.2 Build the dist installation package:

    npm run build:prod

#### 1.3 Nginx deployment (macOS as an example):
Open the configuration file:

    cd /usr/local/etc/nginx/
    vi /usr/local/etc/nginx/nginx.conf

Edit nginx.conf:

    server {
        listen       8080;
        server_name  localhost;

        location / {
            root   /Users/calvin/Documents/image_text_search/dist/;
            index  index.html index.htm;
        }
    }

Reload the configuration:

    sudo nginx -s reload

After deploying the application, restart Nginx:

    cd /usr/local/Cellar/nginx/1.19.6/bin
    # Fast stop
    sudo nginx -s stop
    # Start
    sudo nginx

### 2. Back-end jar deployment
#### 2.1 Environment requirements:
- System JDK 1.8+
- application.yml

1). Edit the image upload root path rootPath as needed. The three path/rootPath pairs below are examples for a home-relative path, a Linux-style path, and a Windows-style path:

    # File storage path
    path: ~/file/
    # folder for unzip files
    rootPath: ~/file/data_root/

    path: /home/aias/file/
    rootPath: /home/aias/file/data_root/

    path: file:/D:/aias/file/
    rootPath: file:/D:/aias/file/data_root/

2). Edit the image baseurl as needed:

    # baseurl is the address prefix of the image
    baseurl: <>
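
How the application combines baseurl with a stored file path is not shown here. One plausible reading of "address prefix" is plain concatenation, sketched below; the function name, the example host, and the relative path are all hypothetical.

```python
def image_url(baseurl, relative_path):
    """Join a URL prefix and a relative path with exactly one slash between them."""
    return baseurl.rstrip("/") + "/" + relative_path.lstrip("/")

# Hypothetical values, for illustration only.
print(image_url("http://example.com/files/", "/data_root/car.jpg"))
```

Under this assumption, a trailing slash in baseurl makes no difference to the resulting address.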
#### 2.2 Running the program:

    java -jar image-text-search-0.1.0.jar

### 3. Backend vector engine deployment
#### 3.1 Environment requirements:
- Docker must be installed; on macOS you can use Docker Desktop.
#### 3.2 Pull the Milvus vector engine image (used to calculate feature vector similarity)
Download the milvus-standalone-docker-compose.yml configuration file and save it as docker-compose.yml.
- Standalone Installation Document
- Configure Milvus
##### Please refer to the official website for the latest version
- Milvus vector engine reference links
  - Milvus vector engine official website
  - Milvus vector engine GitHub

Example for v2.2.4:

    wget -O docker-compose.yml

#### 3.3 Start Docker container

    sudo docker-compose up -d

#### 3.4 Edit vector engine connection configuration information
- application.yml
- Set the vector engine connection IP address to the IP of the host where the container is running, as needed

      ################## Vector engine ################
      port: 19530
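
Before starting the back end, it can help to confirm that the vector engine port is reachable from the machine running the jar. The sketch below only tests TCP connectivity; it does not speak the Milvus protocol. The host 127.0.0.1 is an assumption (container on the same machine), and 19530 is the port from the configuration above.

```python
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

if __name__ == "__main__":
    # Assumed host; replace with the IP of the machine running the container.
    print(is_port_open("127.0.0.1", 19530))
```

A result of False usually means the container is not running, the IP is wrong, or a firewall blocks the port.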
### 4. Open the browser
- Enter the address: [http://localhost:8090](http://localhost:8090/)
#### 4.1 Image upload
1). Click the upload button to upload the file.
Test image data
2). Click the feature extraction button.
Wait while the image features are extracted and stored in the vector engine. Progress information is shown in the console.
#### 4.2 Cross-modal search: text-to-image search
Enter a text description and click the query button to see the returned images, sorted by similarity.
- Example 1, enter text: car
- Example 2, enter text: two dogs on the snow
#### 4.3 Cross-modal search: image-to-image search
### 5. Help information
- swagger interface document:
- Initialize the vector engine (clear data):
- Milvus vector engine reference links
  - Milvus vector engine official website
  - Milvus vector engine GitHub

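### Official website:
### Git repository:
#### Help documentation:
- 1. Performance optimization FAQ:
- 2. Engine configuration (including CPU and GPU; online automatic loading and local configuration):
- 3. Model loading methods (online automatic loading and local configuration):
- 4. Windows environment FAQ: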

# just a flag
ENV = 'development'
# base api


@ -1,6 +0,0 @@
# just a flag
ENV = 'production'
# base api


