modify readme en version

This commit is contained in:
s_jqzhang 2019-02-18 18:15:13 +08:00
parent 3e5841d3da
commit 84ac54ab6b


@ -1,14 +1,25 @@
# [中文](README.md) [English](README-en.md)
![logo](doc/logo.png)
# Distributed file server go-fastdfs (similar to fastdfs): easier to operate and maintain than fastdfs, and more user-friendly
# go-fastdfs is a distributed file system based on the HTTP protocol. Its design philosophy is simplicity: keeping every part simple makes it easy to operate and extend. It offers high performance, high reliability, a decentralized design, and is maintenance-free.
- Support upload with the curl command
- Support upload from the browser
- Support HTTP download
- Support automatic synchronization across machines
- Similar to fastdfs
- Support resumable (breakpoint) download
- Support automatic configuration generation
- Support automatic merging of small files (reduces inode usage)
- Support instant upload ("second pass")
- Support resumable upload ([tus](https://tus.io/))
- Support Docker deployment
- Support self-monitoring and alerts
- Support viewing cluster file information
- Use the universal HTTP protocol
- No dedicated client needed (wget, curl, etc. work)
- High performance (uses LevelDB as the key-value store)
- High reliability (extremely simple design built on mature components)
- Decentralized design (all nodes can read and write at the same time)
# Advantages
@ -24,7 +35,11 @@
- Support browser upload
- Support for viewing cluster file information
- Support cluster monitoring with email alerts
- Support token download: token=md5(file_md5+timestamp)
- Support automatic merging of small files (reduces inode usage)
- Support instant upload ("second pass")
- Support resumable upload ([tus](https://tus.io/))
- Support Docker deployment
- Easy operation and maintenance: there is only one role (unlike fastdfs, which has three: Tracker Server, Storage Server, and Client), and the configuration is generated automatically
- Peer-to-peer (simplified operation and maintenance)
- All nodes can read and write simultaneously
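
The token rule in the list above, `token=md5(file_md5+timestamp)`, can be sketched as follows. This is a minimal sketch, not the server's implementation: it assumes the two values are concatenated as strings, that the timestamp is a Unix time in seconds, and that the token is the lowercase hex digest. The function name `download_token` is made up for illustration.

```python
import hashlib
import time


def download_token(file_md5: str, timestamp: int) -> str:
    """Return md5(file_md5 + timestamp) as a lowercase hex string.

    Assumption: both values are concatenated as plain strings.
    """
    raw = f"{file_md5}{timestamp}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Placeholder MD5 of a stored file (hypothetical value).
    file_md5 = "d41d8cd98f00b204e9800998ecf8427e"
    now = int(time.time())
    print(download_token(file_md5, now), now)
```

The client would send the token together with the same timestamp, so that the server can recompute and compare it; how the two values are passed is not specified here.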
@ -37,34 +52,6 @@
```
# Automatically generated configuration (conf/cfg.json)
```json
{
  "addr": ":8080",
  "peers": ["http://10.1.xx.2:8080", "http://10.1.xx.5:8080", "http://10.1.xx.60:8080"],
  "group": "group1",
  "refresh_interval": 120,
  "rename_file": false,
  "enable_web_upload": true,
  "enable_custom_path": true,
  "download_domain": "",
  "scenes": [],
  "default_scene": "default",
  "show_dir": true,
  "mail": {
    "user": "abc@163.com",
    "password": "abc",
    "host": "smtp.163.com:25"
  },
  "alram_receivers": [],
  "alarm_url": "",
  "download_use_token": false,
  "download_token_expire": 600,
  "auto_repair": true
}
```
# Command-line upload
`curl -F file=@http-index-fs http://10.1.xx.60:8080/upload`
@ -75,7 +62,7 @@
`http://127.0.0.1:8080`
# Upload from code (for options, see browser upload)
## python
```python
import requests
url = 'http://127.0.0.1:8080/upload'
@ -84,90 +71,201 @@ options={'output':'json','path':'','scene':''} #See browser upload options
r = requests.post(url, files=files)
print(r.text)
```
## golang
```go
package main

import (
	"fmt"

	"github.com/astaxie/beego/httplib"
)

func main() {
	var obj interface{}
	req := httplib.Post("http://10.1.5.9:8080/upload")
	req.PostFile("file", "path/to/file") // form field name, local file path
	req.Param("output", "json")
	req.Param("scene", "")
	req.Param("path", "")
	// Send the request and decode the JSON response.
	if err := req.ToJSON(&obj); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Print(obj)
}
```
## java
```xml
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>4.4.3</version>
</dependency>
```
```java
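import java.io.File;
import java.util.HashMap;

import cn.hutool.http.HttpUtil;

// The class name is illustrative only.
public class UploadExample {
    public static void main(String[] args) {
        File file = new File("D:\\git\\2.jpg");
        HashMap<String, Object> paramMap = new HashMap<>();
        paramMap.put("file", file); // a File value is sent as a multipart upload
        paramMap.put("output", "json");
        paramMap.put("path", "image");
        paramMap.put("scene", "image");
        String result = HttpUtil.post("http://xxxxx:xxxx/upload", paramMap);
        System.out.println(result);
    }
}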
```
# Resumable upload example
## golang
```go
package main

import (
	"fmt"
	"os"

	"github.com/eventials/go-tus"
)

func main() {
	f, err := os.Open("100m")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	// Create the tus client.
	client, err := tus.NewClient("http://10.1.5.9:8080/big/upload/", nil)
	if err != nil {
		panic(err)
	}
	// Create an upload from a file.
	upload, err := tus.NewUploadFromFile(f)
	if err != nil {
		panic(err)
	}
	// Create the uploader.
	uploader, err := client.CreateUpload(upload)
	if err != nil {
		panic(err)
	}
	// Start the upload; a nil error means it completed.
	fmt.Println(uploader.Upload())
}
```
[More languages](doc/upload.md)
![deploy](doc/go-fastdfs-deploy.png)
# Please click [Feedback](https://github.com/sjqzhang/go-fastdfs/issues/new)
# Q&A
- In the WeChat discussion group, many people have asked about the performance of go-fastdfs.
```
Because so many people ask, here is a single answer.
go-fastdfs locates files differently from other distributed systems: it addresses a file directly by its path, without going through any intermediate component, so lookup takes approximately O(1) time.
There is basically no performance loss. The project also includes a stress test script, so you can run the stress test yourself. Please do not discuss this question at length in the group; answering the same question every time
makes the group tedious for everyone.
```
- Can files already stored in fastdfs be migrated to go-fastdfs?
```
Yes. The likely concern is that file paths would change; go-fastdfs accounts for this.
curl -F file=@data/00/00/_78HAFwyvN2AK6ChAAHg8gw80FQ213.jpg -F path=M00/00/00/ http://127.0.0.1:8080/upload
Similarly, all files can be migrated with a single command:
cd fastdfs/data && find -type f |xargs -n 1 -I {} curl -F file=@data/{} -F path=M00/00/00/ http://127.0.0.1:8080/
The command above is only a rough sketch;
you can write a simple script to do the actual migration.
```
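
A migration script along the lines suggested above could start from the sketch below. It only prints one curl command per file and uploads nothing. It assumes that a file's directory under the fastdfs data directory maps to `path=M00/<that directory>/`, as in the single-file example. The function name `plan_migration` and the `store_prefix` parameter are made up for illustration.

```python
import os
import shlex


def plan_migration(data_dir, upload_url, store_prefix="M00"):
    """Return one curl command per file found under a fastdfs data directory.

    Assumption: data/00/00/<name> should be uploaded with path=M00/00/00/.
    """
    commands = []
    for root, dirs, files in os.walk(data_dir):
        dirs.sort()  # visit subdirectories in a stable order
        rel_dir = os.path.relpath(root, data_dir).replace(os.sep, "/")
        if rel_dir == ".":
            target = f"{store_prefix}/"
        else:
            target = f"{store_prefix}/{rel_dir}/"
        for name in sorted(files):
            local = os.path.join(root, name)
            commands.append(
                "curl -F file=@{} -F path={} {}".format(
                    shlex.quote(local), shlex.quote(target), upload_url
                )
            )
    return commands


if __name__ == "__main__":
    for cmd in plan_migration("fastdfs/data", "http://127.0.0.1:8080/upload"):
        print(cmd)
```

Printing the commands first makes it possible to review the path mapping on a few files before running anything against the server.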
- What is a cluster, and how can multiple clusters be managed with Nginx?
```
1. In go-fastdfs, a cluster is a group.
2. Please refer to the deployment diagram.
Note: when the support_group_manage parameter in the configuration is set to true, the group name is automatically added to all URLs.
For example: http://10.1.5.9:8080/group/status
Default: http://10.1.5.9:8080/status
The difference is the extra group segment, which corresponds to the group parameter in the configuration. It exists mainly so that one Nginx reverse proxy can serve multiple groups (clusters).
Please refer to the deployment diagram for details.
```
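
The two URL forms above differ only in the group segment. A small helper, shown here as a sketch with the made-up name `endpoint_url`, builds either form. It assumes the group name is inserted as the first path segment, as in the example.

```python
def endpoint_url(base_url, action, group=None):
    """Build a go-fastdfs URL, optionally prefixed with a group name.

    Assumption: with support_group_manage enabled, the group name is the
    first path segment, e.g. http://host:8080/<group>/status.
    """
    parts = [base_url.rstrip("/")]
    if group:
        parts.append(group.strip("/"))
    parts.append(action.lstrip("/"))
    return "/".join(parts)


if __name__ == "__main__":
    print(endpoint_url("http://10.1.5.9:8080", "status"))
    print(endpoint_url("http://10.1.5.9:8080", "status", group="group"))
```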
- How do I build a cluster?
```
1. Download the compiled executable (use the latest version).
2. Run the executable (this generates the configuration).
3. Modify the configuration:
   peers: add the HTTP addresses of the other nodes
   Check:
   host: was it generated correctly?
   peer_id: is it unique within the cluster?
4. Restart the server.
5. Verify that the service is OK.
```
- Is it suitable for mass storage?
```
Answer: Yes, it is suitable for mass storage.
Special note:
LevelDB is used as the metadata store, but go-fastdfs does not depend strongly on LevelDB.
It has been stress tested with more than 100 million files (you can run the stress test with the script provided by the project; if you find a problem, please report it promptly).
Metadata for 100 million files takes about 5 GB; the exported metadata text takes about 22 GB.
```
- Do I still need to install nginx?
```
It is not required, but you can install it if you want.
go-fastdfs itself is a high-performance web file server.
```
- Can the configuration be reloaded dynamically?
```
Answer: Yes, but you need to update to the latest version.
Steps:
1) Modify the conf/cfg.json file
2) Visit http://10.1.xx.60:8080/reload
3) Note: every node needs the same operation
```
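
Because every node needs the same reload, the step can be scripted. The sketch below assumes that `/reload` answers a plain HTTP GET (the text only says to "visit" it) and that the peer list is supplied by hand. The names `reload_urls` and `reload_all` are made up for illustration.

```python
from urllib.request import urlopen


def reload_urls(peers):
    """Return the reload URL for each peer base address."""
    return [f"{peer.rstrip('/')}/reload" for peer in peers]


def reload_all(peers, timeout=5):
    """Request the reload URL on every peer.

    Returns a dict mapping each URL to its HTTP status code, or to an
    error message if the request failed.
    """
    results = {}
    for url in reload_urls(peers):
        try:
            with urlopen(url, timeout=timeout) as resp:
                results[url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[url] = str(exc)
    return results


if __name__ == "__main__":
    # Placeholder addresses; replace them with the peers of your cluster.
    print(reload_all(["http://10.0.0.3:8080", "http://10.0.0.4:8080"]))
```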
- Why is memory usage high?
```
Under normal circumstances memory usage should stay below 2 GB, unless more than one million files are uploaded per day.
Abnormally high memory usage is mainly caused by files in the cluster being out of sync while automatic repair is enabled.
Solution: delete that day's errors.md5 file in the data directory, turn off automatic repair, and restart the service.
See the system status description.
```
- How do I view cluster file information?
```
http://10.1.xx.60:8080/stat
What should I do if the file statistics are wrong?
Delete the stat.json file in the data directory and restart the service; the number of files is then recalculated automatically.
Or call
http://10.1.xx.60:8080/repair_stat
```
- How reliable is it? Can it be used in a production environment?
```
This project is already used at scale in production. If you are worried that it will not meet your needs,
you can stress test its features before use.
Any problems can be reported directly as an issue.
```
- Can I run multiple servers on one machine?
```
No. High availability of the cluster was considered from the beginning of the design. To ensure the cluster is truly available, every peer must have a different IP, and 127.0.0.1 cannot be used.
Error "peers": ["http://127.0.0.1:8080","http://127.0.0.1:8081","http://127.0.0.1:8082"]
Correct "peers": ["http://10.0.0.3:8080","http://10.0.0.4:8080","http://10.0.0.5:8082"]
```
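
The rule above (a different IP for every peer, and no 127.0.0.1) can be checked before starting the cluster. The sketch below is not part of go-fastdfs. The function name `check_peers` is made up, and hosts given as DNS names are only checked for duplicates.

```python
import ipaddress
from urllib.parse import urlparse


def check_peers(peers):
    """Return a list of problems found in a "peers" list (empty if none)."""
    problems = []
    seen = set()
    for peer in peers:
        host = urlparse(peer).hostname
        if host is None:
            problems.append(f"{peer}: no host found")
            continue
        try:
            if ipaddress.ip_address(host).is_loopback:
                problems.append(f"{peer}: loopback address")
        except ValueError:
            pass  # Not an IP literal (e.g. a DNS name); skip the loopback check.
        if host in seen:
            problems.append(f"{peer}: host {host} is used by more than one peer")
        seen.add(host)
    return problems


if __name__ == "__main__":
    print(check_peers(["http://127.0.0.1:8080", "http://127.0.0.1:8081"]))
    print(check_peers(["http://10.0.0.3:8080", "http://10.0.0.4:8080"]))
```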
- What should I do if files are out of sync?
```
Under normal circumstances, the cluster automatically synchronizes and repairs files every hour. (This performs poorly; with a very large number of files it is recommended to turn off automatic repair.)
What about abnormal situations?
Answer: synchronize manually
http://172.16.70.123:7080/sync?date=20190117&force=1
Parameters: date is the day whose data should be synchronized. force=1 forces synchronization of all files for that day (poor performance); force=0 synchronizes only the files that failed.
Situations that lead to unsynchronized files:
1) N nodes were running and one more joins, making N+1
2) N nodes were running and one fails, leaving N-1
```
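
The manual synchronization URL above can be built from a date and the force flag. The sketch below only constructs the URL. The function name `sync_url` is made up, and the date is assumed to use the `YYYYMMDD` form shown in the example.

```python
from datetime import date
from urllib.parse import urlencode


def sync_url(base_url, day, force=False):
    """Build the manual sync URL for one day.

    force=True asks for all files of that day (force=1);
    force=False asks only for files that failed (force=0).
    """
    query = urlencode({"date": day.strftime("%Y%m%d"), "force": 1 if force else 0})
    return f"{base_url.rstrip('/')}/sync?{query}"


if __name__ == "__main__":
    print(sync_url("http://172.16.70.123:7080", date(2019, 1, 17), force=True))
```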
- Do out-of-sync files affect access?
```
Answer: No. When a requested file is not available on a node, it is repaired automatically.
```
- How do I stress test?
```
First use gen_file.py to generate a large number of files (to generate large files, multiply the content by a larger number).
For example:
# -*- coding: utf-8 -*-
import os
j=0
for i in range(0,1000000):
    if i%1000==0:
        j=i
        os.system('mkdir %s'%(i))
    with open('%s/%s.txt'%(j,i),'w+') as f:
        f.write(str(i)*1024)
Then use benchmark.py to run the stress test.
The stress test can also be run from several machines at once, since all nodes can be read and written simultaneously.
```
- If you have any questions, please click [Reply](https://github.com/sjqzhang/go-fastdfs/issues/new)