diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md new file mode 100644 index 0000000000..6235a6ce84 --- /dev/null +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -0,0 +1,32 @@ +## *Tips* +- *Thanks very much for contributing to Apache DolphinScheduler.* +- *Please review https://dolphinscheduler.apache.org/en-us/community/index.html before opening a pull request.* + +## What is the purpose of the pull request + +*(For example: This pull request adds checkstyle plugin.)* + +## Brief change log + +*(for example:)* + - *Add maven-checkstyle-plugin to root pom.xml* + +## Verify this pull request + +*(Please pick either of the following options)* + +This pull request is code cleanup without any test coverage. + +*(or)* + +This pull request is already covered by existing tests, such as *(please describe tests)*. + +(or) + +This change added tests and can be verified as follows: + +*(example:)* + + - *Added dolphinscheduler-dao tests for end-to-end.* + - *Added CronUtilsTest to verify the change.* + - *Manually verified the change by testing locally.* diff --git a/.github/workflows/ci_backend.yml b/.github/workflows/ci_backend.yml new file mode 100644 index 0000000000..e527c3c4a2 --- /dev/null +++ b/.github/workflows/ci_backend.yml @@ -0,0 +1,64 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +# + +name: Backend + +on: + push: + paths: + - '.github/workflows/ci_backend.yml' + - 'package.xml' + - 'pom.xml' + - 'dolphinscheduler-alert/**' + - 'dolphinscheduler-api/**' + - 'dolphinscheduler-common/**' + - 'dolphinscheduler-dao/**' + - 'dolphinscheduler-rpc/**' + - 'dolphinscheduler-server/**' + pull_request: + paths: + - '.github/workflows/ci_backend.yml' + - 'package.xml' + - 'pom.xml' + - 'dolphinscheduler-alert/**' + - 'dolphinscheduler-api/**' + - 'dolphinscheduler-common/**' + - 'dolphinscheduler-dao/**' + - 'dolphinscheduler-rpc/**' + - 'dolphinscheduler-server/**' + +jobs: + Compile-check: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v1 + - name: Set up JDK 1.8 + uses: actions/setup-java@v1 + with: + java-version: 1.8 + - name: Compile + run: mvn -U -B -T 1C clean install -Prelease -Dmaven.compile.fork=true -Dmaven.test.skip=true + License-check: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v1 + - name: Set up JDK 1.8 + uses: actions/setup-java@v1 + with: + java-version: 1.8 + - name: Check + run: mvn -B apache-rat:check diff --git a/.github/workflows/ci_frontend.yml b/.github/workflows/ci_frontend.yml new file mode 100644 index 0000000000..fab75c6341 --- /dev/null +++ b/.github/workflows/ci_frontend.yml @@ -0,0 +1,58 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +name: Frontend + +on: + push: + paths: + - '.github/workflows/ci_frontend.yml' + - 'dolphinscheduler-ui/**' + pull_request: + paths: + - '.github/workflows/ci_frontend.yml' + - 'dolphinscheduler-ui/**' + +jobs: + Compile-check: + runs-on: ${{ matrix.os }} + strategy: + matrix: + os: [ubuntu-latest, macos-latest] + steps: + - uses: actions/checkout@v1 + - name: Set up Node.js + uses: actions/setup-node@v1 + with: + version: 8 + - name: Compile + run: | + cd dolphinscheduler-ui + npm install node-sass --unsafe-perm + npm install + npm run build + + License-check: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v1 + - name: Set up JDK 1.8 + uses: actions/setup-java@v1 + with: + java-version: 1.8 + - name: Check + run: mvn -B apache-rat:check diff --git a/.gitignore b/.gitignore index 40078a2fa2..2ef5f5d1e6 100644 --- a/.gitignore +++ b/.gitignore @@ -35,112 +35,112 @@ config.gypi test/coverage /docs/zh_CN/介绍 /docs/zh_CN/贡献代码.md -/escheduler-common/src/main/resources/zookeeper.properties -escheduler-alert/logs/ -escheduler-alert/src/main/resources/alert.properties_bak -escheduler-alert/src/main/resources/logback.xml -escheduler-server/src/main/resources/logback.xml -escheduler-ui/dist/css/common.16ac5d9.css -escheduler-ui/dist/css/home/index.b444b91.css -escheduler-ui/dist/css/login/index.5866c64.css -escheduler-ui/dist/js/0.ac94e5d.js -escheduler-ui/dist/js/0.ac94e5d.js.map -escheduler-ui/dist/js/1.0b043a3.js -escheduler-ui/dist/js/1.0b043a3.js.map -escheduler-ui/dist/js/10.1bce3dc.js -escheduler-ui/dist/js/10.1bce3dc.js.map 
-escheduler-ui/dist/js/11.79f04d8.js -escheduler-ui/dist/js/11.79f04d8.js.map -escheduler-ui/dist/js/12.420daa5.js -escheduler-ui/dist/js/12.420daa5.js.map -escheduler-ui/dist/js/13.e5bae1c.js -escheduler-ui/dist/js/13.e5bae1c.js.map -escheduler-ui/dist/js/14.f2a0dca.js -escheduler-ui/dist/js/14.f2a0dca.js.map -escheduler-ui/dist/js/15.45373e8.js -escheduler-ui/dist/js/15.45373e8.js.map -escheduler-ui/dist/js/16.fecb0fc.js -escheduler-ui/dist/js/16.fecb0fc.js.map -escheduler-ui/dist/js/17.84be279.js -escheduler-ui/dist/js/17.84be279.js.map -escheduler-ui/dist/js/18.307ea70.js -escheduler-ui/dist/js/18.307ea70.js.map -escheduler-ui/dist/js/19.144db9c.js -escheduler-ui/dist/js/19.144db9c.js.map -escheduler-ui/dist/js/2.8b4ef29.js -escheduler-ui/dist/js/2.8b4ef29.js.map -escheduler-ui/dist/js/20.4c527e9.js -escheduler-ui/dist/js/20.4c527e9.js.map -escheduler-ui/dist/js/21.831b2a2.js -escheduler-ui/dist/js/21.831b2a2.js.map -escheduler-ui/dist/js/22.2b4bb2a.js -escheduler-ui/dist/js/22.2b4bb2a.js.map -escheduler-ui/dist/js/23.81467ef.js -escheduler-ui/dist/js/23.81467ef.js.map -escheduler-ui/dist/js/24.54a00e4.js -escheduler-ui/dist/js/24.54a00e4.js.map -escheduler-ui/dist/js/25.8d7bd36.js -escheduler-ui/dist/js/25.8d7bd36.js.map -escheduler-ui/dist/js/26.2ec5e78.js -escheduler-ui/dist/js/26.2ec5e78.js.map -escheduler-ui/dist/js/27.3ab48c2.js -escheduler-ui/dist/js/27.3ab48c2.js.map -escheduler-ui/dist/js/28.363088a.js -escheduler-ui/dist/js/28.363088a.js.map -escheduler-ui/dist/js/29.6c5853a.js -escheduler-ui/dist/js/29.6c5853a.js.map -escheduler-ui/dist/js/3.a0edb5b.js -escheduler-ui/dist/js/3.a0edb5b.js.map -escheduler-ui/dist/js/30.940fdd3.js -escheduler-ui/dist/js/30.940fdd3.js.map -escheduler-ui/dist/js/31.168a460.js -escheduler-ui/dist/js/31.168a460.js.map -escheduler-ui/dist/js/32.8df6594.js -escheduler-ui/dist/js/32.8df6594.js.map -escheduler-ui/dist/js/33.4480bbe.js -escheduler-ui/dist/js/33.4480bbe.js.map -escheduler-ui/dist/js/34.b407fe1.js 
-escheduler-ui/dist/js/34.b407fe1.js.map -escheduler-ui/dist/js/35.f340b0a.js -escheduler-ui/dist/js/35.f340b0a.js.map -escheduler-ui/dist/js/36.8880c2d.js -escheduler-ui/dist/js/36.8880c2d.js.map -escheduler-ui/dist/js/37.ea2a25d.js -escheduler-ui/dist/js/37.ea2a25d.js.map -escheduler-ui/dist/js/38.98a59ee.js -escheduler-ui/dist/js/38.98a59ee.js.map -escheduler-ui/dist/js/39.a5e958a.js -escheduler-ui/dist/js/39.a5e958a.js.map -escheduler-ui/dist/js/4.4ca44db.js -escheduler-ui/dist/js/4.4ca44db.js.map -escheduler-ui/dist/js/40.e187b1e.js -escheduler-ui/dist/js/40.e187b1e.js.map -escheduler-ui/dist/js/41.0e89182.js -escheduler-ui/dist/js/41.0e89182.js.map -escheduler-ui/dist/js/42.341047c.js -escheduler-ui/dist/js/42.341047c.js.map -escheduler-ui/dist/js/43.27b8228.js -escheduler-ui/dist/js/43.27b8228.js.map -escheduler-ui/dist/js/44.e8869bc.js -escheduler-ui/dist/js/44.e8869bc.js.map -escheduler-ui/dist/js/45.8d54901.js -escheduler-ui/dist/js/45.8d54901.js.map -escheduler-ui/dist/js/5.e1ed7f3.js -escheduler-ui/dist/js/5.e1ed7f3.js.map -escheduler-ui/dist/js/6.241ba07.js -escheduler-ui/dist/js/6.241ba07.js.map -escheduler-ui/dist/js/7.ab2e297.js -escheduler-ui/dist/js/7.ab2e297.js.map -escheduler-ui/dist/js/8.83ff814.js -escheduler-ui/dist/js/8.83ff814.js.map -escheduler-ui/dist/js/9.39cb29f.js -escheduler-ui/dist/js/9.39cb29f.js.map -escheduler-ui/dist/js/common.733e342.js -escheduler-ui/dist/js/common.733e342.js.map -escheduler-ui/dist/js/home/index.78a5d12.js -escheduler-ui/dist/js/home/index.78a5d12.js.map -escheduler-ui/dist/js/login/index.291b8e3.js -escheduler-ui/dist/js/login/index.291b8e3.js.map -escheduler-ui/dist/lib/external/ -escheduler-ui/src/js/conf/home/pages/projects/pages/taskInstance/index.vue -/escheduler-dao/src/main/resources/dao/data_source.properties +/dolphinscheduler-common/src/main/resources/zookeeper.properties +dolphinscheduler-alert/logs/ +dolphinscheduler-alert/src/main/resources/alert.properties_bak 
+dolphinscheduler-alert/src/main/resources/logback.xml +dolphinscheduler-server/src/main/resources/logback.xml +dolphinscheduler-ui/dist/css/common.16ac5d9.css +dolphinscheduler-ui/dist/css/home/index.b444b91.css +dolphinscheduler-ui/dist/css/login/index.5866c64.css +dolphinscheduler-ui/dist/js/0.ac94e5d.js +dolphinscheduler-ui/dist/js/0.ac94e5d.js.map +dolphinscheduler-ui/dist/js/1.0b043a3.js +dolphinscheduler-ui/dist/js/1.0b043a3.js.map +dolphinscheduler-ui/dist/js/10.1bce3dc.js +dolphinscheduler-ui/dist/js/10.1bce3dc.js.map +dolphinscheduler-ui/dist/js/11.79f04d8.js +dolphinscheduler-ui/dist/js/11.79f04d8.js.map +dolphinscheduler-ui/dist/js/12.420daa5.js +dolphinscheduler-ui/dist/js/12.420daa5.js.map +dolphinscheduler-ui/dist/js/13.e5bae1c.js +dolphinscheduler-ui/dist/js/13.e5bae1c.js.map +dolphinscheduler-ui/dist/js/14.f2a0dca.js +dolphinscheduler-ui/dist/js/14.f2a0dca.js.map +dolphinscheduler-ui/dist/js/15.45373e8.js +dolphinscheduler-ui/dist/js/15.45373e8.js.map +dolphinscheduler-ui/dist/js/16.fecb0fc.js +dolphinscheduler-ui/dist/js/16.fecb0fc.js.map +dolphinscheduler-ui/dist/js/17.84be279.js +dolphinscheduler-ui/dist/js/17.84be279.js.map +dolphinscheduler-ui/dist/js/18.307ea70.js +dolphinscheduler-ui/dist/js/18.307ea70.js.map +dolphinscheduler-ui/dist/js/19.144db9c.js +dolphinscheduler-ui/dist/js/19.144db9c.js.map +dolphinscheduler-ui/dist/js/2.8b4ef29.js +dolphinscheduler-ui/dist/js/2.8b4ef29.js.map +dolphinscheduler-ui/dist/js/20.4c527e9.js +dolphinscheduler-ui/dist/js/20.4c527e9.js.map +dolphinscheduler-ui/dist/js/21.831b2a2.js +dolphinscheduler-ui/dist/js/21.831b2a2.js.map +dolphinscheduler-ui/dist/js/22.2b4bb2a.js +dolphinscheduler-ui/dist/js/22.2b4bb2a.js.map +dolphinscheduler-ui/dist/js/23.81467ef.js +dolphinscheduler-ui/dist/js/23.81467ef.js.map +dolphinscheduler-ui/dist/js/24.54a00e4.js +dolphinscheduler-ui/dist/js/24.54a00e4.js.map +dolphinscheduler-ui/dist/js/25.8d7bd36.js +dolphinscheduler-ui/dist/js/25.8d7bd36.js.map 
+dolphinscheduler-ui/dist/js/26.2ec5e78.js +dolphinscheduler-ui/dist/js/26.2ec5e78.js.map +dolphinscheduler-ui/dist/js/27.3ab48c2.js +dolphinscheduler-ui/dist/js/27.3ab48c2.js.map +dolphinscheduler-ui/dist/js/28.363088a.js +dolphinscheduler-ui/dist/js/28.363088a.js.map +dolphinscheduler-ui/dist/js/29.6c5853a.js +dolphinscheduler-ui/dist/js/29.6c5853a.js.map +dolphinscheduler-ui/dist/js/3.a0edb5b.js +dolphinscheduler-ui/dist/js/3.a0edb5b.js.map +dolphinscheduler-ui/dist/js/30.940fdd3.js +dolphinscheduler-ui/dist/js/30.940fdd3.js.map +dolphinscheduler-ui/dist/js/31.168a460.js +dolphinscheduler-ui/dist/js/31.168a460.js.map +dolphinscheduler-ui/dist/js/32.8df6594.js +dolphinscheduler-ui/dist/js/32.8df6594.js.map +dolphinscheduler-ui/dist/js/33.4480bbe.js +dolphinscheduler-ui/dist/js/33.4480bbe.js.map +dolphinscheduler-ui/dist/js/34.b407fe1.js +dolphinscheduler-ui/dist/js/34.b407fe1.js.map +dolphinscheduler-ui/dist/js/35.f340b0a.js +dolphinscheduler-ui/dist/js/35.f340b0a.js.map +dolphinscheduler-ui/dist/js/36.8880c2d.js +dolphinscheduler-ui/dist/js/36.8880c2d.js.map +dolphinscheduler-ui/dist/js/37.ea2a25d.js +dolphinscheduler-ui/dist/js/37.ea2a25d.js.map +dolphinscheduler-ui/dist/js/38.98a59ee.js +dolphinscheduler-ui/dist/js/38.98a59ee.js.map +dolphinscheduler-ui/dist/js/39.a5e958a.js +dolphinscheduler-ui/dist/js/39.a5e958a.js.map +dolphinscheduler-ui/dist/js/4.4ca44db.js +dolphinscheduler-ui/dist/js/4.4ca44db.js.map +dolphinscheduler-ui/dist/js/40.e187b1e.js +dolphinscheduler-ui/dist/js/40.e187b1e.js.map +dolphinscheduler-ui/dist/js/41.0e89182.js +dolphinscheduler-ui/dist/js/41.0e89182.js.map +dolphinscheduler-ui/dist/js/42.341047c.js +dolphinscheduler-ui/dist/js/42.341047c.js.map +dolphinscheduler-ui/dist/js/43.27b8228.js +dolphinscheduler-ui/dist/js/43.27b8228.js.map +dolphinscheduler-ui/dist/js/44.e8869bc.js +dolphinscheduler-ui/dist/js/44.e8869bc.js.map +dolphinscheduler-ui/dist/js/45.8d54901.js +dolphinscheduler-ui/dist/js/45.8d54901.js.map 
+dolphinscheduler-ui/dist/js/5.e1ed7f3.js +dolphinscheduler-ui/dist/js/5.e1ed7f3.js.map +dolphinscheduler-ui/dist/js/6.241ba07.js +dolphinscheduler-ui/dist/js/6.241ba07.js.map +dolphinscheduler-ui/dist/js/7.ab2e297.js +dolphinscheduler-ui/dist/js/7.ab2e297.js.map +dolphinscheduler-ui/dist/js/8.83ff814.js +dolphinscheduler-ui/dist/js/8.83ff814.js.map +dolphinscheduler-ui/dist/js/9.39cb29f.js +dolphinscheduler-ui/dist/js/9.39cb29f.js.map +dolphinscheduler-ui/dist/js/common.733e342.js +dolphinscheduler-ui/dist/js/common.733e342.js.map +dolphinscheduler-ui/dist/js/home/index.78a5d12.js +dolphinscheduler-ui/dist/js/home/index.78a5d12.js.map +dolphinscheduler-ui/dist/js/login/index.291b8e3.js +dolphinscheduler-ui/dist/js/login/index.291b8e3.js.map +dolphinscheduler-ui/dist/lib/external/ +dolphinscheduler-ui/src/js/conf/home/pages/projects/pages/taskInstance/index.vue +/dolphinscheduler-dao/src/main/resources/dao/data_source.properties diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index be32e77143..8ed9aac897 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,4 +1,4 @@ -* First from the remote repository *https://github.com/analysys/EasyScheduler.git* fork code to your own repository +* First from the remote repository *https://github.com/apache/incubator-dolphinscheduler.git* fork code to your own repository * there are three branches in the remote repository currently: * master normal delivery branch @@ -7,17 +7,14 @@ * dev daily development branch The daily development branch, the newly submitted code can pull requests to this branch. - * branch-1.0.0 release version branch - Release version branch, there will be 2.0 ... and other version branches, the version - branch only changes the error, does not add new features. 
* Clone your own warehouse to your local - `git clone https://github.com/analysys/EasyScheduler.git` + `git clone https://github.com/apache/incubator-dolphinscheduler.git` * Add remote repository address, named upstream - `git remote add upstream https://github.com/analysys/EasyScheduler.git` + `git remote add upstream https://github.com/apache/incubator-dolphinscheduler.git` * View repository: @@ -63,71 +60,6 @@ git push --set-upstream origin dev1.0 * Next, the administrator is responsible for **merging** to complete the pull request ---- - -* 首先从远端仓库*https://github.com/analysys/EasyScheduler.git* fork一份代码到自己的仓库中 - -* 远端仓库中目前有三个分支: - * master 正常交付分支 - 发布稳定版本以后,将稳定版本分支的代码合并到master上。 - - * dev 日常开发分支 - 日常dev开发分支,新提交的代码都可以pull request到这个分支上。 - - * branch-1.0.0 发布版本分支 - 发布版本分支,后续会有2.0...等版本分支,版本分支只修改bug,不增加新功能。 - -* 把自己仓库clone到本地 - - `git clone https://github.com/analysys/EasyScheduler.git` - -* 添加远端仓库地址,命名为upstream - - ` git remote add upstream https://github.com/analysys/EasyScheduler.git ` - -* 查看仓库: - - ` git remote -v` - -> 此时会有两个仓库:origin(自己的仓库)和upstream(远端仓库) - -* 获取/更新远端仓库代码(已经是最新代码,就跳过) - - `git fetch upstream ` - - -* 同步远端仓库代码到本地仓库 - -``` - git checkout origin/dev - git merge --no-ff upstream/dev -``` - -如果远端分支有新加的分支`dev-1.0`,需要同步这个分支到本地仓库 - -``` -git checkout -b dev-1.0 upstream/dev-1.0 -git push --set-upstream origin dev1.0 -``` - -* 在本地修改代码以后,提交到自己仓库: - - `git commit -m 'test commit'` - `git push` - -* 将修改提交到远端仓库 - - * 在github页面,点击New pull request. -

- - * Select the modified local branch and the branch to merge into, then click Create pull request. -

- -* 接下来由管理员负责将**Merge**完成此次pull request diff --git a/DISCLAIMER b/DISCLAIMER new file mode 100644 index 0000000000..1c269cd696 --- /dev/null +++ b/DISCLAIMER @@ -0,0 +1,5 @@ +Apache DolphinScheduler (incubating) is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. +Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, +communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. +While incubation status is not necessarily a reflection of the completeness or stability of the code, +it does indicate that the project has yet to be fully endorsed by the ASF. diff --git a/NOTICE b/NOTICE index 26802e12b6..72b5f0632c 100644 --- a/NOTICE +++ b/NOTICE @@ -1,7 +1,5 @@ -Easy Scheduler -Copyright 2019 The Analysys Foundation +Apache DolphinScheduler (incubating) +Copyright 2019 The Apache Software Foundation This product includes software developed at -The Analysys Foundation (https://www.analysys.cn/). - - +The Apache Software Foundation (http://www.apache.org/). diff --git a/README.md b/README.md index 6352bd5f10..b4a7e5c7cd 100644 --- a/README.md +++ b/README.md @@ -11,7 +11,7 @@ Dolphin Scheduler [![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md) -### Design features: +### Design features: A distributed and easy-to-expand visual DAG workflow scheduling system. Dedicated to solving the complex dependencies in data processing, making the scheduling system `out of the box` for data processing. Its main objectives are as follows: @@ -36,8 +36,8 @@ Its main objectives are as follows: Stability | Easy to use | Features | Scalability | -- | -- | -- | -- -Decentralized multi-master and multi-worker | Visualization process defines key information such as task status, task type, retry times, task running machine, visual variables and so on at a glance.  
|  Support pause, recover operation | support custom task types -HA is supported by itself | All process definition operations are visualized, dragging tasks to draw DAGs, configuring data sources and resources. At the same time, for third-party systems, the api mode operation is provided. | Users on DolphinScheduler can achieve many-to-one or one-to-one mapping relationship through tenants and Hadoop users, which is very important for scheduling large data jobs. " | The scheduler uses distributed scheduling, and the overall scheduling capability will increase linearly with the scale of the cluster. Master and Worker support dynamic online and offline. +Decentralized multi-master and multi-worker | Visualization process defines key information such as task status, task type, retry times, task running machine, visual variables and so on at a glance.  |  Support pause, recover operation | support custom task types +HA is supported by itself | All process definition operations are visualized, dragging tasks to draw DAGs, configuring data sources and resources. At the same time, for third-party systems, the api mode operation is provided. | Users on DolphinScheduler can achieve many-to-one or one-to-one mapping relationship through tenants and Hadoop users, which is very important for scheduling large data jobs. " | The scheduler uses distributed scheduling, and the overall scheduling capability will increase linearly with the scale of the cluster. Master and Worker support dynamic online and offline. Overload processing: Task queue mechanism, the number of schedulable tasks on a single machine can be flexibly configured, when too many tasks will be cached in the task queue, will not cause machine jam. 
| One-click deployment | Supports traditional shell tasks, and also support big data platform task scheduling: MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Procedure, Sub_Process | | @@ -58,11 +58,11 @@ Overload processing: Task queue mechanism, the number of schedulable tasks on a - Front-end deployment documentation -- [**User manual**](https://dolphinscheduler.apache.org/en-us/docs/user_doc/system-manual.html?_blank "System manual") +- [**User manual**](https://dolphinscheduler.apache.org/en-us/docs/user_doc/system-manual.html?_blank "System manual") -- [**Upgrade document**](https://dolphinscheduler.apache.org/en-us/docs/release/upgrade.html?_blank "Upgrade document") +- [**Upgrade document**](https://dolphinscheduler.apache.org/en-us/docs/release/upgrade.html?_blank "Upgrade document") -- Online Demo +- Online Demo More documentation please refer to [DolphinScheduler online documentation] @@ -74,6 +74,20 @@ Work plan of Dolphin Scheduler: [R&D plan](https://github.com/apache/incubator-d Welcome to participate in contributing code, please refer to the process of submitting the code: [[How to contribute code](https://github.com/apache/incubator-dolphinscheduler/issues/310)] +### How to Build + +```bash +mvn clean install -Prelease +``` + +Artifact: + +``` +dolphinscheduler-dist/dolphinscheduler-backend/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-backend-bin.tar.gz: Binary package of DolphinScheduler-Backend +dolphinscheduler-dist/dolphinscheduler-front/target/apache-dolphinscheduler-incubating-${latest.release.version}-dolphinscheduler-front-bin.tar.gz: Binary package of DolphinScheduler-UI +dolphinscheduler-dist/dolphinscheduler-src/target/apache-dolphinscheduler-incubating-${latest.release.version}-src.zip: Source code package of DolphinScheduler +``` + ### Thanks Dolphin Scheduler uses a lot of excellent open source projects, such as google guava, guice, grpc, netty, ali bonecp, quartz, and many open 
source projects of apache, etc. @@ -86,8 +100,8 @@ It is because of the shoulders of these open source projects that the birth of t ### License Please refer to [LICENSE](https://github.com/apache/incubator-dolphinscheduler/blob/dev/LICENSE) file. - - + + diff --git a/README_zh_CN.md b/README_zh_CN.md index f64fcda1f7..6bdf7be183 100644 --- a/README_zh_CN.md +++ b/README_zh_CN.md @@ -1,13 +1,13 @@ Dolphin Scheduler ============ [![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) -[![Total Lines](https://tokei.rs/b1/github/analysys/EasyScheduler?category=lines)](https://github.com/analysys/EasyScheduler) +[![Total Lines](https://tokei.rs/b1/github/apache/Incubator-DolphinScheduler?category=lines)](https://github.com/apache/Incubator-DolphinScheduler) > Dolphin Scheduler for Big Data -[![Stargazers over time](https://starchart.cc/analysys/EasyScheduler.svg)](https://starchart.cc/analysys/EasyScheduler) +[![Stargazers over time](https://starchart.cc/apache/incubator-dolphinscheduler.svg)](https://starchart.cc/apache/incubator-dolphinscheduler) [![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md) [![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md) @@ -45,11 +45,11 @@ Dolphin Scheduler - 前端部署文档 -- [**使用手册**](https://dolphinscheduler.apache.org/zh-cn/docs/user_doc/system-manual.html?_blank "系统使用手册") +- [**使用手册**](https://dolphinscheduler.apache.org/zh-cn/docs/user_doc/system-manual.html?_blank "系统使用手册") -- [**升级文档**](https://dolphinscheduler.apache.org/zh-cn/docs/release/upgrade.html?_blank "升级文档") +- [**升级文档**](https://dolphinscheduler.apache.org/zh-cn/docs/release/upgrade.html?_blank "升级文档") -- 我要体验 +- 我要体验 更多文档请参考 DolphinScheduler中文在线文档 @@ -63,11 +63,24 @@ DolphinScheduler的工作计划:> /etc/apt/sources.list - -RUN echo "mysql-server mysql-server/root_password password root" | debconf-set-selections -RUN echo "mysql-server mysql-server/root_password_again 
password root" | debconf-set-selections - +#5,install postgresql RUN apt-get update && \ - apt-get -y install mysql-server-5.7 && \ - mkdir -p /var/lib/mysql && \ - mkdir -p /var/run/mysqld && \ - mkdir -p /var/log/mysql && \ - chown -R mysql:mysql /var/lib/mysql && \ - chown -R mysql:mysql /var/run/mysqld && \ - chown -R mysql:mysql /var/log/mysql + apt-get install -y postgresql postgresql-contrib sudo && \ + sed -i 's/localhost/*/g' /etc/postgresql/10/main/postgresql.conf - -# UTF-8 and bind-address -RUN sed -i -e "$ a [client]\n\n[mysql]\n\n[mysqld]" /etc/mysql/my.cnf && \ - sed -i -e "s/\(\[client\]\)/\1\ndefault-character-set = utf8/g" /etc/mysql/my.cnf && \ - sed -i -e "s/\(\[mysql\]\)/\1\ndefault-character-set = utf8/g" /etc/mysql/my.cnf && \ - sed -i -e "s/\(\[mysqld\]\)/\1\ninit_connect='SET NAMES utf8'\ncharacter-set-server = utf8\ncollation-server=utf8_general_ci\nbind-address = 0.0.0.0/g" /etc/mysql/my.cnf - - -#9,安装nginx +#6,install nginx RUN apt-get update && \ apt-get install -y nginx && \ rm -rf /var/lib/apt/lists/* && \ echo "\ndaemon off;" >> /etc/nginx/nginx.conf && \ chown -R www-data:www-data /var/lib/nginx -#10,修改escheduler配置文件 -#后端配置 -RUN mkdir -p /opt/escheduler && \ - tar -zxvf /opt/easyscheduler_source/target/escheduler-${tar_version}.tar.gz -C /opt/escheduler && \ - rm -rf /opt/escheduler/conf -ADD ./conf/escheduler/conf /opt/escheduler/conf -#前端nginx配置 -ADD ./conf/nginx/default.conf /etc/nginx/conf.d - -#11,开放端口 -EXPOSE 2181 2888 3888 3306 80 12345 8888 - -#12,安装sudo,python,vim,ping和ssh +#7,install sudo,python,vim,ping and ssh command RUN apt-get update && \ apt-get -y install sudo && \ apt-get -y install python && \ @@ -132,15 +93,44 @@ RUN apt-get update && \ apt-get -y install python-pip && \ pip install kazoo -COPY ./startup.sh /root/startup.sh -#13,修改权限和设置软连 +#8,add dolphinscheduler source code to /opt/dolphinscheduler_source +ADD . 
/opt/dolphinscheduler_source + + +#9,backend compilation +RUN cd /opt/dolphinscheduler_source && \ + mvn clean package -Prelease -Dmaven.test.skip=true + +#10,frontend compilation +RUN chmod -R 777 /opt/dolphinscheduler_source/dolphinscheduler-ui && \ + cd /opt/dolphinscheduler_source/dolphinscheduler-ui && \ + rm -rf /opt/dolphinscheduler_source/dolphinscheduler-ui/node_modules && \ + npm install node-sass --unsafe-perm && \ + npm install && \ + npm run build + +#11,modify dolphinscheduler configuration file +#backend configuration +RUN tar -zxvf /opt/dolphinscheduler_source/dolphinscheduler-dist/dolphinscheduler-backend/target/apache-dolphinscheduler-incubating-${tar_version}-dolphinscheduler-backend-bin.tar.gz -C /opt && \ + mv /opt/apache-dolphinscheduler-incubating-${tar_version}-dolphinscheduler-backend-bin /opt/dolphinscheduler && \ + rm -rf /opt/dolphinscheduler/conf + +ADD ./dockerfile/conf/dolphinscheduler/conf /opt/dolphinscheduler/conf +#frontend nginx configuration +ADD ./dockerfile/conf/nginx/dolphinscheduler.conf /etc/nginx/conf.d + +#12,open port +EXPOSE 2181 2888 3888 3306 80 12345 8888 + +COPY ./dockerfile/startup.sh /root/startup.sh +#13,modify permissions and set soft links RUN chmod +x /root/startup.sh && \ - chmod +x /opt/escheduler/script/create_escheduler.sh && \ + chmod +x /opt/dolphinscheduler/script/create-dolphinscheduler.sh && \ chmod +x /opt/zookeeper/bin/zkServer.sh && \ - chmod +x /opt/escheduler/bin/escheduler-daemon.sh && \ + chmod +x /opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh && \ rm -rf /bin/sh && \ ln -s /bin/bash /bin/sh && \ mkdir -p /tmp/xls -ENTRYPOINT ["/root/startup.sh"] +ENTRYPOINT ["/root/startup.sh"] \ No newline at end of file diff --git a/dockerfile/README.md b/dockerfile/README.md new file mode 100644 index 0000000000..33b58cacde --- /dev/null +++ b/dockerfile/README.md @@ -0,0 +1,11 @@ +## Build Image +``` + cd .. 
+ docker build -t dolphinscheduler --build-arg version=1.1.0 --build-arg tar_version=1.1.0-SNAPSHOT -f dockerfile/Dockerfile . + docker run -p 12345:12345 -p 8888:8888 --rm --name dolphinscheduler -d dolphinscheduler +``` +* Visit the url: http://127.0.0.1:8888 +* UserName:admin Password:dolphinscheduler123 + +## Note +* MacOS: The memory of docker needs to be set to 4G, default 2G. Steps: Preferences -> Advanced -> adjust resources -> Apply & Restart diff --git a/dockerfile/conf/dolphinscheduler/conf/alert.properties b/dockerfile/conf/dolphinscheduler/conf/alert.properties new file mode 100644 index 0000000000..276ef3132a --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/alert.properties @@ -0,0 +1,50 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +#alert type is EMAIL/SMS +alert.type=EMAIL + +# mail server configuration +mail.protocol=SMTP +mail.server.host=smtp.126.com +mail.server.port= +mail.sender=dolphinscheduler@126.com +mail.user=dolphinscheduler@126.com +mail.passwd=escheduler123 + +# TLS +mail.smtp.starttls.enable=false +# SSL +mail.smtp.ssl.enable=true +mail.smtp.ssl.trust=smtp.126.com + +#xls file path,need create if not exist +xls.file.path=/tmp/xls + +# Enterprise WeChat configuration +enterprise.wechat.enable=false +enterprise.wechat.corp.id=xxxxxxx +enterprise.wechat.secret=xxxxxxx +enterprise.wechat.agent.id=xxxxxxx +enterprise.wechat.users=xxxxxxx +enterprise.wechat.token.url=https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$secret +enterprise.wechat.push.url=https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token +enterprise.wechat.team.send.msg={\"toparty\":\"$toParty\",\"agentid\":\"$agentId\",\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"},\"safe\":\"0\"} +enterprise.wechat.user.send.msg={\"touser\":\"$toUser\",\"agentid\":\"$agentId\",\"msgtype\":\"markdown\",\"markdown\":{\"content\":\"$msg\"}} + + + diff --git a/dockerfile/conf/dolphinscheduler/conf/alert_logback.xml b/dockerfile/conf/dolphinscheduler/conf/alert_logback.xml new file mode 100644 index 0000000000..35e19865b9 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/alert_logback.xml @@ -0,0 +1,49 @@ + + + + + + + + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + + + + + ${log.base}/dolphinscheduler-alert.log + + ${log.base}/dolphinscheduler-alert.%d{yyyy-MM-dd_HH}.%i.log + 20 + 64MB + + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/apiserver_logback.xml b/dockerfile/conf/dolphinscheduler/conf/apiserver_logback.xml similarity index 53% rename from dockerfile/conf/escheduler/conf/apiserver_logback.xml rename to 
dockerfile/conf/dolphinscheduler/conf/apiserver_logback.xml index 43e6af951a..36719671c9 100644 --- a/dockerfile/conf/escheduler/conf/apiserver_logback.xml +++ b/dockerfile/conf/dolphinscheduler/conf/apiserver_logback.xml @@ -1,3 +1,21 @@ + + + @@ -20,9 +38,9 @@ INFO - ${log.base}/escheduler-api-server.log + ${log.base}/dolphinscheduler-api-server.log - ${log.base}/escheduler-api-server.%d{yyyy-MM-dd_HH}.%i.log + ${log.base}/dolphinscheduler-api-server.%d{yyyy-MM-dd_HH}.%i.log 168 64MB diff --git a/dockerfile/conf/dolphinscheduler/conf/application-api.properties b/dockerfile/conf/dolphinscheduler/conf/application-api.properties new file mode 100644 index 0000000000..ead8dd872e --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/application-api.properties @@ -0,0 +1,40 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +logging.config=classpath:apiserver_logback.xml + +# server port +server.port=12345 + +# session config +server.servlet.session.timeout=7200 + +server.servlet.context-path=/dolphinscheduler/ + +# file size limit for upload +spring.servlet.multipart.max-file-size=1024MB +spring.servlet.multipart.max-request-size=1024MB + +#post content +server.jetty.max-http-post-size=5000000 + +spring.messages.encoding=UTF-8 + +#i18n classpath folder , file prefix messages, if have many files, use "," separator +spring.messages.basename=i18n/messages + + diff --git a/dockerfile/conf/escheduler/conf/dao/data_source.properties b/dockerfile/conf/dolphinscheduler/conf/application-dao.properties similarity index 51% rename from dockerfile/conf/escheduler/conf/dao/data_source.properties rename to dockerfile/conf/dolphinscheduler/conf/application-dao.properties index 0dce2943e4..166c36fbf0 100644 --- a/dockerfile/conf/escheduler/conf/dao/data_source.properties +++ b/dockerfile/conf/dolphinscheduler/conf/application-dao.properties @@ -1,7 +1,25 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+# + # base spring data source configuration spring.datasource.type=com.alibaba.druid.pool.DruidDataSource -spring.datasource.driver-class-name=com.mysql.jdbc.Driver -spring.datasource.url=jdbc:mysql://127.0.0.1:3306/escheduler?characterEncoding=UTF-8 +# postgresql +spring.datasource.driver-class-name=org.postgresql.Driver +spring.datasource.url=jdbc:postgresql://127.0.0.1:5432/dolphinscheduler spring.datasource.username=root spring.datasource.password=root@123 @@ -27,6 +45,7 @@ spring.datasource.minEvictableIdleTimeMillis=300000 #the SQL used to check whether the connection is valid requires a query statement. If validation Query is null, testOnBorrow, testOnReturn, and testWhileIdle will not work. spring.datasource.validationQuery=SELECT 1 + #check whether the connection is valid for timeout, in seconds spring.datasource.validationQueryTimeout=3 @@ -45,9 +64,40 @@ spring.datasource.keepAlive=true spring.datasource.poolPreparedStatements=true spring.datasource.maxPoolPreparedStatementPerConnectionSize=20 +spring.datasource.spring.datasource.filters=stat,wall,log4j +spring.datasource.connectionProperties=druid.stat.mergeSql=true;druid.stat.slowSqlMillis=5000 + +#mybatis +mybatis-plus.mapper-locations=classpath*:/org.apache.dolphinscheduler.dao.mapper/*.xml + +mybatis-plus.typeEnumsPackage=org.apache.dolphinscheduler.*.enums + +#Entity scan, where multiple packages are separated by a comma or semicolon +mybatis-plus.typeAliasesPackage=org.apache.dolphinscheduler.dao.entity + +#Primary key type AUTO:" database ID AUTO ", INPUT:" user INPUT ID", ID_WORKER:" global unique ID (numeric type unique ID)", UUID:" global unique ID UUID"; +mybatis-plus.global-config.db-config.id-type=AUTO + +#Field policy IGNORED:" ignore judgment ",NOT_NULL:" not NULL judgment "),NOT_EMPTY:" not NULL judgment" +mybatis-plus.global-config.db-config.field-strategy=NOT_NULL + +#The hump underline is converted +mybatis-plus.global-config.db-config.column-underline=true 
+mybatis-plus.global-config.db-config.logic-delete-value=-1 +mybatis-plus.global-config.db-config.logic-not-delete-value=0 +mybatis-plus.global-config.db-config.banner=false +#The original configuration +mybatis-plus.configuration.map-underscore-to-camel-case=true +mybatis-plus.configuration.cache-enabled=false +mybatis-plus.configuration.call-setters-on-nulls=true +mybatis-plus.configuration.jdbc-type-for-null=null + # data quality analysis is not currently in use. please ignore the following configuration # task record flag task.record.flag=false task.record.datasource.url=jdbc:mysql://192.168.xx.xx:3306/etl?characterEncoding=UTF-8 task.record.datasource.username=xx task.record.datasource.password=xx + +# Logger Config +#logging.level.org.apache.dolphinscheduler.dao=debug diff --git a/dockerfile/conf/dolphinscheduler/conf/combined_logback.xml b/dockerfile/conf/dolphinscheduler/conf/combined_logback.xml new file mode 100644 index 0000000000..6bdb97cf00 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/combined_logback.xml @@ -0,0 +1,80 @@ + + + + + + + + + + %highlight([%level]) %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{10}:[%line] - %msg%n + + UTF-8 + + + + + INFO + + + + taskAppId + ${log.base} + + + + ${log.base}/${taskAppId}.log + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + + true + + + + + + ${log.base}/dolphinscheduler-combined.log + + INFO + + + + ${log.base}/dolphinscheduler-combined.%d{yyyy-MM-dd_HH}.%i.log + 168 + 200MB + +       + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + +    + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/common/common.properties b/dockerfile/conf/dolphinscheduler/conf/common/common.properties similarity index 52% rename from dockerfile/conf/escheduler/conf/common/common.properties rename to dockerfile/conf/dolphinscheduler/conf/common/common.properties index 15af284597..5371c7665f 100644 --- 
a/dockerfile/conf/escheduler/conf/common/common.properties +++ b/dockerfile/conf/dolphinscheduler/conf/common/common.properties @@ -1,20 +1,37 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + #task queue implementation, default "zookeeper" -escheduler.queue.impl=zookeeper +dolphinscheduler.queue.impl=zookeeper # user data directory path, self configuration, please make sure the directory exists and have read write permissions -data.basedir.path=/tmp/escheduler +data.basedir.path=/tmp/dolphinscheduler # directory path for user data download. self configuration, please make sure the directory exists and have read write permissions -data.download.basedir.path=/tmp/escheduler/download +data.download.basedir.path=/tmp/dolphinscheduler/download # process execute directory. 
self configuration, please make sure the directory exists and have read write permissions -process.exec.basepath=/tmp/escheduler/exec +process.exec.basepath=/tmp/dolphinscheduler/exec # Users who have permission to create directories under the HDFS root path hdfs.root.user=hdfs -# data base dir, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。"/escheduler" is recommended -data.store2hdfs.basepath=/escheduler +# data base dir, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions。"/dolphinscheduler" is recommended +data.store2hdfs.basepath=/dolphinscheduler # resource upload startup type : HDFS,S3,NONE res.upload.startup.type=NONE @@ -32,7 +49,7 @@ login.user.keytab.username=hdfs-mycluster@ESZ.COM login.user.keytab.path=/opt/hdfs.headless.keytab # system env path. self configuration, please make sure the directory and file exists and have read write execute permissions -escheduler.env.path=/opt/escheduler/conf/env/.escheduler_env.sh +dolphinscheduler.env.path=/opt/dolphinscheduler/conf/env/.dolphinscheduler_env.sh #resource.view.suffixs resource.view.suffixs=txt,log,sh,conf,cfg,py,java,sql,hql,xml diff --git a/dockerfile/conf/dolphinscheduler/conf/common/hadoop/hadoop.properties b/dockerfile/conf/dolphinscheduler/conf/common/hadoop/hadoop.properties new file mode 100644 index 0000000000..2c19b4a52e --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/common/hadoop/hadoop.properties @@ -0,0 +1,35 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. 
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# ha or single namenode,If namenode ha needs to copy core-site.xml and hdfs-site.xml +# to the conf directory,support s3,for example : s3a://dolphinscheduler +fs.defaultFS=hdfs://mycluster:8020 + +# s3 need,s3 endpoint +fs.s3a.endpoint=http://192.168.199.91:9010 + +# s3 need,s3 access key +fs.s3a.access.key=A3DXS30FO22544RE + +# s3 need,s3 secret key +fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK + +#resourcemanager ha note this need ips , this empty if single +yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx + +# If it is a single resourcemanager, you only need to configure one host name. If it is resourcemanager HA, the default configuration is fine +yarn.application.status.address=http://ark1:8088/ws/v1/cluster/apps/%s \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/config/install_config.conf b/dockerfile/conf/dolphinscheduler/conf/config/install_config.conf new file mode 100644 index 0000000000..196a78f49c --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/config/install_config.conf @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. 
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +installPath=/data1_1T/dolphinscheduler +deployUser=dolphinscheduler +ips=ark0,ark1,ark2,ark3,ark4 diff --git a/dockerfile/conf/dolphinscheduler/conf/config/run_config.conf b/dockerfile/conf/dolphinscheduler/conf/config/run_config.conf new file mode 100644 index 0000000000..69a28db458 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/config/run_config.conf @@ -0,0 +1,21 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +masters=ark0,ark1 +workers=ark2,ark3,ark4 +alertServer=ark3 +apiServers=ark1 \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/env/.dolphinscheduler_env.sh b/dockerfile/conf/dolphinscheduler/conf/env/.dolphinscheduler_env.sh new file mode 100644 index 0000000000..960d971dd8 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/env/.dolphinscheduler_env.sh @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +export PYTHON_HOME=/usr/bin/python +export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 +export PATH=$PYTHON_HOME:$JAVA_HOME/bin:$PATH diff --git a/dockerfile/conf/dolphinscheduler/conf/env/.escheduler_env.sh b/dockerfile/conf/dolphinscheduler/conf/env/.escheduler_env.sh new file mode 100644 index 0000000000..5b85917fc2 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/env/.escheduler_env.sh @@ -0,0 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. 
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +export PYTHON_HOME=/usr/bin/python +export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 +export PATH=$PYTHON_HOME:$JAVA_HOME/bin:$PATH \ No newline at end of file diff --git a/escheduler-api/src/main/resources/i18n/messages.properties b/dockerfile/conf/dolphinscheduler/conf/i18n/messages.properties similarity index 93% rename from escheduler-api/src/main/resources/i18n/messages.properties rename to dockerfile/conf/dolphinscheduler/conf/i18n/messages.properties index 44787fd78f..be880ba26d 100644 --- a/escheduler-api/src/main/resources/i18n/messages.properties +++ b/dockerfile/conf/dolphinscheduler/conf/i18n/messages.properties @@ -1,3 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +# + QUERY_SCHEDULE_LIST_NOTES=query schedule list EXECUTE_PROCESS_TAG=execute process related operation PROCESS_INSTANCE_EXECUTOR_TAG=process instance executor related operation diff --git a/dockerfile/conf/escheduler/conf/i18n/messages_en_US.properties b/dockerfile/conf/dolphinscheduler/conf/i18n/messages_en_US.properties similarity index 93% rename from dockerfile/conf/escheduler/conf/i18n/messages_en_US.properties rename to dockerfile/conf/dolphinscheduler/conf/i18n/messages_en_US.properties index d06b83fed5..24c0843c10 100644 --- a/dockerfile/conf/escheduler/conf/i18n/messages_en_US.properties +++ b/dockerfile/conf/dolphinscheduler/conf/i18n/messages_en_US.properties @@ -1,3 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + QUERY_SCHEDULE_LIST_NOTES=query schedule list EXECUTE_PROCESS_TAG=execute process related operation PROCESS_INSTANCE_EXECUTOR_TAG=process instance executor related operation diff --git a/escheduler-api/src/main/resources/i18n/messages_zh_CN.properties b/dockerfile/conf/dolphinscheduler/conf/i18n/messages_zh_CN.properties similarity index 92% rename from escheduler-api/src/main/resources/i18n/messages_zh_CN.properties rename to dockerfile/conf/dolphinscheduler/conf/i18n/messages_zh_CN.properties index 46b0270747..5f24a6fedd 100644 --- a/escheduler-api/src/main/resources/i18n/messages_zh_CN.properties +++ b/dockerfile/conf/dolphinscheduler/conf/i18n/messages_zh_CN.properties @@ -1,3 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + QUERY_SCHEDULE_LIST_NOTES=查询定时列表 PROCESS_INSTANCE_EXECUTOR_TAG=流程实例执行相关操作 RUN_PROCESS_INSTANCE_NOTES=运行流程实例 diff --git a/dockerfile/conf/dolphinscheduler/conf/mail_templates/alert_mail_template.ftl b/dockerfile/conf/dolphinscheduler/conf/mail_templates/alert_mail_template.ftl new file mode 100644 index 0000000000..c638609090 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/mail_templates/alert_mail_template.ftl @@ -0,0 +1,17 @@ +<#-- + ~ Licensed to the Apache Software Foundation (ASF) under one or more + ~ contributor license agreements. See the NOTICE file distributed with + ~ this work for additional information regarding copyright ownership. + ~ The ASF licenses this file to You under the Apache License, Version 2.0 + ~ (the "License"); you may not use this file except in compliance with + ~ the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, software + ~ distributed under the License is distributed on an "AS IS" BASIS, + ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + ~ See the License for the specific language governing permissions and + ~ limitations under the License. +--> + dolphinscheduler<#if title??> ${title}<#if content??> ${content}
\ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/master.properties b/dockerfile/conf/dolphinscheduler/conf/master.properties new file mode 100644 index 0000000000..73c29a2db2 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/master.properties @@ -0,0 +1,38 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# master execute thread num +master.exec.threads=100 + +# master execute task number in parallel +master.exec.task.number=20 + +# master heartbeat interval +master.heartbeat.interval=10 + +# master commit task retry times +master.task.commit.retryTimes=5 + +# master commit task interval +master.task.commit.interval=100 + + +# only less than cpu avg load, master server can work. default value : the number of cpu cores * 2 +#master.max.cpuload.avg=100 + +# only larger than reserved memory, master server can work. default value : physical memory * 1/10, unit is G. 
+master.reserved.memory=0.1 diff --git a/dockerfile/conf/dolphinscheduler/conf/master_logback.xml b/dockerfile/conf/dolphinscheduler/conf/master_logback.xml new file mode 100644 index 0000000000..12bcd658e1 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/master_logback.xml @@ -0,0 +1,52 @@ + + + + + + + + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + + + + + ${log.base}/dolphinscheduler-master.log + + INFO + + + ${log.base}/dolphinscheduler-master.%d{yyyy-MM-dd_HH}.%i.log + 168 + 200MB + + + + [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n + + UTF-8 + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AccessTokenMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AccessTokenMapper.xml new file mode 100644 index 0000000000..29c8dfa5a3 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AccessTokenMapper.xml @@ -0,0 +1,33 @@ + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertGroupMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertGroupMapper.xml new file mode 100644 index 0000000000..8ee335b6ff --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertGroupMapper.xml @@ -0,0 +1,47 @@ + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertMapper.xml new file mode 100644 index 0000000000..703b685157 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/AlertMapper.xml @@ -0,0 +1,26 @@ + + + + + + + \ No newline at end of file diff --git 
a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/CommandMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/CommandMapper.xml new file mode 100644 index 0000000000..66e6c3edd3 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/CommandMapper.xml @@ -0,0 +1,43 @@ + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceMapper.xml new file mode 100644 index 0000000000..b296d5fc3e --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceMapper.xml @@ -0,0 +1,79 @@ + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceUserMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceUserMapper.xml new file mode 100644 index 0000000000..a43cbeca91 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/DataSourceUserMapper.xml @@ -0,0 +1,30 @@ + + + + + + + delete from t_ds_relation_datasource_user + where user_id = #{userId} + + + + delete from t_ds_relation_datasource_user + where datasource_id = #{datasourceId} + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ErrorCommandMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ErrorCommandMapper.xml new file mode 100644 index 0000000000..2f5ae7104a --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ErrorCommandMapper.xml @@ -0,0 +1,36 @@ + + + + + + + \ No newline at end of file diff --git 
a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessDefinitionMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessDefinitionMapper.xml new file mode 100644 index 0000000000..1b97c07676 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessDefinitionMapper.xml @@ -0,0 +1,96 @@ + + + + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapMapper.xml new file mode 100644 index 0000000000..d217665eab --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapMapper.xml @@ -0,0 +1,43 @@ + + + + + + + delete + from t_ds_relation_process_instance + where parent_process_instance_id=#{parentProcessId} + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapper.xml new file mode 100644 index 0000000000..2e63867d33 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProcessInstanceMapper.xml @@ -0,0 +1,182 @@ + + + + + + + + + + + + + + + update t_ds_process_instance + set host=null + where host =#{host} and state in + + #{i} + + + + update t_ds_process_instance + set state = #{destState} + where state = #{originState} + + + + update t_ds_process_instance + set tenant_id = #{destTenantId} + where tenant_id = #{originTenantId} + + + + update t_ds_process_instance + set worker_group_id = #{destWorkerGroupId} + where worker_group_id = #{originWorkerGroupId} + + + + + + + + + + \ No newline at end of file diff --git 
a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectMapper.xml new file mode 100644 index 0000000000..5ab0756250 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectMapper.xml @@ -0,0 +1,68 @@ + + + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectUserMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectUserMapper.xml new file mode 100644 index 0000000000..006cf080eb --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ProjectUserMapper.xml @@ -0,0 +1,36 @@ + + + + + + + delete from t_ds_relation_project_user + where 1=1 + and user_id = #{userId} + + and project_id = #{projectId} + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/QueueMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/QueueMapper.xml new file mode 100644 index 0000000000..423b0dd04d --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/QueueMapper.xml @@ -0,0 +1,42 @@ + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceMapper.xml new file mode 100644 index 0000000000..146daa0632 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceMapper.xml @@ -0,0 +1,74 @@ + + + + + + + + + + + + diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceUserMapper.xml 
b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceUserMapper.xml new file mode 100644 index 0000000000..6a89e47c2f --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ResourceUserMapper.xml @@ -0,0 +1,32 @@ + + + + + + + delete + from t_ds_relation_resources_user + where 1 = 1 + + and user_id = #{userId} + + + and resources_id = #{resourceId} + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ScheduleMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ScheduleMapper.xml new file mode 100644 index 0000000000..402c864251 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/ScheduleMapper.xml @@ -0,0 +1,58 @@ + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/SessionMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/SessionMapper.xml new file mode 100644 index 0000000000..4fa7f309dc --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/SessionMapper.xml @@ -0,0 +1,32 @@ + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TaskInstanceMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TaskInstanceMapper.xml new file mode 100644 index 0000000000..3a1fddd288 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TaskInstanceMapper.xml @@ -0,0 +1,129 @@ + + + + + + + update t_ds_task_instance + set state = #{destStatus} + where host = #{host} + and state in + + #{i} + + + + + + + + + + diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TenantMapper.xml 
b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TenantMapper.xml new file mode 100644 index 0000000000..fc9219ce86 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/TenantMapper.xml @@ -0,0 +1,41 @@ + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UDFUserMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UDFUserMapper.xml new file mode 100644 index 0000000000..61b4e2c372 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UDFUserMapper.xml @@ -0,0 +1,29 @@ + + + + + + + delete from t_ds_relation_udfs_user + where user_id = #{userId} + + + delete from t_ds_relation_udfs_user + where udf_id = #{udfFuncId} + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UdfFuncMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UdfFuncMapper.xml new file mode 100644 index 0000000000..04926d132e --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UdfFuncMapper.xml @@ -0,0 +1,71 @@ + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserAlertGroupMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserAlertGroupMapper.xml new file mode 100644 index 0000000000..cbb448275c --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserAlertGroupMapper.xml @@ -0,0 +1,31 @@ + + + + + + + delete from t_ds_relation_user_alertgroup + where alertgroup_id = #{alertgroupId} + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserMapper.xml 
b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserMapper.xml new file mode 100644 index 0000000000..6046ad22eb --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/UserMapper.xml @@ -0,0 +1,72 @@ + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/WorkerGroupMapper.xml b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/WorkerGroupMapper.xml new file mode 100644 index 0000000000..84dd4db88d --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/org/apache/dolphinscheduler/dao/mapper/WorkerGroupMapper.xml @@ -0,0 +1,40 @@ + + + + + + + + + \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/quartz.properties b/dockerfile/conf/dolphinscheduler/conf/quartz.properties similarity index 55% rename from dockerfile/conf/escheduler/conf/quartz.properties rename to dockerfile/conf/dolphinscheduler/conf/quartz.properties index 21c5feb321..21ebd5e29d 100644 --- a/dockerfile/conf/escheduler/conf/quartz.properties +++ b/dockerfile/conf/dolphinscheduler/conf/quartz.properties @@ -1,7 +1,24 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +# + #============================================================================ # Configure Main Scheduler Properties #============================================================================ -org.quartz.scheduler.instanceName = EasyScheduler +org.quartz.scheduler.instanceName = DolphinScheduler org.quartz.scheduler.instanceId = AUTO org.quartz.scheduler.makeSchedulerThreadDaemon = true org.quartz.jobStore.useProperties = false @@ -18,9 +35,9 @@ org.quartz.threadPool.threadPriority = 5 #============================================================================ # Configure JobStore #============================================================================ - + org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX -org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate +org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate org.quartz.jobStore.tablePrefix = QRTZ_ org.quartz.jobStore.isClustered = true org.quartz.jobStore.misfireThreshold = 60000 @@ -28,12 +45,12 @@ org.quartz.jobStore.clusterCheckinInterval = 5000 org.quartz.jobStore.dataSource = myDs #============================================================================ -# Configure Datasources +# Configure Datasources #============================================================================ - -org.quartz.dataSource.myDs.driver = com.mysql.jdbc.Driver -org.quartz.dataSource.myDs.URL=jdbc:mysql://127.0.0.1:3306/escheduler?characterEncoding=utf8 +org.quartz.dataSource.myDs.connectionProvider.class = org.apache.dolphinscheduler.server.quartz.DruidConnectionProvider +org.quartz.dataSource.myDs.driver = org.postgresql.Driver +org.quartz.dataSource.myDs.URL=jdbc:postgresql://127.0.0.1:5432/dolphinscheduler org.quartz.dataSource.myDs.user=root org.quartz.dataSource.myDs.password=root@123 
org.quartz.dataSource.myDs.maxConnections = 10 -org.quartz.dataSource.myDs.validationQuery = select 1 \ No newline at end of file +org.quartz.dataSource.myDs.validationQuery = select 1 diff --git a/dockerfile/conf/dolphinscheduler/conf/worker.properties b/dockerfile/conf/dolphinscheduler/conf/worker.properties new file mode 100644 index 0000000000..582bb953f0 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/worker.properties @@ -0,0 +1,32 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# worker execute thread num +worker.exec.threads=100 + +# worker heartbeat interval +worker.heartbeat.interval=10 + +# submit the number of tasks at a time +worker.fetch.task.num = 3 + + +# only less than cpu avg load, worker server can work. default value : the number of cpu cores * 2 +#worker.max.cpuload.avg=10 + +# only larger than reserved memory, worker server can work. default value : physical memory * 1/6, unit is G. 
+worker.reserved.memory=1 \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/worker_logback.xml b/dockerfile/conf/dolphinscheduler/conf/worker_logback.xml similarity index 58% rename from dockerfile/conf/escheduler/conf/worker_logback.xml rename to dockerfile/conf/dolphinscheduler/conf/worker_logback.xml index f630559da9..9bbd9615c4 100644 --- a/dockerfile/conf/escheduler/conf/worker_logback.xml +++ b/dockerfile/conf/dolphinscheduler/conf/worker_logback.xml @@ -1,3 +1,21 @@ + + + @@ -13,9 +31,10 @@ INFO - - + + taskAppId + ${log.base} @@ -32,13 +51,13 @@ - ${log.base}/escheduler-worker.log - + ${log.base}/dolphinscheduler-worker.log + INFO - ${log.base}/escheduler-worker.%d{yyyy-MM-dd_HH}.%i.log + ${log.base}/dolphinscheduler-worker.%d{yyyy-MM-dd_HH}.%i.log 168 200MB diff --git a/dockerfile/conf/dolphinscheduler/conf/zookeeper.properties b/dockerfile/conf/dolphinscheduler/conf/zookeeper.properties new file mode 100644 index 0000000000..5e9df1c863 --- /dev/null +++ b/dockerfile/conf/dolphinscheduler/conf/zookeeper.properties @@ -0,0 +1,42 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +#zookeeper cluster +zookeeper.quorum=127.0.0.1:2181 + +#dolphinscheduler root directory +zookeeper.dolphinscheduler.root=/dolphinscheduler + +#zookeeper server directory +zookeeper.dolphinscheduler.dead.servers=/dolphinscheduler/dead-servers +zookeeper.dolphinscheduler.masters=/dolphinscheduler/masters +zookeeper.dolphinscheduler.workers=/dolphinscheduler/workers + +#zookeeper lock directory +zookeeper.dolphinscheduler.lock.masters=/dolphinscheduler/lock/masters +zookeeper.dolphinscheduler.lock.workers=/dolphinscheduler/lock/workers + +#dolphinscheduler failover directory +zookeeper.dolphinscheduler.lock.failover.masters=/dolphinscheduler/lock/failover/masters +zookeeper.dolphinscheduler.lock.failover.workers=/dolphinscheduler/lock/failover/workers +zookeeper.dolphinscheduler.lock.failover.startup.masters=/dolphinscheduler/lock/failover/startup-masters + +#zookeeper session, connection and retry settings +zookeeper.session.timeout=300 +zookeeper.connection.timeout=300 +zookeeper.retry.sleep=1000 +zookeeper.retry.maxtime=5 \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/alert.properties b/dockerfile/conf/escheduler/conf/alert.properties deleted file mode 100644 index df7d8372d7..0000000000 --- a/dockerfile/conf/escheduler/conf/alert.properties +++ /dev/null @@ -1,30 +0,0 @@ -#alert type is EMAIL/SMS -alert.type=EMAIL - -# mail server configuration -mail.protocol=SMTP -mail.server.host=smtp.office365.com -mail.server.port=587 -mail.sender=qiaozhanwei@outlook.com -mail.passwd=eschedulerBJEG - -# TLS -mail.smtp.starttls.enable=true -# SSL -mail.smtp.ssl.enable=false - -#xls file path,need create if not exist -xls.file.path=/tmp/xls - -# Enterprise WeChat configuration -enterprise.wechat.corp.id=xxxxxxx -enterprise.wechat.secret=xxxxxxx -enterprise.wechat.agent.id=xxxxxxx -enterprise.wechat.users=xxxxxxx -enterprise.wechat.token.url=https://qyapi.weixin.qq.com/cgi-bin/gettoken?corpid=$corpId&corpsecret=$secret
-enterprise.wechat.push.url=https://qyapi.weixin.qq.com/cgi-bin/message/send?access_token=$token -enterprise.wechat.team.send.msg={\"toparty\":\"$toParty\",\"agentid\":\"$agentId\",\"msgtype\":\"text\",\"text\":{\"content\":\"$msg\"},\"safe\":\"0\"} -enterprise.wechat.user.send.msg={\"touser\":\"$toUser\",\"agentid\":\"$agentId\",\"msgtype\":\"markdown\",\"markdown\":{\"content\":\"$msg\"}} - - - diff --git a/dockerfile/conf/escheduler/conf/alert_logback.xml b/dockerfile/conf/escheduler/conf/alert_logback.xml deleted file mode 100644 index c4ca8e9d1f..0000000000 --- a/dockerfile/conf/escheduler/conf/alert_logback.xml +++ /dev/null @@ -1,31 +0,0 @@ - - - - - - - [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n - - UTF-8 - - - - - ${log.base}/escheduler-alert.log - - ${log.base}/escheduler-alert.%d{yyyy-MM-dd_HH}.%i.log - 20 - 64MB - - - - [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n - - UTF-8 - - - - - - - \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/application.properties b/dockerfile/conf/escheduler/conf/application.properties deleted file mode 100644 index b817c18a4a..0000000000 --- a/dockerfile/conf/escheduler/conf/application.properties +++ /dev/null @@ -1,19 +0,0 @@ -# server port -server.port=12345 - -# session config -server.servlet.session.timeout=7200 - -server.servlet.context-path=/escheduler/ - -# file size limit for upload -spring.servlet.multipart.max-file-size=1024MB -spring.servlet.multipart.max-request-size=1024MB - -#post content -server.jetty.max-http-post-size=5000000 - -spring.messages.encoding=UTF-8 - -#i18n classpath folder , file prefix messages, if have many files, use "," seperator -spring.messages.basename=i18n/messages diff --git a/dockerfile/conf/escheduler/conf/application_master.properties b/dockerfile/conf/escheduler/conf/application_master.properties deleted file mode 100644 index cc4774ae94..0000000000 --- 
a/dockerfile/conf/escheduler/conf/application_master.properties +++ /dev/null @@ -1 +0,0 @@ -logging.config=classpath:master_logback.xml diff --git a/dockerfile/conf/escheduler/conf/common/hadoop/hadoop.properties b/dockerfile/conf/escheduler/conf/common/hadoop/hadoop.properties deleted file mode 100644 index 81452a83a2..0000000000 --- a/dockerfile/conf/escheduler/conf/common/hadoop/hadoop.properties +++ /dev/null @@ -1,18 +0,0 @@ -# ha or single namenode,If namenode ha needs to copy core-site.xml and hdfs-site.xml -# to the conf directory,support s3,for example : s3a://escheduler -fs.defaultFS=hdfs://mycluster:8020 - -# s3 need,s3 endpoint -fs.s3a.endpoint=http://192.168.199.91:9010 - -# s3 need,s3 access key -fs.s3a.access.key=A3DXS30FO22544RE - -# s3 need,s3 secret key -fs.s3a.secret.key=OloCLq3n+8+sdPHUhJ21XrSxTC+JK - -#resourcemanager ha note this need ips , this empty if single -yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx - -# If it is a single resourcemanager, you only need to configure one host name. 
If it is resourcemanager HA, the default configuration is fine -yarn.application.status.address=http://ark1:8088/ws/v1/cluster/apps/%s \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/config/install_config.conf b/dockerfile/conf/escheduler/conf/config/install_config.conf deleted file mode 100644 index 43b955d4f1..0000000000 --- a/dockerfile/conf/escheduler/conf/config/install_config.conf +++ /dev/null @@ -1,3 +0,0 @@ -installPath=/data1_1T/escheduler -deployUser=escheduler -ips=ark0,ark1,ark2,ark3,ark4 diff --git a/dockerfile/conf/escheduler/conf/config/run_config.conf b/dockerfile/conf/escheduler/conf/config/run_config.conf deleted file mode 100644 index f4cfd832c4..0000000000 --- a/dockerfile/conf/escheduler/conf/config/run_config.conf +++ /dev/null @@ -1,4 +0,0 @@ -masters=ark0,ark1 -workers=ark2,ark3,ark4 -alertServer=ark3 -apiServers=ark1 \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/env/.escheduler_env.sh b/dockerfile/conf/escheduler/conf/env/.escheduler_env.sh deleted file mode 100644 index 75362d494d..0000000000 --- a/dockerfile/conf/escheduler/conf/env/.escheduler_env.sh +++ /dev/null @@ -1,3 +0,0 @@ -export PYTHON_HOME=/usr/bin/python -export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 -export PATH=$PYTHON_HOME:$JAVA_HOME/bin:$PATH \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/mail_templates/alert_mail_template.ftl b/dockerfile/conf/escheduler/conf/mail_templates/alert_mail_template.ftl deleted file mode 100644 index 0ff763fa28..0000000000 --- a/dockerfile/conf/escheduler/conf/mail_templates/alert_mail_template.ftl +++ /dev/null @@ -1 +0,0 @@ - easyscheduler<#if title??> ${title}<#if content??> ${content}
\ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/master.properties b/dockerfile/conf/escheduler/conf/master.properties deleted file mode 100644 index 9080defc7b..0000000000 --- a/dockerfile/conf/escheduler/conf/master.properties +++ /dev/null @@ -1,21 +0,0 @@ -# master execute thread num -master.exec.threads=100 - -# master execute task number in parallel -master.exec.task.number=20 - -# master heartbeat interval -master.heartbeat.interval=10 - -# master commit task retry times -master.task.commit.retryTimes=5 - -# master commit task interval -master.task.commit.interval=100 - - -# only less than cpu avg load, master server can work. default value : the number of cpu cores * 2 -master.max.cpuload.avg=10 - -# only larger than reserved memory, master server can work. default value : physical memory * 1/10, unit is G. -master.reserved.memory=1 diff --git a/dockerfile/conf/escheduler/conf/master_logback.xml b/dockerfile/conf/escheduler/conf/master_logback.xml deleted file mode 100644 index d93878218e..0000000000 --- a/dockerfile/conf/escheduler/conf/master_logback.xml +++ /dev/null @@ -1,34 +0,0 @@ - - - - - - - [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n - - UTF-8 - - - - - ${log.base}/escheduler-master.log - - INFO - - - ${log.base}/escheduler-master.%d{yyyy-MM-dd_HH}.%i.log - 168 - 200MB - - - - [%level] %date{yyyy-MM-dd HH:mm:ss.SSS} %logger{96}:[%line] - %msg%n - - UTF-8 - - - - - - - \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/worker.properties b/dockerfile/conf/escheduler/conf/worker.properties deleted file mode 100644 index e58bd86dcf..0000000000 --- a/dockerfile/conf/escheduler/conf/worker.properties +++ /dev/null @@ -1,15 +0,0 @@ -# worker execute thread num -worker.exec.threads=100 - -# worker heartbeat interval -worker.heartbeat.interval=10 - -# submit the number of tasks at a time -worker.fetch.task.num = 3 - - -# only less than cpu avg load, worker server can work. 
default value : the number of cpu cores * 2 -#worker.max.cpuload.avg=10 - -# only larger than reserved memory, worker server can work. default value : physical memory * 1/6, unit is G. -worker.reserved.memory=1 \ No newline at end of file diff --git a/dockerfile/conf/escheduler/conf/zookeeper.properties b/dockerfile/conf/escheduler/conf/zookeeper.properties deleted file mode 100644 index 5f14df49b7..0000000000 --- a/dockerfile/conf/escheduler/conf/zookeeper.properties +++ /dev/null @@ -1,25 +0,0 @@ -#zookeeper cluster -zookeeper.quorum=127.0.0.1:2181 - -#escheduler root directory -zookeeper.escheduler.root=/escheduler - -#zookeeper server dirctory -zookeeper.escheduler.dead.servers=/escheduler/dead-servers -zookeeper.escheduler.masters=/escheduler/masters -zookeeper.escheduler.workers=/escheduler/workers - -#zookeeper lock dirctory -zookeeper.escheduler.lock.masters=/escheduler/lock/masters -zookeeper.escheduler.lock.workers=/escheduler/lock/workers - -#escheduler failover directory -zookeeper.escheduler.lock.failover.masters=/escheduler/lock/failover/masters -zookeeper.escheduler.lock.failover.workers=/escheduler/lock/failover/workers -zookeeper.escheduler.lock.failover.startup.masters=/escheduler/lock/failover/startup-masters - -#escheduler failover directory -zookeeper.session.timeout=300 -zookeeper.connection.timeout=300 -zookeeper.retry.sleep=1000 -zookeeper.retry.maxtime=5 \ No newline at end of file diff --git a/dockerfile/conf/nginx/default.conf b/dockerfile/conf/nginx/dolphinscheduler.conf similarity index 51% rename from dockerfile/conf/nginx/default.conf rename to dockerfile/conf/nginx/dolphinscheduler.conf index 2d43c32b63..03f87e6b52 100644 --- a/dockerfile/conf/nginx/default.conf +++ b/dockerfile/conf/nginx/dolphinscheduler.conf @@ -1,13 +1,30 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. 
See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + server { listen 8888; server_name localhost; #charset koi8-r; #access_log /var/log/nginx/host.access.log main; location / { - root /opt/easyscheduler_source/escheduler-ui/dist; + root /opt/dolphinscheduler_source/dolphinscheduler-ui/dist; index index.html index.html; } - location /escheduler { + location /dolphinscheduler { proxy_pass http://127.0.0.1:12345; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; diff --git a/dockerfile/conf/zookeeper/zoo.cfg b/dockerfile/conf/zookeeper/zoo.cfg index a5a2c0bbe3..7980d37ae9 100644 --- a/dockerfile/conf/zookeeper/zoo.cfg +++ b/dockerfile/conf/zookeeper/zoo.cfg @@ -1,3 +1,20 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + # The number of milliseconds of each tick tickTime=2000 # The number of ticks that the initial diff --git a/dockerfile/hooks/build b/dockerfile/hooks/build index 779c38e66f..8b7d5329dc 100644 --- a/dockerfile/hooks/build +++ b/dockerfile/hooks/build @@ -1,8 +1,24 @@ #!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# -echo "------ escheduler start - build -------" +echo "------ dolphinscheduler start - build -------" printenv docker build --build-arg version=$version --build-arg tar_version=$tar_version -t $DOCKER_REPO:$version . 
-echo "------ escheduler end - build -------" +echo "------ dolphinscheduler end - build -------" diff --git a/dockerfile/hooks/push b/dockerfile/hooks/push index 7b98da1a8d..6146727d45 100644 --- a/dockerfile/hooks/push +++ b/dockerfile/hooks/push @@ -1,4 +1,20 @@ #!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# echo "------ push start -------" printenv diff --git a/dockerfile/startup.sh b/dockerfile/startup.sh index 8aed67f394..cc98d07e57 100644 --- a/dockerfile/startup.sh +++ b/dockerfile/startup.sh @@ -1,78 +1,72 @@ -#! /bin/bash +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +# set -e -if [ `netstat -anop|grep mysql|wc -l` -gt 0 ];then - echo "MySQL is Running." -else - MYSQL_ROOT_PWD="root@123" - ESZ_DB="escheduler" - echo "启动mysql服务" - chown -R mysql:mysql /var/lib/mysql /var/run/mysqld - find /var/lib/mysql -type f -exec touch {} \; && service mysql restart $ sleep 10 - if [ ! -f /nohup.out ];then - echo "设置mysql密码" - mysql --user=root --password=root -e "UPDATE mysql.user set authentication_string=password('$MYSQL_ROOT_PWD') where user='root'; FLUSH PRIVILEGES;" + echo "start postgresql service" + /etc/init.d/postgresql restart + echo "create user and init db" + sudo -u postgres psql <<'ENDSSH' +create user root with password 'root@123'; +create database dolphinscheduler owner root; +grant all privileges on database dolphinscheduler to root; +\q +ENDSSH + echo "import sql data" + /opt/dolphinscheduler/script/create-dolphinscheduler.sh - echo "设置mysql权限" - mysql --user=root --password=$MYSQL_ROOT_PWD -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '$MYSQL_ROOT_PWD' WITH GRANT OPTION; FLUSH PRIVILEGES;" - echo "创建escheduler数据库" - mysql --user=root --password=$MYSQL_ROOT_PWD -e "CREATE DATABASE IF NOT EXISTS \`$ESZ_DB\` CHARACTER SET utf8 COLLATE utf8_general_ci; FLUSH PRIVILEGES;" - echo "导入mysql数据" - nohup /opt/escheduler/script/create_escheduler.sh & - sleep 90 - fi - - if [ `mysql --user=root --password=$MYSQL_ROOT_PWD -s -r -e "SELECT count(TABLE_NAME) FROM information_schema.TABLES WHERE TABLE_SCHEMA='escheduler';" | grep -v count` -eq 38 ];then - echo "\`$ESZ_DB\` 表个数正确" - else - echo "\`$ESZ_DB\` 表个数不正确" - mysql --user=root --password=$MYSQL_ROOT_PWD -e "DROP DATABASE \`$ESZ_DB\`;" - echo "创建escheduler数据库" - mysql --user=root --password=$MYSQL_ROOT_PWD -e "CREATE DATABASE IF NOT EXISTS \`$ESZ_DB\` CHARACTER SET utf8 COLLATE utf8_general_ci; FLUSH PRIVILEGES;" - echo "导入mysql数据" - nohup 
/opt/escheduler/script/create_escheduler.sh & - sleep 90 - fi -fi - -/opt/zookeeper/bin/zkServer.sh restart +/opt/zookeeper/bin/zkServer.sh restart sleep 90 -echo "启动api-server" -/opt/escheduler/bin/escheduler-daemon.sh stop api-server -/opt/escheduler/bin/escheduler-daemon.sh start api-server +echo "start api-server" +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop api-server +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh start api-server -echo "启动master-server" -/opt/escheduler/bin/escheduler-daemon.sh stop master-server -python /opt/escheduler/script/del_zk_node.py 127.0.0.1 /escheduler/masters -/opt/escheduler/bin/escheduler-daemon.sh start master-server +echo "start master-server" +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop master-server +python /opt/dolphinscheduler/script/del-zk-node.py 127.0.0.1 /dolphinscheduler/masters +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh start master-server -echo "启动worker-server" -/opt/escheduler/bin/escheduler-daemon.sh stop worker-server -python /opt/escheduler/script/del_zk_node.py 127.0.0.1 /escheduler/workers -/opt/escheduler/bin/escheduler-daemon.sh start worker-server +echo "start worker-server" +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop worker-server +python /opt/dolphinscheduler/script/del-zk-node.py 127.0.0.1 /dolphinscheduler/workers +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh start worker-server -echo "启动logger-server" -/opt/escheduler/bin/escheduler-daemon.sh stop logger-server -/opt/escheduler/bin/escheduler-daemon.sh start logger-server +echo "start logger-server" +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop logger-server +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh start logger-server -echo "启动alert-server" -/opt/escheduler/bin/escheduler-daemon.sh stop alert-server -/opt/escheduler/bin/escheduler-daemon.sh start alert-server +echo "start alert-server" +/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop alert-server 
+/opt/dolphinscheduler/bin/dolphinscheduler-daemon.sh start alert-server -echo "启动nginx" +echo "start nginx" /etc/init.d/nginx stop nginx & - + while true do diff --git a/docs/en_US/1.0.1-release.md b/docs/en_US/1.0.1-release.md deleted file mode 100644 index 8613d9352e..0000000000 --- a/docs/en_US/1.0.1-release.md +++ /dev/null @@ -1,16 +0,0 @@ -Easy Scheduler Release 1.0.1 -=== -Easy Scheduler 1.0.1 is the second version in the 1.x series. The update is as follows: - -- 1,outlook TSL email support -- 2,servlet and protobuf jar conflict resolution -- 3,create a tenant and establish a Linux user at the same time -- 4,the re-run time is negative -- 5,stand-alone and cluster can be deployed with one click of install.sh -- 6,queue support interface added -- 7,escheduler.t_escheduler_queue added create_time and update_time fields - - - - - diff --git a/docs/en_US/1.0.2-release.md b/docs/en_US/1.0.2-release.md deleted file mode 100644 index 502dbf8f9b..0000000000 --- a/docs/en_US/1.0.2-release.md +++ /dev/null @@ -1,49 +0,0 @@ -Easy Scheduler Release 1.0.2 -=== -Easy Scheduler 1.0.2 is the third version in the 1.x series. This version adds scheduling open interfaces, worker grouping (the machine group for which the specified task runs), task flow and service monitoring, and support for oracle, clickhouse, etc., as follows: - -New features: -=== -- [[EasyScheduler-79](https://github.com/analysys/EasyScheduler/issues/79)] scheduling the open interface through the token mode, which can be operated through the api. -- [[EasyScheduler-138](https://github.com/analysys/EasyScheduler/issues/138)] can specify the machine (group) where the task runs. 
-- [[EasyScheduler-139](https://github.com/analysys/EasyScheduler/issues/139)] task Process Monitoring and Master, Worker, Zookeeper Operation Status Monitoring -- [[EasyScheduler-140](https://github.com/analysys/EasyScheduler/issues/140)] workflow Definition - Increase Process Timeout Alarm -- [[EasyScheduler-134](https://github.com/analysys/EasyScheduler/issues/134)] task type supports Oracle, CLICKHOUSE, SQLSERVER, IMPALA -- [[EasyScheduler-136](https://github.com/analysys/EasyScheduler/issues/136)] sql task node can independently select CC mail users -- [[EasyScheduler-141](https://github.com/analysys/EasyScheduler/issues/141)] user Management—Users can bind queues. The user queue level is higher than the tenant queue level. If the user queue is empty, look for the tenant queue. - - - -Enhanced: -=== -- [[EasyScheduler-154](https://github.com/analysys/EasyScheduler/issues/154)] Tenant code allows encoding of pure numbers or underscores - - -Repair: -=== -- [[EasyScheduler-135](https://github.com/analysys/EasyScheduler/issues/135)] Python task can specify python version - -- [[EasyScheduler-125](https://github.com/analysys/EasyScheduler/issues/125)] The mobile phone number in the user account does not recognize the opening of Unicom's latest number 166 - -- [[EasyScheduler-178](https://github.com/analysys/EasyScheduler/issues/178)] Fix subtle spelling mistakes in ProcessDao - -- [[EasyScheduler-129](https://github.com/analysys/EasyScheduler/issues/129)] Tenant code, underlined and other special characters cannot pass the check. 
- - -Thank: -=== -Last but not least, no new version was born without the contributions of the following partners: - -Baoqi , chubbyjiang , coreychen , chgxtony, cmdares , datuzi , dingchao, fanguanqun , 风清扬, gaojun416 , googlechorme, hyperknob , hujiang75277381 , huanzui , kinssun, ivivi727 ,jimmy, jiangzhx , kevin5210 , lidongdai , lshmouse , lenboo, lyf198972 , lgcareer , lzy305 , moranrr , millionfor , mazhong8808, programlief, qiaozhanwei , roy110 , swxchappy , sherlock111 , samz406 , swxchappy, qq389401879 , lzy305, vkingnew, William-GuoWei , woniulinux, yyl861, zhangxin1988, yangjiajun2014, yangqinlong, yangjiajun2014, zhzhenqin, zhangluck, zhanghaicheng1, zhuyizhizhi - -And many enthusiastic partners in the WeChat group! Thank you very much! - - - - - - - - - - diff --git a/docs/en_US/1.0.3-release.md b/docs/en_US/1.0.3-release.md deleted file mode 100644 index b87f894011..0000000000 --- a/docs/en_US/1.0.3-release.md +++ /dev/null @@ -1,30 +0,0 @@ -Easy Scheduler Release 1.0.3 -=== -Easy Scheduler 1.0.3 is the fourth version in the 1.x series. - -Enhanced: -=== -- [[EasyScheduler-482]](https://github.com/analysys/EasyScheduler/issues/482)sql task mail header added support for custom variables -- [[EasyScheduler-483]](https://github.com/analysys/EasyScheduler/issues/483)sql task failed to send mail, then this sql task is failed -- [[EasyScheduler-484]](https://github.com/analysys/EasyScheduler/issues/484)modify the replacement rule of the custom variable in the sql task, and support the replacement of multiple single quotes and double quotes. 
-- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/485)when creating a resource file, increase the verification that the resource file already exists on hdfs - -Repair: -=== -- [[EasyScheduler-198]](https://github.com/analysys/EasyScheduler/issues/198) the process definition list is sorted according to the timing status and update time -- [[EasyScheduler-419]](https://github.com/analysys/EasyScheduler/issues/419) fixes online creation of files, hdfs file is not created, but returns successfully -- [[EasyScheduler-481] ](https://github.com/analysys/EasyScheduler/issues/481)fixes the problem that the job does not exist at the same time. -- [[EasyScheduler-425]](https://github.com/analysys/EasyScheduler/issues/425) kills the kill of its child process when killing the task -- [[EasyScheduler-422]](https://github.com/analysys/EasyScheduler/issues/422) fixed an issue where the update time and size were not updated when updating resource files -- [[EasyScheduler-431]](https://github.com/analysys/EasyScheduler/issues/431) fixed an issue where deleting a tenant failed if hdfs was not started when the tenant was deleted -- [[EasyScheduler-485]](https://github.com/analysys/EasyScheduler/issues/486) the shell process exits, the yarn state is not final and waits for judgment. - -Thank: -=== -Last but not least, no new version was born without the contributions of the following partners: - -Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, -feloxx, coding-now, hymzcn, nysyxxg, chgxtony - -And many enthusiastic partners in the WeChat group! Thank you very much! 
- diff --git a/docs/en_US/1.0.4-release.md b/docs/en_US/1.0.4-release.md deleted file mode 100644 index f7b1089cc9..0000000000 --- a/docs/en_US/1.0.4-release.md +++ /dev/null @@ -1,2 +0,0 @@ -# 1.0.4 release - diff --git a/docs/en_US/1.0.5-release.md b/docs/en_US/1.0.5-release.md deleted file mode 100644 index ce945e28b1..0000000000 --- a/docs/en_US/1.0.5-release.md +++ /dev/null @@ -1,2 +0,0 @@ -# 1.0.5 release - diff --git a/docs/en_US/1.1.0-release.md b/docs/en_US/1.1.0-release.md deleted file mode 100644 index c9ebe71503..0000000000 --- a/docs/en_US/1.1.0-release.md +++ /dev/null @@ -1,55 +0,0 @@ -Easy Scheduler Release 1.1.0 -=== -Easy Scheduler 1.1.0 is the first release in the 1.1.x series. - -New features: -=== -- [[EasyScheduler-391](https://github.com/analysys/EasyScheduler/issues/391)] run a process under a specified tenant user -- [[EasyScheduler-288](https://github.com/analysys/EasyScheduler/issues/288)] feature/qiye_weixin -- [[EasyScheduler-189](https://github.com/analysys/EasyScheduler/issues/189)] security support such as Kerberos -- [[EasyScheduler-398](https://github.com/analysys/EasyScheduler/issues/398)] Administrator, with tenants (install.sh set default tenant), can create resources, projects and data sources (limited to one administrator) -- [[EasyScheduler-293](https://github.com/analysys/EasyScheduler/issues/293)] click on the parameter selected when running the process, there is no place to view, no save -- [[EasyScheduler-401](https://github.com/analysys/EasyScheduler/issues/401)] timing is easy to time every second. After the timing is completed, you can display the next trigger time on the page.
-- [[EasyScheduler-493](https://github.com/analysys/EasyScheduler/pull/493)]add datasource kerberos auth and FAQ modify and add resource upload s3 - - -Enhanced: -=== -- [[EasyScheduler-227](https://github.com/analysys/EasyScheduler/issues/227)] upgrade spring-boot to 2.1.x and spring to 5.x -- [[EasyScheduler-434](https://github.com/analysys/EasyScheduler/issues/434)] number of worker nodes zk and mysql are inconsistent -- [[EasyScheduler-435](https://github.com/analysys/EasyScheduler/issues/435)]authentication of the mailbox format -- [[EasyScheduler-441](https://github.com/analysys/EasyScheduler/issues/441)] prohibits running nodes from joining completed node detection -- [[EasyScheduler-400](https://github.com/analysys/EasyScheduler/issues/400)] Home page, queue statistics are not harmonious, command statistics have no data -- [[EasyScheduler-395](https://github.com/analysys/EasyScheduler/issues/395)] For fault-tolerant recovery processes, the status cannot be ** is running -- [[EasyScheduler-529](https://github.com/analysys/EasyScheduler/issues/529)] optimize poll task from zookeeper -- [[EasyScheduler-242](https://github.com/analysys/EasyScheduler/issues/242)]worker-server node gets task performance problem -- [[EasyScheduler-352](https://github.com/analysys/EasyScheduler/issues/352)]worker grouping, queue consumption problem -- [[EasyScheduler-461](https://github.com/analysys/EasyScheduler/issues/461)]view data source parameters, need to encrypt account password information -- [[EasyScheduler-396](https://github.com/analysys/EasyScheduler/issues/396)]Dockerfile optimization, and associated Dockerfile and github to achieve automatic mirroring -- [[EasyScheduler-389](https://github.com/analysys/EasyScheduler/issues/389)]service monitor cannot find the change of master/worker -- [[EasyScheduler-511](https://github.com/analysys/EasyScheduler/issues/511)]support recovery process from stop/kill nodes. 
-- [[EasyScheduler-399](https://github.com/analysys/EasyScheduler/issues/399)]HadoopUtils specifies user actions instead of **Deploying users - -Repair: -=== -- [[EasyScheduler-394](https://github.com/analysys/EasyScheduler/issues/394)] When the master&worker is deployed on the same machine, if the master&worker service is restarted, the previously scheduled tasks cannot be scheduled. -- [[EasyScheduler-469](https://github.com/analysys/EasyScheduler/issues/469)]Fix naming errors,monitor page -- [[EasyScheduler-392](https://github.com/analysys/EasyScheduler/issues/392)]Feature request: fix email regex check -- [[EasyScheduler-405](https://github.com/analysys/EasyScheduler/issues/405)]timed modification/addition page, start time and end time cannot be the same -- [[EasyScheduler-517](https://github.com/analysys/EasyScheduler/issues/517)]complement - subworkflow - time parameter -- [[EasyScheduler-532](https://github.com/analysys/EasyScheduler/issues/532)] python node does not execute the problem -- [[EasyScheduler-543](https://github.com/analysys/EasyScheduler/issues/543)]optimize datasource connection params safety -- [[EasyScheduler-569](https://github.com/analysys/EasyScheduler/issues/569)] timed tasks can't really stop -- [[EasyScheduler-463](https://github.com/analysys/EasyScheduler/issues/463)]mailbox verification does not support very suffixed mailboxes - - - - -Thank: -=== -Last but not least, no new version was born without the contributions of the following partners: - -Baoqi, jimmy201602, samz406, petersear, millionfor, hyperknob, fanguanqun, yangqinlong, qq389401879, chgxtony, Stanfan, lfyee, thisnew, hujiang75277381, sunnyingit, lgbo-ustc, ivivi, lzy305, JackIllkid, telltime, lipengbo2018, wuchunfu, telltime - -And many enthusiastic partners in the WeChat group! Thank you very much! 
- diff --git a/docs/en_US/EasyScheduler Proposal.md b/docs/en_US/EasyScheduler Proposal.md deleted file mode 100644 index 6bcea73540..0000000000 --- a/docs/en_US/EasyScheduler Proposal.md +++ /dev/null @@ -1,299 +0,0 @@ -# EasyScheduler Proposal - -## Abstract - -EasyScheduler is a distributed ETL scheduling engine with a powerful DAG visualization interface. EasyScheduler focuses on solving the problem of 'complex task dependencies & triggers' in data processing. As its name suggests, we are dedicated to making the scheduling system work `out of the box`. - -## Proposal - -EasyScheduler provides many easy-to-use features to improve engineering efficiency on data ETL workflow jobs. We propose the new concepts of 'instance of process' and 'instance of task' to let developers tune their jobs on the running state of a workflow instead of changing the task's template. Its main objectives are as follows: - -- Define the complex tasks' dependencies & triggers in a DAG graph by dragging and dropping. -- Support cluster HA. -- Support multi-tenant and parallel or serial backfilling data. -- Support automatic retry and recovery of failed jobs. -- Support many data task types and process priority, task priority and relative task timeout alarm. - -For now, EasyScheduler has a fairly large community in China. -It is also widely adopted by many [companies and organizations](https://github.com/analysys/EasyScheduler/issues/57) as their ETL scheduling tool. - -We believe that bringing EasyScheduler into ASF could advance development of a much stronger and more diverse open source community. - -Analysys submits this proposal to donate EasyScheduler's source code and all related documentation to the Apache Software Foundation. -The code is already under Apache License Version 2.0.
- -- Code base: https://www.github.com/analysys/easyscheduler -- English Documentations: -- Chinese Documentations: - -## Background - -We want to find a data processing tool with the following features: - -- Easy to use: developers can build an ETL process with a very simple drag and drop operation. It is not only for ETL developers; people who can't write code, such as system administrators, can also use this tool for ETL operations. -- Solves the problem of "complex task dependencies" and can monitor the ETL running status. -- Support multi-tenant. -- Support many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc. -- Support HA and linear scalability. - -For the above reasons, we realized that no existing product met our requirements, so we decided to develop this tool ourselves. We designed EasyScheduler at the end of 2017. The first internal use version was completed in May 2018. We then iterated through several internal versions and the system gradually stabilized. - -We then open sourced EasyScheduler in March 2019. It soon gained a lot of interest from ETL developers and stars on github. - -## Rationale - -Many organizations (>30) (refer to [Who is using EasyScheduler](https://github.com/analysys/EasyScheduler/issues/57)) already benefit from running EasyScheduler to make data processing pipelines easier. More than 100 [feature ideas](https://github.com/analysys/EasyScheduler/projects/1) come from the EasyScheduler community. Some 3rd-party projects also plan to integrate with EasyScheduler through task plugins, such as [Scriptis](https://github.com/WeBankFinTech/Scriptis), [waterdrop](https://github.com/InterestingLab/waterdrop). These will strengthen the features of EasyScheduler. - -## Current Status - -### Meritocracy - -EasyScheduler was incubated at Analysys in 2017 and open sourced on GitHub in March 2019.
Once open sourced, it was quickly adopted by multiple organizations. EasyScheduler has contributors and users from many companies, and we have set up a committer team. New contributors are guided and reviewed by existing committers. -Contributions are always welcomed and highly valued. - -### Community - -We have now set up development teams for EasyScheduler in Analysys, and we already have external developers who have contributed code. We already have a user group of more than 1,000 people. -We hope to grow the base of contributors by inviting all those who offer contributions through The Apache Way. -Right now, we make use of github as code hosting as well as gitter for community communication. - -### Core Developers - -The core developers, including experienced senior developers, are often guided by mentors. - -## Known Risks - -### Orphaned products - -EasyScheduler is widely adopted in China by many [companies and organizations](https://github.com/analysys/EasyScheduler/issues/57). The core developers of the EasyScheduler team plan to work full time on this project. Currently there are 10 use cases with more than 1000 activity tasks per day using EasyScheduler in the user's production environment. There is very little risk of EasyScheduler getting orphaned, as at least two large companies (xueqiu、fengjr) are widely using it in production, and developers from these companies have also joined Easy Scheduler's team of contributors. EasyScheduler has had eight major releases so far and has received 373 pull requests from contributors, which further demonstrates that EasyScheduler is a very active project. We also plan to extend and diversify this community further through Apache. - -Thus, it is very unlikely that EasyScheduler becomes orphaned.
- -### Inexperience with Open Source - -EasyScheduler's core developers have been running it as a community-oriented open source project for some time. Several of them already have experience working with open source communities and are also active in presto, alluxio and other projects. At the same time, we will gain more open source experience by following the Apache way in our incubator journey. - -### Homogeneous Developers - -The current developers work across a variety of organizations including Analysys, guandata and hydee; -some individual developers are accepted as developers of EasyScheduler as well. -Considering that fengjr and sefonsoft have shown great interest in EasyScheduler, we plan to encourage them to contribute and invite them as contributors to work together. - -### Reliance on Salaried Developers - -At present, eight of the core developers are paid by their employers to contribute to the EasyScheduler project. -We also have some other developers and researchers taking part in the project, and we will make efforts to increase the diversity of the contributors and actively lobby for domain experts in the workflow space to contribute. - -### Relationships with Other Apache Products - -EasyScheduler integrates Apache Zookeeper as one of the service registration/discovery mechanisms. EasyScheduler is deeply integrated with Apache products. It currently supports many task types like Apache Hive, Apache Spark, Apache Hadoop, and so on. - -### An Excessive Fascination with the Apache Brand - -We recognize the value and reputation that the Apache brand will bring to EasyScheduler. -However, we hope that the community provided by the Apache Software Foundation will enable the project to achieve long-term stable development, so EasyScheduler is proposing to enter incubation at Apache in order to help efforts to diversify the community, not so much to capitalize on the Apache brand.
- -## Documentation - -A complete set of EasyScheduler documentations is provided on github in both English and Simplified Chinese. - -- [English](https://github.com/analysys/easyscheduler_docs) -- [Chinese](https://github.com/analysys/easyscheduler_docs_cn) - -## Initial Source - -The project consists of three distinct codebases: core and document. The address of two existed git repositories are as follows: - -- -- -- - -## Source and Intellectual Property Submission Plan - -As soon as EasyScheduler is approved to join Apache Incubator, Analysys will provide the Software Grant Agreement(SGA) and initial committers will submit ICLA(s). The code is already licensed under the Apache Software License, version 2.0. - -## External Dependencies - -As all backend code dependencies are managed using Apache Maven, none of the external libraries need to be packaged in a source distribution. - -Most of dependencies have Apache compatible licenses,and the core dependencies are as follows: - -### Backend Dependency - -| Dependency | License | Comments | -| ------------------------------------------------------ | ------------------------------------------------------------ | ------------- | -| bonecp-0.8.0.RELEASE.jar | Apache v2.0 | | -| byte-buddy-1.9.10.jar | Apache V2.0 | | -| c3p0-0.9.1.1.jar | GNU LESSER GENERAL PUBLIC LICENSE | will remove | -| curator-*-2.12.0.jar | Apache V2.0 | | -| druid-1.1.14.jar | Apache V2.0 | | -| fastjson-1.2.29.jar | Apache V2.0 | | -| fastutil-6.5.6.jar | Apache V2.0 | | -| grpc-*-1.9.0.jar | Apache V2.0 | | -| gson-2.8.5.jar | Apache V2.0 | | -| guava-20.0.jar | Apache V2.0 | | -| guice-*3.0.jar | Apache V2.0 | | -| hadoop-*-2.7.3.jar | Apache V2.0 | | -| hbase-*-1.1.1.jar | Apache V2.0 | | -| hive-*-2.1.0.jar | Apache V2.0 | | -| instrumentation-api-0.4.3.jar | Apache V2.0 | | -| jackson-*-2.9.8.jar | Apache V2.0 | | -| jackson-jaxrs-1.8.3.jar | LGPL Version 2.1 Apache V2.0 | will remove | -| jackson-xc-1.8.3.jar | LGPL Version 2.1 Apache 
V2.0 | will remove | -| javax.activation-api-1.2.0.jar | CDDL/GPLv2+CE | will remove | -| javax.annotation-api-1.3.2.jar | CDDL + GPLv2 with classpath exception | will remove | -| javax.servlet-api-3.1.0.jar | CDDL + GPLv2 with classpath exception | will remove | -| jaxb-*.jar | (CDDL 1.1) (GPL2 w/ CPE) | will remove | -| jersey-*-1.9.jar | CDDL+GPLv2 | will remove | -| jetty-*-9.4.14.v20181114.jar | Apache V2.0,EPL 1.0 | | -| jna-4.5.2.jar | Apache V2.0,LGPL 2.1 | will remove | -| jna-platform-4.5.2.jar | Apache V2.0,LGPL 2.1 | will remove | -| jsp-api-2.x.jar | CDDL,GPL 2.0 | will remove | -| log4j-1.2.17.jar | Apache V2.0 | | -| log4j-*-2.11.2.jar | Apache V2.0 | | -| logback-x.jar | dual-license EPL 1.0,LGPL 2.1 | | -| mail-1.4.5.jar | CDDL+GPLv2 | will remove | -| mybatis-3.5.1.jar | Apache V2.0 | | -| mybatis-spring-*2.0.1.jar | Apache V2.0 | | -| mysql-connector-java-5.1.34.jar | GPL 2.0 | will remove | -| netty-*-4.1.33.Final.jar | Apache V2.0 | | -| oshi-core-3.5.0.jar | EPL 1.0 | | -| parquet-hadoop-bundle-1.8.1.jar | Apache V2.0 | | -| postgresql-42.1.4.jar | BSD 2-clause | | -| protobuf-java-*3.5.1.jar | BSD 3-clause | | -| quartz-2.2.3.jar | Apache V2.0 | | -| quartz-jobs-2.2.3.jar | Apache V2.0 | | -| slf4j-api-1.7.5.jar | MIT | | -| spring-*-5.1.5.RELEASE.jar | Apache V2.0 | | -| spring-beans-5.1.5.RELEASE.jar | Apache V2.0 | | -| spring-boot-*2.1.3.RELEASE.jar | Apache V2.0 | | -| springfox-*-2.9.2.jar | Apache V2.0 | | -| stringtemplate-3.2.1.jar | BSD | | -| swagger-annotations-1.5.20.jar | Apache V2.0 | | -| swagger-bootstrap-ui-1.9.3.jar | Apache V2.0 | | -| swagger-models-1.5.20.jar | Apache V2.0 | | -| zookeeper-3.4.8.jar | Apache | | - - - - -The front-end UI currently relies on many components, and the core dependencies are as follows: - -### UI Dependency - -| Dependency | License | Comments | -| ------------------------------------------------------- | ------------------------------------ | ----------- | -| autoprefixer | MIT | | -| 
babel-core | MIT | | -| babel-eslint | MIT | | -| babel-helper-* | MIT | | -| babel-helpers | MIT | | -| babel-loader | MIT | | -| babel-plugin-syntax-* | MIT | | -| babel-plugin-transform-* | MIT | | -| babel-preset-env | MIT | | -| babel-runtime | MIT | | -| bootstrap | MIT | | -| canvg | MIT | | -| clipboard | MIT | | -| codemirror | MIT | | -| copy-webpack-plugin | MIT | | -| cross-env | MIT | | -| css-loader | MIT | | -| cssnano | MIT | | -| cyclist | MIT | | -| d3 | BSD-3-Clause | | -| dayjs | MIT | | -| echarts | Apache V2.0 | | -| env-parse | ISC | | -| extract-text-webpack-plugin | MIT | | -| file-loader | MIT | | -| globby | MIT | | -| html-loader | MIT | | -| html-webpack-ext-plugin | MIT | | -| html-webpack-plugin | MIT | | -| html2canvas | MIT | | -| jsplumb | (MIT OR GPL-2.0) | | -| lodash | MIT | | -| node-sass | MIT | | -| optimize-css-assets-webpack-plugin | MIT | | -| postcss-loader | MIT | | -| rimraf | ISC | | -| sass-loader | MIT | | -| uglifyjs-webpack-plugin | MIT | | -| url-loader | MIT | | -| util.promisify | MIT | | -| vue | MIT | | -| vue-loader | MIT | | -| vue-style-loader | MIT | | -| vue-template-compiler | MIT | | -| vuex-router-sync | MIT | | -| watchpack | MIT | | -| webpack | MIT | | -| webpack-dev-server | MIT | | -| webpack-merge | MIT | | -| xmldom | MIT,LGPL | will remove | - - -## Required Resources - -### Git Repositories - -- -- -- - -### Issue Tracking - -The community would like to continue using GitHub Issues. 
- -### Continuous Integration tool - -Jenkins - -### Mailing Lists - -- EasyScheduler-dev: for development discussions -- EasyScheduler-private: for PPMC discussions -- EasyScheduler-notifications: for user notifications - -## Initial Committers - -- William-GuoWei(guowei20m@outlook.com) -- Lidong Dai(lidong.dai@outlook.com) -- Zhanwei Qiao(qiaozhanwei@outlook.com) -- Liang Bao(baoliang.leon@gmail.com) -- Gang Li(lgcareer2019@outlook.com) -- Zijian Gong(quanquansy@gmail.com) -- Jun Gao(gaojun2048@gmail.com) -- Baoqi Wu(wubaoqi@gmail.com) - -## Affiliations - -- Analysys Inc: William-GuoWei,Zhanwei Qiao,Liang Bao,Gang Li,Jun Gao,Lidong Dai - -- Hydee Inc: Zijian Gong - -- Guandata Inc: Baoqi Wu - - - -## Sponsors - -### Champion - -- Sheng Wu ( Apache Incubator PMC, [wusheng@apache.org](mailto:wusheng@apache.org)) - -### Mentors - -- Sheng Wu ( Apache Incubator PMC, [wusheng@apache.org](mailto:wusheng@apache.org)) - -- ShaoFeng Shi ( Apache Incubator PMC, [shaofengshi@apache.org](mailto:shaofengshi@apache.org)) - -- Liang Chen ( Apache Software Foundation Member, [chenliang613@apache.org](mailto:chenliang613@apache.org)) - - - -### Sponsoring Entity - -We expect the Apache Incubator to sponsor this project. diff --git a/docs/en_US/EasyScheduler-FAQ.md b/docs/en_US/EasyScheduler-FAQ.md deleted file mode 100644 index b55b0e2413..0000000000 --- a/docs/en_US/EasyScheduler-FAQ.md +++ /dev/null @@ -1,284 +0,0 @@ -## Q: EasyScheduler service introduction and recommended running memory - -A: EasyScheduler consists of 5 services (MasterServer, WorkerServer, ApiServer, AlertServer, LoggerServer) and a UI. - -| Service | Description | -| ------------------------- | ------------------------------------------------------------ | -| MasterServer | Mainly responsible for DAG segmentation and task status monitoring | -| WorkerServer/LoggerServer | Mainly responsible for the submission, execution and update of task status.
LoggerServer is used for Rest Api to view logs through RPC | -| ApiServer | Provides the Rest Api service for the UI to call | -| AlertServer | Provide alarm service | -| UI | Front page display | - -Note:**Due to the large number of services, it is recommended that the single-machine deployment is preferably 4 cores and 16G or more.** - ---- - -## Q: Why can't an administrator create a project? - -A: The administrator is currently "**pure management**". There is no tenant, that is, there is no corresponding user on linux, so there is no execution permission, **so there is no project, resource and data source,** so there is no permission to create. **But there are all viewing permissions**. If you need to create a business operation such as a project, **use the administrator to create a tenant and a normal user, and then use the normal user login to operate**. We will release the administrator's creation and execution permissions in version 1.1.0, and the administrator will have all permissions. - ---- - -## Q: Which mailboxes does the system support? - -A: Support most mailboxes, qq, 163, 126, 139, outlook, aliyun, etc. are supported. Support TLS and SSL protocols, optionally configured in alert.properties - ---- - -## Q: What are the common system variable time parameters and how do I use them? - -A: Please refer to 'System parameter' in the system-manual - ---- - -## Q: pip install kazoo This installation gives an error. Is it necessary to install? - -A: This is the python connection zookeeper needs to use, must be installed - ---- - -## Q: How to specify the machine running task - -A: Use **the administrator** to create a Worker group, **specify the Worker group** when the **process definition starts**, or **specify the Worker group on the task node**. 
If not specified, use Default, **Default is to select one of all the workers in the cluster to use for task submission and execution.** - ---- - -## Q: Priority of the task - -A: We also support **the priority of processes and tasks**. Priority We have five levels of **HIGHEST, HIGH, MEDIUM, LOW and LOWEST**. **You can set the priority between different process instances, or you can set the priority of different task instances in the same process instance.** For details, please refer to the task priority design in the architecture-design. - ----- - -## Q: Escheduler-grpc gives an error - -A: Execute in the root directory: mvn -U clean package assembly:assembly -Dmaven.test.skip=true , then refresh the entire project - ----- - -## Q: Does EasyScheduler support running on windows? - -A: In theory, **only the Worker needs to run on Linux**. Other services can run normally on Windows. But it is still recommended to deploy on Linux. - ------ - -## Q: UI compiles node-sass prompt in linux: Error: EACCESS: permission denied, mkdir xxxx - -A: Install **npm install node-sass --unsafe-perm** separately, then **npm install** - ---- - -## Q: UI cannot log in normally. - -A: 1, if it is node startup, check whether the .env API_BASE configuration under escheduler-ui is the Api Server service address. - - 2, If it is nginx booted and installed via **install-escheduler-ui.sh**, check if the proxy_pass configuration in **/etc/nginx/conf.d/escheduler.conf** is the Api Server service. address - -  3, if the above configuration is correct, then please check if the Api Server service is normal, curl http://192.168.xx.xx:12345/escheduler/users/get-user-info, check the Api Server log, if Prompt cn.escheduler.api.interceptor.LoginHandlerInterceptor:[76] - session info is null, which proves that the Api Server service is normal. 
- - 4, if there is no problem above, check whether the **server.context-path and server.port configuration** in **application.properties** is correct - ---- - -## Q: After the process definition is manually started or scheduled, no process instance is generated. - -A: 1, first **check whether the MasterServer service exists through jps**, or directly check whether there is a master service in zk from the service monitoring. - - 2, if there is a master service, check **the command status statistics** or whether new records have been added to **t_escheduler_error_command**. If records have been added, **please check the message field.** - ---- - -## Q : The task status is always in the successful submission status. - -A: 1, **first check whether the WorkerServer service exists through jps**, or directly check whether there is a worker service in zk from the service monitoring. - - 2, if the **WorkerServer** service is normal, you need to **check whether the MasterServer puts the task in the zk queue. Check the MasterServer log and the zk queue to see whether the task is blocked.** - - 3, if there is no problem above, check whether a Worker group is specified but **the machines in that worker group are not online**. - ---- - -## Q: Is there a Docker image and a Dockerfile? - -A: Yes, a Docker image and a Dockerfile are provided.
- -Docker image address: https://hub.docker.com/r/escheduler/escheduler_images - -Dockerfile address: https://github.com/qiaozhanwei/escheduler_dockerfile/tree/master/docker_escheduler - ------- - -## Q : Need to pay attention to the problem in install.sh - -A: 1, if the replacement variable contains special characters, **use the \ transfer character to transfer** - -​ 2, installPath="/data1_1T/escheduler", **this directory can not be the same as the install.sh directory currently installed with one click.** - -​ 3, deployUser = "escheduler", **the deployment user must have sudo privileges**, because the worker is executed by sudo -u tenant sh xxx.command - -​ 4, monitorServerState = "false", whether the service monitoring script is started, the default is not to start the service monitoring script. **If the service monitoring script is started, the master and worker services are monitored every 5 minutes, and if the machine is down, it will automatically restart.** - -​ 5, hdfsStartupSate="false", whether to enable HDFS resource upload function. The default is not enabled. **If it is not enabled, the resource center cannot be used.** If enabled, you need to configure the configuration of fs.defaultFS and yarn in conf/common/hadoop/hadoop.properties. If you use namenode HA, you need to copy core-site.xml and hdfs-site.xml to the conf root directory. - -​ Note: **The 1.0.x version does not automatically create the hdfs root directory, you need to create it yourself, and you need to deploy the user with hdfs operation permission.** - ---- - -## Q : Process definition and process instance offline exception - -A : For **versions prior to 1.0.4**, modify the code under the escheduler-api cn.escheduler.api.quartz package. 
- -``` -public boolean deleteJob(String jobName, String jobGroupName) { - lock.writeLock().lock(); - try { - JobKey jobKey = new JobKey(jobName,jobGroupName); - if(scheduler.checkExists(jobKey)){ - logger.info("try to delete job, job name: {}, job group name: {},", jobName, jobGroupName); - return scheduler.deleteJob(jobKey); - }else { - return true; - } - - } catch (SchedulerException e) { - logger.error(String.format("delete job : %s failed",jobName), e); - } finally { - lock.writeLock().unlock(); - } - return false; - } -``` - ---- - -## Q: Can the tenant created before the HDFS startup use the resource center normally? - -A: No. Because the tenant created by HDFS is not started, the tenant directory will not be registered in HDFS. So the last resource will report an error. - -## Q: In the multi-master and multi-worker state, the service is lost, how to be fault-tolerant - -A: **Note:** **Master monitors Master and Worker services.** - -​ 1,If the Master service is lost, other Masters will take over the process of the hanged Master and continue to monitor the Worker task status. - -​ 2,If the Worker service is lost, the Master will monitor that the Worker service is gone. If there is a Yarn task, the Kill Yarn task will be retried. - -Please see the fault-tolerant design in the architecture for details. - ---- - -## Q : Fault tolerance for a machine distributed by Master and Worker - -A: The 1.0.3 version only implements the fault tolerance of the Master startup process, and does not take the Worker Fault Tolerance. That is to say, if the Worker hangs, no Master exists. There will be problems with this process. We will add Master and Worker startup fault tolerance in version **1.1.0** to fix this problem. If you want to manually modify this problem, you need to **modify the running task for the running worker task that is running the process across the restart and has been dropped. The running process is set to the failed state across the restart**. 
Then resume the process from the failed node. - ---- - -## Q : Timing is easy to set to execute every second - -A : Note when setting the timing. If the first digit (* * * * * ? *) is set to *, it means execution every second. **We will add a list of recently scheduled times in version 1.1.0.** You can see the last 5 running times online at http://cron.qqe2.com/ - - - -## Q: Is there a valid time range for timing? - -A: Yes, **if the timing start and end time is the same time, then this timing will be invalid timing. If the end time of the start and end time is smaller than the current time, it is very likely that the timing will be automatically deleted.** - - - -## Q : There are several implementations of task dependencies - -A: 1, the task dependency between **DAG**, is **from the zero degree** of the DAG segmentation - -​ 2, there are **task dependent nodes**, you can achieve cross-process tasks or process dependencies, please refer to the (DEPENDENT) node design in the system-manual. - -​ Note: **Cross-project processes or task dependencies are not supported** - -## Q: There are several ways to start the process definition. - -A: 1, in **the process definition list**, click the **Start** button. - -​ 2, **the process definition list adds a timer**, scheduling start process definition. - -​ 3, process definition **view or edit** the DAG page, any **task node right click** Start process definition. - -​ 4, you can define DAG editing for the process, set the running flag of some tasks to **prohibit running**, when the process definition is started, the connection of the node will be removed from the DAG. - - - -## Q : Python task setting Python version - -A: 1,**for the version after 1.0.3** only need to modify PYTHON_HOME in conf/env/.escheduler_env.sh - -``` -export PYTHON_HOME=/bin/python -``` - -Note: This is **PYTHON_HOME** , which is the absolute path of the python command, not the simple PYTHON_HOME. 
Also note that when exporting the PATH, you need to directly - -``` -export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH -``` - -​ 2,For versions prior to 1.0.3, the Python task only supports the Python version of the system. It does not support specifying the Python version. - -## Q:Worker Task will generate a child process through sudo -u tenant sh xxx.command, will kill when kill - -A: We will add the kill task in 1.0.4 and kill all the various child processes generated by the task. - - - -## Q : How to use the queue in EasyScheduler, what does the user queue and tenant queue mean? - -A : The queue in the EasyScheduler can be configured on the user or the tenant. **The priority of the queue specified by the user is higher than the priority of the tenant queue.** For example, to specify a queue for an MR task, the queue is specified by mapreduce.job.queuename. - -Note: When using the above method to specify the queue, the MR uses the following methods: - -``` - Configuration conf = new Configuration(); - GenericOptionsParser optionParser = new GenericOptionsParser(conf, args); - String[] remainingArgs = optionParser.getRemainingArgs(); -``` - - - -If it is a Spark task --queue mode specifies the queue - - - -## Q : Master or Worker reports the following alarm - -

- -

- - - -A : Set **master.reserved.memory** in master.properties under conf to a smaller value, such as 0.1, or set **worker.reserved.memory** in worker.properties to a smaller value, such as 0.1 - -## Q: The Hive version is 1.1.0+cdh5.15.0, and the SQL Hive task reports a connection error. - -

- -

- - -A : In the pom, change the hive dependency from - -``` -<dependency> - <groupId>org.apache.hive</groupId> - <artifactId>hive-jdbc</artifactId> - <version>2.1.0</version> -</dependency> -``` - -to - -``` -<dependency> - <groupId>org.apache.hive</groupId> - <artifactId>hive-jdbc</artifactId> - <version>1.1.0</version> -</dependency> -``` - diff --git a/docs/en_US/README.md b/docs/en_US/README.md deleted file mode 100644 index 05380d0212..0000000000 --- a/docs/en_US/README.md +++ /dev/null @@ -1,96 +0,0 @@ -Easy Scheduler -============ -[![License](https://img.shields.io/badge/license-Apache%202-4EB1BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) -[![Total Lines](https://tokei.rs/b1/github/analysys/EasyScheduler?category=lines)](https://github.com/analysys/EasyScheduler) - -> Easy Scheduler for Big Data - - -[![Stargazers over time](https://starchart.cc/analysys/EasyScheduler.svg)](https://starchart.cc/analysys/EasyScheduler) - -[![EN doc](https://img.shields.io/badge/document-English-blue.svg)](README.md) -[![CN doc](https://img.shields.io/badge/文档-中文版-blue.svg)](README_zh_CN.md) - - -### Design features: - -A distributed and easy-to-expand visual DAG workflow scheduling system. Dedicated to solving the complex dependencies in data processing, making the scheduling system `out of the box` for data processing. -Its main objectives are as follows: - - - Associate the Tasks according to the dependencies of the tasks in a DAG graph, which can visualize the running state of task in real time. - - Support for many task types: Shell, MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Sub_Process, Procedure, etc. - - Support process scheduling, dependency scheduling, manual scheduling, manual pause/stop/recovery, support for failed retry/alarm, recovery from specified nodes, Kill task, etc. - - Support process priority, task priority and task failover and task timeout alarm/failure - - Support process global parameters and node custom parameter settings - - Support online upload/download of resource files, management, etc. Support online file creation and editing - - Support task log online viewing and scrolling, online download log, etc. 
- - Implement cluster HA, decentralize Master cluster and Worker cluster through Zookeeper - - Support online viewing of `Master/Worker` cpu load, memory - - Support process running history tree/gantt chart display, support task status statistics, process status statistics - - Support backfilling data - - Support multi-tenant - - Support internationalization - - There are more waiting partners to explore - - -### What's in Easy Scheduler - - Stability | Easy to use | Features | Scalability | - -- | -- | -- | -- -Decentralized multi-master and multi-worker | Visualization process defines key information such as task status, task type, retry times, task running machine, visual variables and so on at a glance.  |  Support pause, recover operation | support custom task types -HA is supported by itself | All process definition operations are visualized, dragging tasks to draw DAGs, configuring data sources and resources. At the same time, for third-party systems, the api mode operation is provided. | Users on easyscheduler can achieve many-to-one or one-to-one mapping relationship through tenants and Hadoop users, which is very important for scheduling large data jobs. | The scheduler uses distributed scheduling, and the overall scheduling capability will increase linearly with the scale of the cluster. Master and Worker support dynamic online and offline. -Overload processing: Task queue mechanism, the number of schedulable tasks on a single machine can be flexibly configured, when too many tasks will be cached in the task queue, will not cause machine jam. 
| One-click deployment | Supports traditional shell tasks, and also support big data platform task scheduling: MR, Spark, SQL (mysql, postgresql, hive, sparksql), Python, Procedure, Sub_Process | | - - - - -### System partial screenshot - -![image](https://user-images.githubusercontent.com/48329107/61368744-1f5f3b00-a8c1-11e9-9cf1-10f8557a6b3b.png) - -![image](https://user-images.githubusercontent.com/48329107/61368966-9dbbdd00-a8c1-11e9-8dcc-a9469d33583e.png) - -![image](https://user-images.githubusercontent.com/48329107/61372146-f347b800-a8c8-11e9-8882-66e8934ada23.png) - - -### Document - --
Backend deployment documentation - -- Front-end deployment documentation - -- [**User manual**](https://analysys.github.io/easyscheduler_docs/system-manual.html?_blank "User manual") - -- [**Upgrade document**](https://analysys.github.io/easyscheduler_docs/upgrade.html?_blank "Upgrade document") - -- Online Demo - -More documentation please refer to [EasyScheduler online documentation] - -### Recent R&D plan -Work plan of Easy Scheduler: [R&D plan](https://github.com/analysys/EasyScheduler/projects/1), where `In Develop` card is the features of 1.1.0 version , TODO card is to be done (including feature ideas) - -### How to contribute code - -Welcome to participate in contributing code, please refer to the process of submitting the code: -[[How to contribute code](https://github.com/analysys/EasyScheduler/issues/310)] - -### Thanks - -Easy Scheduler uses a lot of excellent open source projects, such as google guava, guice, grpc, netty, ali bonecp, quartz, and many open source projects of apache, etc. -It is because of the shoulders of these open source projects that the birth of the Easy Scheduler is possible. We are very grateful for all the open source software used! We also hope that we will not only be the beneficiaries of open source, but also be open source contributors, so we decided to contribute to easy scheduling and promised long-term updates. We also hope that partners who have the same passion and conviction for open source will join in and contribute to open source! - -### Get Help -The fastest way to get response from our developers is to submit issues, or add our wechat : 510570367 - -### License -Please refer to [LICENSE](https://github.com/analysys/EasyScheduler/blob/dev/LICENSE) file. 
- - - - - - - - - diff --git a/docs/en_US/SUMMARY.md b/docs/en_US/SUMMARY.md deleted file mode 100644 index 397a4a110c..0000000000 --- a/docs/en_US/SUMMARY.md +++ /dev/null @@ -1,50 +0,0 @@ -# Summary - -* [Instruction](README.md) - -* Frontend Deployment - * [Preparations](frontend-deployment.md#Preparations) - * [Deployment](frontend-deployment.md#Deployment) - * [FAQ](frontend-deployment.md#FAQ) - -* Backend Deployment - * [Preparations](backend-deployment.md#Preparations) - * [Deployment](backend-deployment.md#Deployment) - -* [Quick Start](quick-start.md#Quick Start) - -* System Use Manual - * [Operational Guidelines](system-manual.md#Operational Guidelines) - * [Security](system-manual.md#Security) - * [Monitor center](system-manual.md#Monitor center) - * [Task Node Type and Parameter Setting](system-manual.md#Task Node Type and Parameter Setting) - * [System parameter](system-manual.md#System parameter) - -* [Architecture Design](architecture-design.md) - -* Front-end development - * [Development environment](frontend-development.md#Development environment) - * [Project directory structure](frontend-development.md#Project directory structure) - * [System function module](frontend-development.md#System function module) - * [Routing and state management](frontend-development.md#Routing and state management) - * [specification](frontend-development.md#specification) - * [interface](frontend-development.md#interface) - * [Extended development](frontend-development.md#Extended development) - -* Backend development documentation - * [Environmental requirements](backend-development.md#Environmental requirements) - * [Project compilation](backend-development.md#Project compilation) -* [Interface documentation](http://52.82.13.76:8888/escheduler/doc.html?language=en_US&lang=en) -* FAQ - * [FAQ](EasyScheduler-FAQ.md) -* EasyScheduler upgrade documentation - * [upgrade documentation](upgrade.md) -* History release notes - * [1.1.0 release](1.1.0-release.md) - * [1.0.5 
release](1.0.5-release.md) - * [1.0.4 release](1.0.4-release.md) - * [1.0.3 release](1.0.3-release.md) - * [1.0.2 release](1.0.2-release.md) - * [1.0.1 release](1.0.1-release.md) - * [1.0.0 release] - diff --git a/docs/en_US/architecture-design.md b/docs/en_US/architecture-design.md deleted file mode 100644 index 0587993d05..0000000000 --- a/docs/en_US/architecture-design.md +++ /dev/null @@ -1,316 +0,0 @@ -## Architecture Design -Before explaining the architecture of the scheduling system, let us first go over the common terms used in it. - -### 1.Noun Interpretation - -**DAG:** Short for Directed Acyclic Graph. Tasks in the workflow are assembled into a directed acyclic graph, which is traversed topologically starting from the nodes with zero in-degree until there are no successor nodes. For example, see the following picture: - -


- dag example -
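
The traversal described in the DAG definition can be sketched with Kahn's algorithm. This is only an illustration under stated assumptions: the class name `DagTraversalSketch`, the method `topologicalOrder`, and the adjacency-map representation are hypothetical and are not taken from the EasyScheduler code base.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: topological traversal of a DAG starting from zero in-degree nodes. */
class DagTraversalSketch {

    /**
     * @param successors maps each node name to the names of its direct successors
     * @return the node names in one valid topological order
     */
    static List<String> topologicalOrder(Map<String, List<String>> successors) {
        // Count incoming edges for every node (TreeMap keeps the start order deterministic).
        Map<String, Integer> inDegree = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : successors.entrySet()) {
            inDegree.putIfAbsent(entry.getKey(), 0);
            for (String next : entry.getValue()) {
                inDegree.merge(next, 1, Integer::sum);
            }
        }

        // Start from the nodes that have no predecessor.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            // Finishing a node releases each successor whose last pending predecessor it was.
            for (String next : successors.getOrDefault(node, Collections.emptyList())) {
                if (inDegree.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }

        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle, so it is not a DAG");
        }
        return order;
    }
}
```

A scheduler would typically submit every node in the ready set in parallel instead of collecting an ordered list; the list form is used here only to keep the sketch small.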

-

- -**Process definition**: A visual **DAG** formed by dragging task nodes and establishing associations between them - -**Process instance**: A process instance is an instantiation of a process definition, which can be generated by manual startup or by scheduling. Each time the process definition runs, a new process instance is generated - -**Task instance**: A task instance is the instantiation of a specific task node when a process instance runs, which indicates the specific task execution status - -**Task type**: Currently supports SHELL, SQL, SUB_PROCESS (sub-process), PROCEDURE, MR, SPARK, PYTHON, and DEPENDENT (dependency), with plans to support dynamic plug-in extension. Note: a **SUB_PROCESS** is also a separate process definition that can be launched on its own - -**Schedule mode** : The system supports timed scheduling based on cron expressions and manual scheduling. Supported command types: start workflow, start execution from current node, resume fault-tolerant workflow, resume paused process, start execution from failed node, complement, timer, rerun, pause, stop, resume waiting thread. The two command types **resume fault-tolerant workflow** and **resume waiting thread** are used by internal scheduling control and cannot be called externally - -**Timed schedule**: The system uses the **quartz** distributed scheduler and supports visual generation of cron expressions - -**Dependency**: The system supports not only simple **DAG** dependencies between predecessor and successor nodes, but also provides **task dependency** nodes, which support **custom task dependencies between processes** - -**Priority**: Supports the priority of process instances and task instances. If the process instance and task instance priority are not set, the default is first in, first out. 
- -**Mail Alert**: Supports emailing **SQL task** query results, email alerts for process instance run results, and fault-tolerance alert notifications - -**Failure policy**: For tasks running in parallel, if a task fails, two failure policies are provided. **Continue** means that the remaining parallel tasks keep running regardless of their state until the process ends in failure. **End** means that once a failed task is found, the running parallel tasks are killed and the process ends in failure. - -**Complement**: Complement (backfill) historical data. Two complement methods over an interval are supported: **parallel and serial** - - - -### 2.System architecture - -#### 2.1 System Architecture Diagram -


- System Architecture Diagram -

-

- - - -#### 2.2 Architectural description - -* **MasterServer** - - MasterServer adopts a distributed, non-central design concept. MasterServer is mainly responsible for DAG task splitting, task submission monitoring, and monitoring the health status of other MasterServers and WorkerServers. - When the MasterServer service starts, it registers an ephemeral node with ZooKeeper and listens for state changes of ZooKeeper ephemeral nodes to perform fault tolerance processing. - - - - ##### The service mainly contains: - - - **Distributed Quartz** is the distributed scheduling component, mainly responsible for starting and stopping scheduled tasks. When Quartz picks up a task, a thread pool inside the Master handles the subsequent operations of the task. - - - **MasterSchedulerThread** is a scan thread that periodically scans the **command** table in the database and performs different business operations based on the **command type** - - - **MasterExecThread** is mainly responsible for DAG task splitting, task submission monitoring, and the logic processing of the various command types - - - **MasterTaskExecThread** is mainly responsible for task persistence - - - -* **WorkerServer** - - - WorkerServer also adopts a distributed, non-central design concept. WorkerServer is mainly responsible for task execution and providing log services. When the WorkerServer service starts, it registers an ephemeral node with ZooKeeper and maintains a heartbeat. - - ##### This service contains: - - - **FetchTaskThread** is mainly responsible for continuously fetching tasks from the **Task Queue** and calling the executor of **TaskScheduleThread** that corresponds to the task type. - - **LoggerServer** is an RPC service that provides functions such as viewing log fragments, refreshing, and downloading. - - - **ZooKeeper** - - The MasterServer and WorkerServer nodes in the system all use the ZooKeeper service for cluster management and fault tolerance. 
In addition, the system also performs event monitoring and distributed locking based on ZooKeeper. - We also implemented queues based on Redis, but we want EasyScheduler to rely on as few components as possible, so we finally removed the Redis implementation. - - - **Task Queue** - - Provides task queue operations. Currently, the queue is also implemented based on ZooKeeper. Since little information is stored in the queue, there is no need to worry about too much data in the queue. In fact, we have stress-tested the queue with millions of entries, with no effect on system stability or performance. - - - **Alert** - - Provides alarm-related interfaces. The interfaces mainly cover storing, querying, and notifying for the two types of alarm data. The notification function has two types: **mail notification** and **SNMP (not yet implemented)**. - - - **API** - - The API interface layer is mainly responsible for processing requests from the front-end UI layer. The service exposes a RESTful API to serve external requests. - Interfaces include workflow creation, definition, query, modification, release, offline, manual start, stop, pause, resume, start execution from this node, and more. - - - **UI** - - The front-end page of the system provides the various visual operation interfaces of the system. For details, see the **[System User Manual] (System User Manual.md)** section. - - - -#### 2.3 Architectural Design Ideas - -##### I. Decentralization vs centralization - -###### Centralization Thought - -The centralized design concept is relatively simple. The nodes in the distributed cluster are divided into two roles: - -

- master-slave role -

- -- The Master is mainly responsible for distributing tasks and supervising the health of the Slaves. It can dynamically balance tasks across the Slaves so that no Slave node ends up overloaded or idle. -- The Slave is mainly responsible for executing tasks and maintaining a heartbeat with the Master so that the Master can assign tasks to it. - -Problems with the centralized design: - -- Once the Master has a problem, the group has no leader and the entire cluster will crash. To solve this problem, most Master/Slave architectures adopt a design with a primary and a backup Master, which can be hot standby or cold standby, with automatic or manual switching, and more and more new systems are able to elect and switch the Master automatically to improve availability. -- Another problem is that if the Scheduler is on the Master, although different tasks in one DAG can run on different machines, the Master becomes overloaded. If the Scheduler is on the Slave, all tasks in a DAG can only be submitted on one machine, and if there are many parallel tasks, the pressure on that Slave may be high. - -###### Decentralization - -

-

- -- In a decentralized design there is usually no Master/Slave concept: all roles are the same and all nodes have equal status. The global Internet is a typical decentralized distributed system, where the failure of any networked node affects only a small range of features. -- The core of decentralized design is that the distributed system has no "manager" that differs from the other nodes, so there is no single point of failure. However, since there is no "manager" node, each node needs to communicate with other nodes to get the necessary machine information, and the unreliability of communication in a distributed system greatly increases the difficulty of implementing the above functions. -- In fact, truly decentralized distributed systems are rare. Instead, dynamically centralized distributed systems keep emerging. Under this architecture, the managers in the cluster are elected dynamically rather than preset, and when the cluster fails, its nodes spontaneously hold a "meeting" to elect a new "manager" to preside over the work. The most typical cases are ZooKeeper and Etcd, which is implemented in Go. - -- Decentralization in EasyScheduler means that Masters and Workers register with ZooKeeper. Neither the Master cluster nor the Worker cluster has a center, and a ZooKeeper distributed lock is used to elect one Master or Worker as the "manager" to perform the task. - -##### II. Distributed lock practice - -EasyScheduler uses ZooKeeper distributed locks to ensure that only one Master executes the Scheduler at a time, or that only one Worker performs task submission at a time. - -1. The core process algorithm for obtaining distributed locks is as follows - -

- Get Distributed Lock Process -

- -2. Scheduler thread distributed lock implementation flow chart in EasyScheduler: - -

- Get Distributed Lock Process -
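
The acquire, work, release pattern behind "only one Master executes the Scheduler at a time" can be sketched as follows. This is a minimal illustration, not the EasyScheduler implementation: an in-process `ReentrantLock` stands in for the ZooKeeper distributed lock, threads stand in for Master nodes, and the names `SchedulerLockSketch`, `scheduleOnce`, and `run` are hypothetical. A real deployment would need a ZooKeeper-backed lock instead (for instance Apache Curator's `InterProcessMutex`, which is an assumption here, not something the text above states).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: several "masters" compete for one lock; only the holder schedules. */
class SchedulerLockSketch {
    // Stand-in for the ZooKeeper distributed lock (assumption: in-process lock for illustration).
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger holders = new AtomicInteger();     // masters currently inside the critical section
    private final AtomicInteger maxHolders = new AtomicInteger();  // highest number of simultaneous holders observed
    private final AtomicInteger handled = new AtomicInteger();     // scheduling rounds completed

    /** One scheduling round of one master. */
    void scheduleOnce() {
        lock.lock();                       // acquire the lock before scheduling
        try {
            int now = holders.incrementAndGet();
            maxHolders.accumulateAndGet(now, Math::max);
            handled.incrementAndGet();     // placeholder for "fetch one command and start it"
            holders.decrementAndGet();
        } finally {
            lock.unlock();                 // always release, even if scheduling fails
        }
    }

    /** Starts the given number of masters, each doing the given number of rounds. */
    int run(int masters, int rounds) {
        Thread[] threads = new Thread[masters];
        for (int i = 0; i < masters; i++) {
            threads[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    scheduleOnce();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxHolders.get();           // number of masters that were ever inside at once
    }

    int handledCount() {
        return handled.get();
    }
}
```

Releasing in `finally` matters in the distributed case as well: a Master that fails while holding the lock must not block the others from scheduling.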

- -##### III. Circular waiting caused by insufficient threads - -- If a DAG contains no sub-processes and the number of Commands exceeds the threshold set for the thread pool, the process simply waits or fails. -- If a large DAG nests a large number of sub-processes, the situation in the following figure leads to a "dead" state: - -

- Thread is not enough to wait for loop -

- -In the above figure, MainFlowThread waits for SubFlowThread1 to end, SubFlowThread1 waits for SubFlowThread2 to end, SubFlowThread2 waits for SubFlowThread3 to end, and SubFlowThread3 waits for a new thread in the thread pool. The entire DAG process therefore cannot end, so none of the threads can be released. This is the circular wait between parent and child processes. At this point the scheduling cluster is no longer available unless a new Master is started to add threads and break the deadlock. - -Starting a new Master to break the deadlock is unsatisfactory, so we proposed the following three options to reduce this risk: - -1. Calculate the sum of the threads of all Masters, and then calculate the number of threads required for each DAG, that is, pre-calculate before the DAG process is executed. Because the thread pool spans multiple Masters, the total number of threads is unlikely to be obtained in real time. -2. Check the thread pool of the single Master. If the thread pool is full, let the thread fail directly. -3. Add a Command type for insufficient resources. If the thread pool is insufficient, the main process is suspended. Once the thread pool has a free thread again, the process that was suspended for insufficient resources can be woken up and executed. - -Note: The Master Scheduler thread fetches Commands in FIFO order. - -So we chose the third way to solve the problem of insufficient threads. - -##### IV. Fault Tolerant Design - -Fault tolerance is divided into service fault tolerance and task retry. Service fault tolerance is divided into two types: Master fault tolerance and Worker fault tolerance. - -###### 1. Downtime fault tolerance - -Service fault tolerance design relies on ZooKeeper's Watcher mechanism. The implementation principle is as follows: - -

- EasyScheduler Fault Tolerant Design -

- -The Master monitors the directories of other Masters and Workers. If a remove event is detected, process instance fault tolerance or task instance fault tolerance is performed according to the specific business logic. - - - -- Master fault tolerance flow chart: - -

- Master Fault Tolerance Flowchart -

- -After ZooKeeper Master fault tolerance completes, the process is rescheduled by the Scheduler thread in EasyScheduler. It traverses the DAG to find the "Running" and "Submit Successful" tasks. For "Running" tasks, it monitors the status of their task instances. For "Submit Successful" tasks, it checks whether the task already exists in the Task Queue: if it exists, it likewise monitors the status of the task instance; if it does not exist, it resubmits the task instance. - - - -- Worker fault tolerance flow chart: - -

- Worker Fault Tolerance Flowchart -

- -Once the Master Scheduler thread finds a task instance marked as "need to be fault tolerant", it takes over the task and resubmits it. - - Note: Because "network jitter" may cause a node to lose its heartbeat with ZooKeeper for a short time, a remove event for that node may be triggered. In this case we take the simplest approach: once a node's connection to ZooKeeper times out, the Master or Worker service on it is stopped directly. - -###### 2. Task failure retry - -Here we must first distinguish between the concepts of task failure retry, process failure recovery, and process failure rerun: - -- Task failure retry is task level and is performed automatically by the scheduling system. For example, if a shell task sets the number of retries to 3, the shell task will be retried up to 3 times after it fails. -- Process failure recovery is process level and is done manually. Recovery can only be performed **from the failed node** or **from the current node**. -- Process failure rerun is also process level and is done manually. A rerun starts from the start node. - - - -Returning to the topic, we divide the task nodes in a workflow into two types. - -- One is a business node, which corresponds to an actual script or processing statement, such as a Shell node, an MR node, a Spark node, a dependent node, and so on. -- The other is a logical node, which does not run an actual script or statement but handles the logic of the overall process flow, such as a sub-process node. - -Each **business node** can configure the number of failed retries. When the task node fails, it is retried automatically until it succeeds or exceeds the configured number of retries. A **logical node** does not support failed retry, but the tasks inside the logical node do. - -If a task in the workflow fails and reaches the maximum number of retries, the workflow fails and stops, and the failed workflow can be manually rerun or recovered. 
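
The task-level retry rule above (run once, then retry until success or until the configured retry count is used up) can be sketched as a small loop. The class `TaskRetrySketch` and the method `runWithRetry` are hypothetical names for illustration, and a task is modeled simply as a function that reports success or failure; the scheduler's actual retry code is not shown in this document.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of task-level failure retry. */
class TaskRetrySketch {

    /**
     * Runs the task once and, if it fails, retries it up to maxRetries more times.
     *
     * @param task       returns true when the task succeeds
     * @param maxRetries configured number of retries after the first failure
     * @return the attempt number that succeeded, or -1 if every attempt failed
     */
    static int runWithRetry(BooleanSupplier task, int maxRetries) {
        int maxAttempts = 1 + maxRetries;  // the first run plus the configured retries
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (task.getAsBoolean()) {
                return attempt;            // success: stop retrying
            }
        }
        return -1;                         // retries exhausted: the task stays failed
    }
}
```

With `maxRetries` set to 3, a task that keeps failing is executed four times in total: the original run plus three retries, which matches the shell task example above.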
- - - -##### V. Task priority design - -In the early scheduling design, there was no priority design and scheduling was fair, so a task submitted first could finish at the same time as a task submitted later, and the priority of a process or task could not be set. We have redesigned this, and the current design is as follows: - -- Tasks are processed from high to low in this order: **priority across different process instances** takes precedence over **priority within the same process instance**, which takes precedence over **task priority within the same process**, which takes precedence over **submission order within the same process**. - - - The specific implementation parses the priority from the json of the task instance and then saves the **process instance priority_process instance id_task priority_task id** information in the ZooKeeper task queue. When tasks are taken from the task queue, a string comparison identifies the task that needs to be executed first. - - - The priority of the process definition exists because some processes need to be handled before others. It can be configured when the process is started or when a scheduled start is set. There are 5 levels, in order: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST. As shown below - -

- Process Priority Configuration -

 - - The priority of the task is also divided into 5 levels, in order: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST. As shown below - -

- task priority configuration -
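
The queue key and string comparison described in this section can be sketched as below. Several details are assumptions made for illustration only: the class name `TaskPriorityKeySketch`, the mapping of HIGHEST..LOWEST to 0..4, and the zero-padding of ids, which is added here so that plain string order agrees with numeric order (without padding, "10" would sort before "9"). The document only states that the four fields are joined and compared as strings.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of the priority key stored in the task queue. */
class TaskPriorityKeySketch {

    /** Five levels; a lower ordinal means a higher priority (assumed mapping). */
    enum Priority { HIGHEST, HIGH, MEDIUM, LOW, LOWEST }

    /** Builds "processInstancePriority_processInstanceId_taskPriority_taskId". */
    static String key(Priority processPriority, int processInstanceId,
                      Priority taskPriority, int taskId) {
        // Ids are zero-padded to a fixed width (assumption) so string order matches numeric order.
        return String.format(Locale.ROOT, "%d_%010d_%d_%010d",
                processPriority.ordinal(), processInstanceId,
                taskPriority.ordinal(), taskId);
    }

    /** The task to execute first is the smallest key under plain string comparison. */
    static String next(List<String> queue) {
        return Collections.min(queue);
    }
}
```

Because the process instance priority is the leading field, a HIGH process always sorts ahead of a MEDIUM one regardless of task priority; within the same process instance the task priority decides, and the task id breaks the remaining ties.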

- -##### VI. Logback and gRPC implement log access - -- Since the Web (UI) and Worker are not necessarily on the same machine, viewing a log is not as simple as reading a local file. There are two options: - - Put the logs in the ES search engine - - Obtain remote log information through gRPC communication -- To keep EasyScheduler as lightweight as possible, gRPC was chosen for remote access to log information. - -

- grpc remote access -

- -- We use a custom Logback FileAppender and Filter to generate a separate log file for each task instance. -- The main implementation of FileAppender is as follows: - -```java -import ch.qos.logback.classic.spi.ILoggingEvent; -import ch.qos.logback.core.FileAppender; - - /** - * task log appender - */ - public class TaskLogAppender extends FileAppender<ILoggingEvent> { - - ... - - @Override - protected void append(ILoggingEvent event) { - - if (currentlyActiveFile == null){ - currentlyActiveFile = getFile(); - } - String activeFile = currentlyActiveFile; - // thread name: taskThreadName-processDefineId_processInstanceId_taskInstanceId - String threadName = event.getThreadName(); - String[] threadNameArr = threadName.split("-"); - // logId = processDefineId_processInstanceId_taskInstanceId - String logId = threadNameArr[1]; - ... - super.subAppend(event); - } -} -``` - -The log is generated in the form of /process definition id/process instance id/task instance id.log - -- Filter matches thread names starting with TaskLogInfo: -- TaskLogFilter is implemented as follows: - -```java -import ch.qos.logback.classic.spi.ILoggingEvent; -import ch.qos.logback.core.filter.Filter; -import ch.qos.logback.core.spi.FilterReply; - - /** - * task log filter - */ -public class TaskLogFilter extends Filter<ILoggingEvent> { - - @Override - public FilterReply decide(ILoggingEvent event) { - // accept only events logged by task threads, whose names start with TaskLogInfo- - if (event.getThreadName().startsWith("TaskLogInfo-")){ - return FilterReply.ACCEPT; - } - return FilterReply.DENY; - } -} -``` - - - -### Summary - -Starting from scheduling, this article introduces the architecture principles and implementation ideas of EasyScheduler, a distributed workflow scheduling system for big data. 
To be continued diff --git a/docs/en_US/backend-deployment.md b/docs/en_US/backend-deployment.md deleted file mode 100644 index 934a005f6b..0000000000 --- a/docs/en_US/backend-deployment.md +++ /dev/null @@ -1,207 +0,0 @@ -# Backend Deployment Document - -There are two deployment modes for the backend: - -- automatic deployment -- source code compile and then deployment - -## Preparations - -Download the latest version of the installation package, download address: [gitee download](https://gitee.com/easyscheduler/EasyScheduler/attach_files/) or [github download](https://github.com/analysys/EasyScheduler/releases), download escheduler-backend-x.x.x.tar.gz(back-end referred to as escheduler-backend),escheduler-ui-x.x.x.tar.gz(front-end referred to as escheduler-ui) - - - -#### Preparations 1: Installation of basic software (self-installation of required items) - - * [Mysql](http://geek.analysys.cn/topic/124) (5.5+) : Mandatory - * [JDK](https://www.oracle.com/technetwork/java/javase/downloads/index.html) (1.8+) : Mandatory - * [ZooKeeper](https://www.jianshu.com/p/de90172ea680)(3.4.6+) :Mandatory - * [Hadoop](https://blog.csdn.net/Evankaka/article/details/51612437)(2.6+) :Optionally, if you need to use the resource upload function, MapReduce task submission needs to configure Hadoop (uploaded resource files are currently stored on Hdfs) - * [Hive](https://staroon.pro/2017/12/09/HiveInstall/)(1.2.1) : Optional, hive task submission needs to be installed - * Spark(1.x,2.x) : Optional, Spark task submission needs to be installed - * PostgreSQL(8.2.15+) : Optional, PostgreSQL PostgreSQL stored procedures need to be installed - -``` - Note: Easy Scheduler itself does not rely on Hadoop, Hive, Spark, PostgreSQL, but only calls their Client to run the corresponding tasks. 
-``` - -#### Preparations 2: Create deployment users - -- Create the deployment user on every machine that takes part in scheduling. Because the worker service executes jobs with `sudo -u {linux-user}`, the deployment user needs sudo privileges without a password prompt. - -``` -vi /etc/sudoers - -# For example, the deployment user is an escheduler account -escheduler ALL=(ALL) NOPASSWD: ALL - -# And you need to comment out the Defaults requiretty line -#Defaults requiretty -``` - -#### Preparations 3: Passwordless SSH Configuration -Configure passwordless SSH login between the deployment machine and the other installation machines. If you also want to install easyscheduler on the deployment machine itself, configure passwordless login to the local machine as well. - -- [Connect the host and other machines SSH](http://geek.analysys.cn/topic/113) - -#### Preparations 4: database initialization - -* Create databases and accounts - - Execute the following commands to create the database and account - - ``` - CREATE DATABASE escheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; - GRANT ALL PRIVILEGES ON escheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}'; - GRANT ALL PRIVILEGES ON escheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}'; - flush privileges; - ``` - -* Create tables and import basic data - Modify the following attributes in ./conf/dao/data_source.properties - - ``` - spring.datasource.url - spring.datasource.username - spring.datasource.password - ``` - - Execute the script that creates the tables and imports the basic data - - ``` - sh ./script/create-escheduler.sh - ``` - -#### Preparations 5: Modify the deployment directory permissions and operation parameters - - Description of the escheduler-backend directory - -```directory -bin : Basic service startup script -conf : Project configuration files -lib : Jar packages the project depends on, including individual module jars and third-party jars -script : Cluster Start, Stop and Service Monitor Start and Stop scripts -sql : The 
project relies on SQL files -install.sh : One-click deployment script -``` - -- Modify permissions (please modify the 'deployUser' to the corresponding deployment user) so that the deployment user has operational privileges on the escheduler-backend directory - - `sudo chown -R deployUser:deployUser escheduler-backend` - -- Modify the `.escheduler_env.sh` environment variable in the conf/env/directory - -- Modify deployment parameters (depending on your server and business situation): - - - Modify the parameters in **install.sh** to replace the values required by your business - - MonitorServerState switch variable, added in version 1.0.3, controls whether to start the self-start script (monitor master, worker status, if off-line will start automatically). The default value of "false" means that the self-start script is not started, and if it needs to start, it is changed to "true". - - 'hdfsStartupSate' switch variable controls whether to start hdfs - The default value of "false" means not to start hdfs - Change the variable to 'true' if you want to use hdfs, you also need to create the hdfs root path by yourself, that 'hdfsPath' in install.sh. - - - If you use hdfs-related functions, you need to copy**hdfs-site.xml** and **core-site.xml** to the conf directory - - -## Deployment -Automated deployment is recommended, and experienced partners can use source deployment as well. 
- -### Automated Deployment - -- Install zookeeper tools - - `pip install kazoo` - -- Switch to deployment user, one-click deployment - - `sh install.sh` - -- Use the `jps` command to check if the services are started (`jps` comes from `Java JDK`) - -```aidl - MasterServer ----- Master Service - WorkerServer ----- Worker Service - LoggerServer ----- Logger Service - ApiApplicationServer ----- API Service - AlertServer ----- Alert Service -``` - -If all services are normal, the automatic deployment is successful - - -After successful deployment, the log can be viewed and stored in a specified folder. - -```logPath - logs/ - ├── escheduler-alert-server.log - ├── escheduler-master-server.log - |—— escheduler-worker-server.log - |—— escheduler-api-server.log - |—— escheduler-logger-server.log -``` - -### Compile source code to deploy - -After downloading the release version of the source package, unzip it into the root directory - -* Execute the compilation command: - -``` - mvn -U clean package assembly:assembly -Dmaven.test.skip=true -``` - -* View directory - -After normal compilation, ./target/escheduler-{version}/ is generated in the current directory - - -### Start-and-stop services commonly used in systems (for service purposes, please refer to System Architecture Design for details) - -* stop all services in the cluster - - ` sh ./bin/stop-all.sh` - -* start all services in the cluster - - ` sh ./bin/start-all.sh` - -* start and stop one master server - -```master -sh ./bin/escheduler-daemon.sh start master-server -sh ./bin/escheduler-daemon.sh stop master-server -``` - -* start and stop one worker server - -```worker -sh ./bin/escheduler-daemon.sh start worker-server -sh ./bin/escheduler-daemon.sh stop worker-server -``` - -* start and stop api server - -```Api -sh ./bin/escheduler-daemon.sh start api-server -sh ./bin/escheduler-daemon.sh stop api-server -``` -* start and stop logger server - -```Logger -sh ./bin/escheduler-daemon.sh start logger-server -sh 
./bin/escheduler-daemon.sh stop logger-server -``` -* start and stop alert server - -```Alert -sh ./bin/escheduler-daemon.sh start alert-server -sh ./bin/escheduler-daemon.sh stop alert-server -``` - -## Database Upgrade -Database upgrade is a function added in version 1.0.2. The database can be upgraded automatically by executing the following command: - -```upgrade -sh ./script/upgrade-escheduler.sh -``` - - diff --git a/docs/en_US/backend-development.md b/docs/en_US/backend-development.md deleted file mode 100644 index 10f7ba47f6..0000000000 --- a/docs/en_US/backend-development.md +++ /dev/null @@ -1,48 +0,0 @@ -# Backend development documentation - -## Environmental requirements - - * [Mysql](http://geek.analysys.cn/topic/124) (5.5+) : Must be installed - * [JDK](https://www.oracle.com/technetwork/java/javase/downloads/index.html) (1.8+) : Must be installed - * [ZooKeeper](https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper)(3.4.6+) :Must be installed - * [Maven](http://maven.apache.org/download.cgi)(3.3+) :Must be installed - -Because the escheduler-rpc module in EasyScheduler uses Grpc, you need to use Maven to compile the generated classes. -For those who are not familiar with maven, please refer to: [maven in five minutes](http://maven.apache.org/guides/getting-started/maven-in-five-minutes.html)(3.3+) - -http://maven.apache.org/install.html - -## Project compilation -After importing the EasyScheduler source code into the development tools such as Idea, first convert to the Maven project (right click and select "Add Framework Support") - -* Execute the compile command: - -``` - mvn -U clean package assembly:assembly -Dmaven.test.skip=true -``` - -* View directory - -After normal compilation, it will generate ./target/escheduler-{version}/ in the current directory. 
- -``` - bin - conf - lib - script - sql - install.sh -``` - -- Description - -``` -bin : basic service startup script -conf : project configuration file -lib : the project depends on the jar package, including the various module jars and third-party jars -script : cluster start, stop, and service monitoring start and stop scripts -sql : project depends on sql file -install.sh : one-click deployment script -``` - - diff --git a/docs/en_US/book.json b/docs/en_US/book.json deleted file mode 100644 index c05811289d..0000000000 --- a/docs/en_US/book.json +++ /dev/null @@ -1,23 +0,0 @@ -{ - "title": "EasyScheduler", - "author": "", - "description": "Scheduler", - "language": "en-US", - "gitbook": "3.2.3", - "styles": { - "website": "./styles/website.css" - }, - "structure": { - "readme": "README.md" - }, - "plugins":[ - "expandable-chapters", - "insert-logo-link" - ], - "pluginsConfig": { - "insert-logo-link": { - "src": "http://geek.analysys.cn/static/upload/236/2019-03-29/379450b4-7919-4707-877c-4d33300377d4.png", - "url": "https://github.com/analysys/EasyScheduler" - } - } -} \ No newline at end of file diff --git a/docs/en_US/frontend-deployment.md b/docs/en_US/frontend-deployment.md deleted file mode 100644 index 919caf1485..0000000000 --- a/docs/en_US/frontend-deployment.md +++ /dev/null @@ -1,115 +0,0 @@ -# frontend-deployment - -The front-end has three deployment modes: automated deployment, manual deployment and compiled source deployment. 
- - - -## Preparations - -#### Download the installation package - -Please download the latest version of the installation package, download address: [gitee](https://gitee.com/easyscheduler/EasyScheduler/attach_files/) - -After downloading escheduler-ui-x.x.x.tar.gz,decompress`tar -zxvf escheduler-ui-x.x.x.tar.gz ./`and enter the`escheduler-ui`directory - - - - -## Deployment - -Automated deployment is recommended for either of the following two ways - -### Automated Deployment - -Edit the installation file`vi install-escheduler-ui.sh` in the` escheduler-ui` directory - -Change the front-end access port and the back-end proxy interface address - -``` -# Configure the front-end access port -esc_proxy="8888" - -# Configure proxy back-end interface -esc_proxy_port="http://192.168.xx.xx:12345" -``` - ->Front-end automatic deployment based on Linux system `yum` operation, before deployment, please install and update`yum` - -under this directory, execute`./install-escheduler-ui.sh` - - -### Manual Deployment - -Install epel source `yum install epel-release -y` - -Install Nginx `yum install nginx -y` - - -> #### Nginx configuration file address - -``` -/etc/nginx/conf.d/default.conf -``` - -> #### Configuration information (self-modifying) - -``` -server { - listen 8888;# access port - server_name localhost; - #charset koi8-r; - #access_log /var/log/nginx/host.access.log main; - location / { - root /xx/dist; # the dist directory address decompressed by the front end above (self-modifying) - index index.html index.html; - } - location /escheduler { - proxy_pass http://192.168.xx.xx:12345; # interface address (self-modifying) - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - proxy_set_header x_real_ipP $remote_addr; - proxy_set_header remote_addr $remote_addr; - proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; - proxy_http_version 1.1; - proxy_connect_timeout 4s; - proxy_read_timeout 30s; - proxy_send_timeout 12s; - proxy_set_header 
Upgrade $http_upgrade; - proxy_set_header Connection "upgrade"; - } - #error_page 404 /404.html; - # redirect server error pages to the static page /50x.html - # - error_page 500 502 503 504 /50x.html; - location = /50x.html { - root /usr/share/nginx/html; - } -} -``` - -> #### Restart the Nginx service - -``` -systemctl restart nginx -``` - -#### nginx command - -- enable `systemctl enable nginx` - -- restart `systemctl restart nginx` - -- status `systemctl status nginx` - - -## FAQ -#### Upload file size limit - -Edit the configuration file `vi /etc/nginx/nginx.conf` - -``` -# change upload size -client_max_body_size 1024m -``` - - diff --git a/docs/en_US/frontend-development.md b/docs/en_US/frontend-development.md deleted file mode 100644 index 286c598dbc..0000000000 --- a/docs/en_US/frontend-development.md +++ /dev/null @@ -1,650 +0,0 @@ -# Front-end development documentation - -### Technical selection -``` -Vue mvvm framework - -Es6 ECMAScript 6.0 - -Ans-ui Analysys-ui - -D3 Visual Library Chart Library - -Jsplumb connection plugin library - -Lodash high performance JavaScript utility library -``` - - -### Development environment - -- #### Node installation -Node package download (note version 8.9.4) `https://nodejs.org/download/release/v8.9.4/` - - -- #### Front-end project construction -Use the command line mode `cd` enter the `escheduler-ui` project directory and execute `npm install` to pull the project dependency package. - -> If `npm install` is very slow - -> You can enter the Taobao image command line to enter `npm install -g cnpm --registry=https://registry.npm.taobao.org` - -> Run `cnpm install` - - -- Create a new `.env` file or the interface that interacts with the backend - -Create a new` .env` file in the `escheduler-ui `directory, add the ip address and port of the backend service to the file, and use it to interact with the backend. 
The contents of the `.env` file are as follows: -``` -# Proxy interface address (modified by yourself) -API_BASE = http://192.168.xx.xx:12345 - -# If you need to access the project with ip, you can remove the "#" (example) -#DEV_HOST = 192.168.xx.xx -``` - -> ##### ! ! ! Special attention here. If the project reports a "node-sass error" while pulling the dependency packages, run the following command to install node-sass separately, then retry. -``` -# install the node-sass dependency separately -npm install node-sass --unsafe-perm -``` - -- #### Development environment operation -- `npm start` starts the project development environment (address after startup: http://localhost:8888/#/) - - -#### Front-end project release - -- `npm run build` packages the project (after packaging, a folder called dist is created in the root directory, to be published online through Nginx) - -Run the `npm run build` command to generate the package folder (dist) - -Copy it to the corresponding directory of the server (front-end service static page storage directory) - -Visit address `http://localhost:8888/#/` - - -#### Start with node and daemon under Linux - -Install pm2 `npm install -g pm2` - -Execute `pm2 start npm -- run dev` in the `escheduler-ui` project root directory to start the project - -#### command - -- Start `pm2 start npm -- run dev` - -- Stop `pm2 stop npm` - -- delete `pm2 delete npm` - -- Status `pm2 list` - -``` - -[root@localhost escheduler-ui]# pm2 start npm -- run dev -[PM2] Applying action restartProcessId on app [npm](ids: 0) -[PM2] [npm](0) ✓ -[PM2] Process successfully started -┌──────────┬────┬─────────┬──────┬──────┬────────┬─────────┬────────┬─────┬──────────┬──────┬──────────┐ -│ App name │ id │ version │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ user │ watching │ -├──────────┼────┼─────────┼──────┼──────┼────────┼─────────┼────────┼─────┼──────────┼──────┼──────────┤ -│ npm │ 0 │ N/A │ fork │ 6168 │ online │ 31 │ 0s │ 0% │ 5.6 MB │ root │ disabled │ 
-└──────────┴────┴─────────┴──────┴──────┴────────┴─────────┴────────┴─────┴──────────┴──────┴──────────┘ - Use `pm2 show ` to get more details about an app - -``` - - -### Project directory structure - -`build` some webpack configurations for packaging and development environment projects - -`node_modules` development environment node dependency package - -`src` project required documents - -`src => combo` project third-party resource localization `npm run combo` specific view `build/combo.js` - -`src => font` Font icon library can be added by visiting https://www.iconfont.cn Note: The font library uses its own secondary development to reintroduce its own library `src/sass/common/_font.scss` - -`src => images` public image storage - -`src => js` js/vue - -`src => lib` internal components of the company (company component library can be deleted after open source) - -`src => sass` sass file One page corresponds to a sass file - -`src => view` page file One page corresponds to an html file - -``` -> Projects are developed using vue single page application (SPA) -- All page entry files are in the `src/js/conf/${ corresponding page filename => home} index.js` entry file -- The corresponding sass file is in `src/sass/conf/${corresponding page filename => home}/index.scss` -- The corresponding html file is in `src/view/${corresponding page filename => home}/index.html` -``` - -Public module and utill `src/js/module` - -`components` => internal project common components - -`download` => download component - -`echarts` => chart component - -`filter` => filter and vue pipeline - -`i18n` => internationalization - -`io` => io request encapsulation based on axios - -`mixin` => vue mixin public part for disabled operation - -`permissions` => permission operation - -`util` => tool - - -### System function module - -Home => `http://localhost:8888/#/home` - -Project Management => `http://localhost:8888/#/projects/list` -``` -| Project Home -| Workflow - - Workflow definition - - 
Workflow instance - - Task instance -``` - -Resource Management => `http://localhost:8888/#/resource/file` -``` -| File Management -| udf Management - - Resource Management - - Function management - - - -``` - -Data Source Management => `http://localhost:8888/#/datasource/list` - -Security Center => `http://localhost:8888/#/security/tenant` -``` -| Tenant Management -| User Management -| Alarm Group Management - - master - - worker -``` - -User Center => `http://localhost:8888/#/user/account` - - -## Routing and state management - -The project `src/js/conf/home` is divided into - -`pages` => route to page directory -``` - The page file corresponding to the routing address -``` - -`router` => route management -``` -vue router, the entry file index.js in each page will be registered. Specific operations: https://router.vuejs.org/zh/ -``` - -`store` => status management -``` -The page corresponding to each route has a state management file divided into: - -actions => mapActions => Details:https://vuex.vuejs.org/zh/guide/actions.html - -getters => mapGetters => Details:https://vuex.vuejs.org/zh/guide/getters.html - -index => entrance -mutations => mapMutations => Details:https://vuex.vuejs.org/zh/guide/mutations.html - -state => mapState => Details:https://vuex.vuejs.org/zh/guide/state.html - -Specific action:https://vuex.vuejs.org/zh/ - -``` - - -## specification -## Vue specification -##### 1.Component name -The component is named multiple words and is connected with a wire (-) to avoid conflicts with HTML tags and a clearer structure. -``` -// positive example -export default { - name: 'page-article-item' -} -``` - -##### 2.Component files -The internal common component of the `src/js/module/components` project writes the folder name with the same name as the file name. The subcomponents and util tools that are split inside the common component are placed in the internal `_source` folder of the component. 
-``` -└── components - ├── header - ├── header.vue - └── _source - └── nav.vue - └── util.js - ├── conditions - ├── conditions.vue - └── _source - └── search.vue - └── util.js -``` - -##### 3.Prop -When you define Prop, you should always name it in camel format (camelCase) and use the connection line (-) when assigning values to the parent component.This follows the characteristics of each language, because it is case-insensitive in HTML tags, and the use of links is more friendly; in JavaScript, the more natural is the hump name. - -``` -// Vue -props: { - articleStatus: Boolean -} -// HTML - -``` - -The definition of Prop should specify its type, defaults, and validation as much as possible. - -Example: - -``` -props: { - attrM: Number, - attrA: { - type: String, - required: true - }, - attrZ: { - type: Object, - // The default value of the array/object should be returned by a factory function - default: function () { - return { - msg: 'achieve you and me' - } - } - }, - attrE: { - type: String, - validator: function (v) { - return !(['success', 'fail'].indexOf(v) === -1) - } - } -} -``` - -##### 4.v-for -When performing v-for traversal, you should always bring a key value to make rendering more efficient when updating the DOM. -``` -
<!-- illustrative names: list, item.id --> -<ul> -  <li v-for="item in list" :key="item.id"> -    {{ item.title }} -  </li> -</ul>
 -``` - -v-for should not be used on the same element as v-if (`for example: <li>`), because v-for has a higher priority than v-if. To avoid wasted computation and rendering, put the v-if on the container's parent element instead. -``` -
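<!-- illustrative names: showList, list, item.id --> -<ul v-if="showList"> -  <li v-for="item in list" :key="item.id"> -    {{ item.title }} -  </li> -</ul>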
    -``` - -##### 5.v-if / v-else-if / v-else -If the elements in the same set of v-if logic control are logically identical, Vue reuses the same part for more efficient element switching, `such as: value`. In order to avoid the unreasonable effect of multiplexing, you should add key to the same element for identification. -``` -
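<!-- illustrative names: hasData and the two key values --> -<div v-if="hasData" key="has-data"> -  {{ mazeyData }} -</div> -<div v-else key="no-data"> -  no data -</div>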
 -``` - -##### 6.Instruction abbreviation -To keep the code consistent, always use the directive shorthands (`:` for `v-bind`, `@` for `v-on`). The full `v-bind` and `v-on` forms are not wrong; this is purely a matter of convention. -``` - -``` - -##### 7.Top-level element order of single file components -Styles are bundled into a single file, so styles defined in one vue file also apply to identical selectors in other files. For this reason, give every component a top-level class name before creating it. -Note: The sass plugin has been added to the project, so Sass syntax can be written directly in a single vue file. -For uniformity and ease of reading, they should be placed in the order of `