236 Commits

Author SHA1 Message Date
caishunfeng
9d9ae9ad54
[improvement] support self-dependent (#13818) 2023-03-29 18:18:03 +08:00
Wenjun Ruan
68660ec96b
Refactor remote command (#13809)
* Refactor remote command

* Rename Command to Message
2023-03-29 17:54:57 +08:00
Wenjun Ruan
d1b6e6f02c
Take over task instance in master failover (#13798) 2023-03-28 16:02:57 +08:00
Wenjun Ruan
d91bdeff37
Fix retry task instance will loss varpool (#13791) 2023-03-25 12:26:59 +08:00
Wenjun Ruan
1f365819a6
Ignore unknown VM options in start.sh (#13719) 2023-03-10 17:34:09 +08:00
Eric Gao
394805b2c7
[Feature][Metrics] Tag workflow related metrics with process definition code (workflow id) (#13640)
* Tag workflow related metrics with process definition code (workflow id)

* Clean up related metrics when deleting workflow definition

* Add license headers

* Update related UT cases

* Add an example in grafana-demo

* Add related docs
2023-03-09 11:30:21 +08:00
Wenjun Ruan
c9066e8de9
Use MDC to filter task instance log (#13673)
* Use MDC to collect task instance log
* Use MDCAutoClosableContext to remove the MDC key
2023-03-06 17:44:23 +08:00
fuchanghai
7bf3e3cdd6
[improve-#13201] update pid during running (#13206) 2023-02-22 14:22:49 +08:00
Aaron Wang
047fa2f65e
[Feature-13511] Submit Spark task directly on Kubernetes (#13550) 2023-02-21 23:26:21 +08:00
fuchanghai
701d67c831
[improve-#13045] after a submit failure, stop the processInstance to avoid an endless loop (#13051)
* [improve-#13045] add max submit number of workflow

* [fix-13405] remove max times

* Update dolphinscheduler-master/src/main/resources/application.yaml

Co-authored-by: Wenjun Ruan <wenjun@apache.org>

---------

Co-authored-by: fuchanghai <‘2875334588@qq.com’>
Co-authored-by: Wenjun Ruan <wenjun@apache.org>
2023-02-20 14:10:59 +08:00
Wenjun Ruan
e8b20def54
Fix task instance will generate multiple times when retry interval is 0/s (#13571) 2023-02-18 23:41:16 +08:00
Rick Cheng
2bd65fb2df
[Feature][Remote Logging] Add support for writing task logs to OSS (#13332) 2023-02-13 16:38:26 +08:00
seedscoder
8d12dc0702
[Improvement-13491] Use lombok @Slf4j annotation to generate logger (#13509) 2023-02-07 20:32:53 +08:00
Wenjun Ruan
12dd60fa46
Fix task wake up failed will block the event (#13466) 2023-02-01 15:31:04 +08:00
Eric Gao
385d781ebc
Fix minor spelling and punctuation errors (#13452) 2023-01-29 16:51:03 +08:00
Wenjun Ruan
ce34e21960
Task instance failure when worker group doesn't exist (#13448) 2023-01-29 16:27:56 +08:00
qianli2022
8be32d4145
[Feature][Api] When use api to run a process we want get processInstanceId (#13184)
* add sql

* add mapper

* add dao

* add excutor


Co-authored-by: qianl4 <qianl4@cicso.com>
2023-01-18 17:58:32 +08:00
hokie-chan
3b980cb06a
[fix][worker][bug] master/worker crash when registry recover from SUSPENDED to RECONNECTED (#13328) 2023-01-03 19:24:11 +08:00
Aaron Wang
ccad56e88e
[Improvement][Master] Validate same content of input file when using task cache (#13298)
* support file content checksum

* fix inject null storageOperate bug
2023-01-03 11:38:13 +08:00
Wenjun Ruan
8a479927f3
Add projectCode in t_ds_process_instance and t_ds_task_instance to remove join (#13284) 2023-01-03 09:52:28 +08:00
Wenjun Ruan
52134277a3
Fix task group cannot release when kill task (#13314) 2023-01-03 09:52:03 +08:00
JieguangZhou
2e95a020ab
fix dag.getPreviousNodes miss upstream node (#13255) 2022-12-22 15:51:40 +08:00
Wenjun Ruan
14ec4a2398
Remove dao module in worker (#13242) 2022-12-22 12:25:29 +08:00
John Bampton
5fe25c995f
Fix spelling (#13237) 2022-12-21 16:22:49 +08:00
ZhongJinHacker
d13cd55281
fix spell error and move comment to correct describe location (#13233) 2022-12-21 16:22:03 +08:00
JieguangZhou
66e20271ad
[Feature][Master] Add task caching mechanism to improve the running speed of repetitive tasks (#13194)
* Supports task instance cache operation

* add task plugin cache

* use SHA-256 to generate key

* Update dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_mysql.sql

Co-authored-by: Jay Chung <zhongjiajie955@gmail.com>

* Update dolphinscheduler-dao/src/main/resources/sql/dolphinscheduler_postgresql.sql

Co-authored-by: Jay Chung <zhongjiajie955@gmail.com>

* Optimizing database Scripts

* Optimize clear cache operation

Co-authored-by: Jay Chung <zhongjiajie955@gmail.com>
2022-12-18 18:17:09 +08:00
sssqhai
7a0a2c2a46
Solve the deadlock problem caused by queuing (#13191)
* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing

* Solve the deadlock problem caused by queuing,move the event to the tail by throwing a exception

Co-authored-by: wfs <wangfushun@cdqcp.cpm>
2022-12-16 19:55:02 +08:00
Wenjun Ruan
be81b222d4
Optimize event loop (#13193) 2022-12-15 14:34:52 +08:00
JieguangZhou
e4b9b67255
Allow execute task in workflow instance (#13103) 2022-12-13 16:43:44 +08:00
Wenjun Ruan
70ccffeee2
Format task parameter as pretty json (#13173) 2022-12-13 16:30:21 +08:00
Yann Ann
6ef74073cc
[Improve-13001]migrate commons-collections -> commons-collections4 (#13002) 2022-12-10 23:50:19 +08:00
JieguangZhou
a7ecc5a8b3
fix retry task failure (#13077) 2022-12-05 21:22:51 +08:00
Wenjun Ruan
169168ef34
Add plugin-all module (#13079) 2022-12-02 23:19:08 +08:00
Kevin.Shin
12a6138d33
fix issue 13035 (#13065)
Co-authored-by: shenk-b <shenk-b@glodon.com>
2022-12-01 13:58:50 +08:00
Wenjun Ruan
ffc9fb280a
Add gc timestampt (#13059) 2022-12-01 10:00:27 +08:00
Wenjun Ruan
1a8811cb41
Set max loop times when consume StateEvent to avoid dead loop influence the thread. (#13007) 2022-11-27 15:34:26 +08:00
Yann Ann
3106054ea7
[Improvement-12907] Change heartbeat log level to debug (#12980) 2022-11-25 17:37:30 +08:00
Kerwin
50779ea1e6
[Bug-12963] [Master] Fix dependent task node null pointer exception (#12965)
* Fix that there are both manual and scheduled workflow instances in dependent nodes, and one of them will report a null pointer exception during execution.
2022-11-24 19:00:46 +08:00
rickchengx
38b876733c
[Feature-10498] Mask the password in the log of sqoop task (#11589) 2022-11-24 14:54:54 +08:00
fuchanghai
3747029cc0
[fix-#12932] when subprocess's processInstance is fail,not notify parent processInstance (#12933) 2022-11-22 15:07:50 +08:00
John Bampton
27c37b8828
Fix grammar and spelling (#12937) 2022-11-18 23:03:34 +08:00
Kerwin
c916c60853
fix NPE while retry task (#12903) 2022-11-16 10:31:52 +08:00
Wenjun Ruan
d99ba29b66
Fix master cluster may loop command unbalanced (#12891)
(cherry picked from commit 3b2b86661be76b7c1404a910c865d78b7936313d)
2022-11-16 10:20:22 +08:00
JieguangZhou
229c554912
[feature][task] Add Kubeflow task plugin for MLOps scenario (#12843) 2022-11-11 16:08:38 +08:00
ZhenjiLiu
7cdb926a5f
[Improvement][Batch Query] Batch query ProcessDefinitions belongs to need failover ProcessInstance. (#12506) 2022-11-03 09:15:19 +08:00
Wenjun Ruan
9e0c9af1a5
Fix the waiting strategy cannot recovery if the serverstate is already in running (#12651) 2022-11-02 14:06:01 +08:00
Aaron Wang
08335b1032
[Improvement][Task] Improved way to collect yarn job's appIds (#12197)
* Provide aop way as an optional way to collect yarn job's applicationId, and import new module `dolphinscheduler-aop` to place the aop code.
* Add user property `appId.collect` for user to decide how to collect applicationId.
* Add new environment configuration for each type of yarn tasks to support aop in `dolphinscheduler_env.sh`
* Update docs to declare how to use aop way.
* Update `LogUtils` to support fetch applicationId in different ways based on the user property.

Co-authored-by: gabrywu <gabrywu@apache.com>
2022-10-31 16:52:53 +08:00
Wenjun Ruan
e6da1ccf81
Add worker-group-refresh-interval in master config (#12601)
* Add worker-group-refresh-interval in master config

* Set interval cannot smaller than 10s

* Update dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/config/MasterConfig.java

Co-authored-by: kezhenxu94 <kezhenxu94@apache.org>
2022-10-31 09:37:26 +08:00
kezhenxu94
065d5caccc
Only expose necessary actuator endpoints (#12571) 2022-10-28 07:40:32 +08:00
HanayoZz
489e7fe4e2
[Feature-10495][Resource Center] Resource Center Refactor (#12076)
* resource center refactor - S3 services connection

Co-authored-by: caishunfeng <caishunfeng2021@gmail.com>
2022-10-26 13:53:44 +08:00