gitee_code_template
88af8f0d1d
add baichuan readme
2023-10-07 23:20:26 +08:00
gitee_code_template
4a04dc7355
add baichuan readme
2023-10-07 22:54:15 +08:00
gitee_code_template
d1d42a7fc9
add baichuan readme
2023-10-07 22:43:16 +08:00
gitee_code_template
b2159ae221
add readme baichuan
2023-10-07 21:20:00 +08:00
gitee_code_template
1ad58c3e2d
add readme baichuan
2023-10-07 20:48:05 +08:00
gitee_code_template
062a639d1d
add baichuan readme
2023-10-07 19:37:43 +08:00
gitee_code_template
6c8a4f3dc4
modify baichuan readme
2023-09-28 19:59:36 +08:00
gitee_code_template
36ce0d2c37
add readme for baichuan
2023-09-28 18:07:38 +08:00
i-robot
20e090b93d
!91 change llama65B batch size
...
Merge pull request !91 from Jializheng/master
2023-09-28 09:21:19 +00:00
gitee_code_template
6eb65d6dc9
baichuan13B model
2023-09-28 10:30:20 +08:00
jializheng
b1de673886
change llama65B batch size
2023-09-28 09:35:30 +08:00
fengliangjun
7a21f0bf58
up
2023-09-26 14:30:47 +08:00
machangjun
8e436e3a9a
add ffts mode
...
del torch_trans
del torch_trans and resolve bloom ckpt and add bloom ffts+
replace fused_adam with adam
del unused code
2023-07-25 14:14:28 +08:00
Mrtutu
4532812837
update bloom README: bloom7b training on oscar-1G, single node with 8 cards
2023-07-26 14:10:51 +08:00
fengliangjun
260e8eea8f
create megatron core
2023-07-24 15:00:57 +08:00
chenzomi
92c27d5e2a
add a llama2 branch.
2023-07-21 15:20:25 +08:00
fengliangjun
db9c25bdd9
llama modify
2023-07-19 10:20:40 +08:00
liulinfeng
f6d7982b02
address review comments
2023-07-14 15:34:07 +08:00
liulinfeng
243bfe5cfa
adapt Bloom to the SP code
2023-07-14 14:18:07 +08:00
chenzomi
4455b80650
change the readme format.
2023-07-14 10:54:42 +08:00
liulinfeng
36f787bc89
Author: Liu Linfeng
...
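Change notes:
1. Submit the code implementing weight loading and inference text generation
2. Fix codecheck issues
3. Fix the hang when resuming training from a checkpoint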
2023-07-07 15:03:48 +08:00
kingsleyandher
3afb525a97
submit the SP algorithm
2023-07-10 14:44:42 +08:00
kingsleyandher
21609f3083
submit adaptation code for llama zeroshot 33B/65B; submit README.md
2023-07-05 14:25:29 +08:00
wiyr
d87e921410
added trick
2023-06-30 11:00:38 +08:00
kingsleyandher
2c104a087e
adapt llama-zeroshot task accuracy to match the results in the original paper.
2023-06-25 09:34:42 +08:00
wiyr
6304cab765
remove useless code
2023-06-20 16:54:12 +08:00
machangjun
2d8c6fee9d
add bloom st and adapt new data load method
...
modify bloom st run
modify times
add new pretrain_bloom.py
add st
2023-06-17 17:36:17 +08:00
kingsleyandher
4e3b7cd992
adapt LlamaTokenizer and update the pretraining script
2023-06-13 12:33:33 +08:00
wiyr
2f826f7351
can run with bloom7b and pass ci
2023-06-12 14:42:29 +08:00
fengliangjun
37ba281c40
readme update
2023-06-10 11:26:55 +08:00
chenzomi
37cc0b949d
change megatron to ascendspeed
2023-06-10 21:26:01 +08:00
fengliangjun
106a415556
initial AscendSpeed
2023-06-09 16:15:23 +08:00
wangyixian
d55d341fe1
Adapt the bloom 7.1b model to the AscendSpeed framework; jointly completed by liulinfeng and wangyixian
2023-06-06 22:30:19 +08:00
chenzomi
ce6af59f73
remove unused parameter and models.
2023-05-26 10:53:07 +08:00
chenzomi
e4a120a662
fork megatron-deepspeed code.
2023-05-25 14:49:59 +08:00