wucong
|
3fd657ad6e
|
!1262 Unify README format (llama)
Merge pull request !1262 from wucong/dev6
|
2024-05-07 02:20:00 +00:00 |
|
wucong
|
dc6db1f858
|
!1260 Unify README format (chatglm3 + intern)
Merge pull request !1260 from wucong/dev5
|
2024-05-07 02:19:45 +00:00 |
|
wucong
|
297fe8b01b
|
!1265 Unify README format (llama3 + mixtral)
Merge pull request !1265 from wucong/dev9
|
2024-05-07 02:16:49 +00:00 |
|
guoxinjie
|
2ae8749f4a
|
!1252 Unify README format (aquila)
Merge pull request !1252 from guoxinjie/readme
|
2024-04-30 07:50:27 +00:00 |
|
wucong
|
ae21a622b8
|
!1254 Unify README format (baichuan2 + bloom)
Merge pull request !1254 from wucong/dev2
|
2024-04-30 07:39:33 +00:00 |
|
Liuchang
|
9a3d5641f5
|
!1255 Improve chat functionality; add Llama3 chat script and instructions
Merge pull request !1255 from Liuchang/master
|
2024-04-30 02:58:39 +00:00 |
|
wucong
|
4e62972ecd
|
!1253 Unify README format (baichuan)
Merge pull request !1253 from wucong/dev1
|
2024-04-29 03:52:08 +00:00 |
|
wwzhuo
|
b2915bd2ab
|
!1238 Update the best-performance configuration for llama2 7b/13b
Merge pull request !1238 from wwzhuo/master
|
2024-04-26 08:43:14 +00:00 |
|
Liuchang
|
a9f905b63f
|
!1251 Update Llama3 README
Merge pull request !1251 from Liuchang/master
|
2024-04-26 07:27:08 +00:00 |
|
fengliangjun
|
791677c135
|
!1246 Update baichuan2-13B performance to 1668
Merge pull request !1246 from fengliangjun/master
|
2024-04-26 01:47:52 +00:00 |
|
Liuchang
|
4109f95dfd
|
!1242 Add Llama3-8B and 70B models
Merge pull request !1242 from Liuchang/master
|
2024-04-25 01:24:31 +00:00 |
|
guhangsong
|
39d6fd7336
|
!1218 Migrate megatron patch
Merge pull request !1218 from guhangsong/patch
|
2024-04-23 01:57:03 +00:00 |
|
fengliangjun
|
464131283f
|
!1239 Remove redundant shape transformations in the FA adaptation to improve performance
Merge pull request !1239 from fengliangjun/master
|
2024-04-18 01:42:50 +00:00 |
|
glhyy
|
75a81f58f9
|
!1233 Update known issues in README
Merge pull request !1233 from glhyy/master
|
2024-04-16 02:22:33 +00:00 |
|
LeiZhenzhen
|
5ad4ceddd4
|
!1231 Add partial_rope support for chatglm3
Merge pull request !1231 from LeiZhenzhen/master
|
2024-04-15 13:11:56 +00:00 |
|
LeiZhenzhen
|
ab22271e13
|
!1227 Add chatglm3 pretraining, inference, and evaluation baselines
Merge pull request !1227 from LeiZhenzhen/master
|
2024-04-11 03:23:33 +00:00 |
|
guoxinjie
|
2f32c76be2
|
!1224 Remove megatron from ModelLink and document the change in the README
Merge pull request !1224 from guoxinjie/remove_megatron
|
2024-04-09 07:44:00 +00:00 |
|
LeiZhenzhen
|
8524ea2735
|
!1225 Add chatglm3 weight conversion
Merge pull request !1225 from LeiZhenzhen/master
|
2024-04-09 06:05:25 +00:00 |
|
黄宇豪
|
e23e1e354b
|
!1215 fix: Align Mixtral-README with the pretraining template
Merge pull request !1215 from 黄宇豪/master
|
2024-04-03 02:08:55 +00:00 |
|
fengliangjun
|
0df09cd187
|
!1202 Add profiling support
* add profiling
|
2024-03-30 08:58:41 +00:00 |
|
黄宇豪
|
62c39ddb9b
|
!1201 fix: Correct the weight save path and dataset path; format the README
Merge pull request !1201 from 黄宇豪/master
|
2024-03-30 06:32:18 +00:00 |
|
shishaoyu
|
ce01706c93
|
!1199 [DTS2024032814829] Temporary workaround for loss becoming NaN when processes are repeatedly killed and restarted during stress testing
Merge pull request !1199 from shishaoyu/master
|
2024-03-29 06:09:28 +00:00 |
|
黄宇豪
|
e8ae798db4
|
!1186 Unify weight paths and README style
Merge pull request !1186 from 黄宇豪/master
|
2024-03-28 03:43:17 +00:00 |
|
liuyanghan
|
aa6d2662cc
|
!1177 Document the data loading issue in multi-node training
Merge pull request !1177 from liuyanghan/master
|
2024-03-28 01:07:00 +00:00 |
|
wwzhuo
|
24c423201b
|
!1152 Revise llama2 README; correct the tokenizer description
* Revise the note on tokenizer changes for fine-tuning in the llama2 README
|
2024-03-27 08:25:03 +00:00 |
|
fengliangjun
|
f7af425efb
|
!1169 Reorganize the tasks directory; expose evaluation and inference.py
* provide inference and evaluation
|
2024-03-27 07:55:22 +00:00 |
|
黄宇豪
|
1cd3206f58
|
!1147 Fix: add the bf16-dtype field to avoid affecting training precision
Merge pull request !1147 from 黄宇豪/master
|
2024-03-26 01:09:59 +00:00 |
|
huangyiming
|
a2e9699361
|
!1146 Remove public network information from the bloom README
Merge pull request !1146 from huangyiming/master
|
2024-03-25 06:51:34 +00:00 |
|
guhangsong
|
ddefd6151c
|
!1143 Revise the llama2 README
Merge pull request !1143 from guhangsong/readme
|
2024-03-25 03:18:27 +00:00 |
|
yuhui
|
17fcedcf86
|
!1098 Revise Qwen model README
* Revise qwen model README
|
2024-03-22 01:05:02 +00:00 |
|
guoxinjie
|
e9d19b2f87
|
!1105 Fix garbled inference output; correct llama2 README
* fix infer bug for baichuan and modify llama2 readme
|
2024-03-21 09:53:10 +00:00 |
|
shengjy
|
d5e1353c0a
|
!1095 Add multi-node training parameter notes for llama2 7B/13B
* add llama2 multi-machine training param
|
2024-03-20 09:15:56 +00:00 |
|
wwzhuo
|
a46f5ed5ad
|
!1088 Change llama 13b precision mode to match performance metrics
* Change precision mode
|
2024-03-20 08:36:03 +00:00 |
|
guoxinjie
|
11fbfdce01
|
!1082 Add environment variables to the llama2-70B script
* fix llama2-70B script bug
|
2024-03-19 13:13:40 +00:00 |
|
xiongliangcheng
|
a03487b01a
|
!1051 Remove baichuan13B fine-tuning script
* Remove baichuan13B fine-tuning script
|
2024-03-19 12:15:25 +00:00 |
|
LeiZhenzhen
|
bf6456e04c
|
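!1074 Remove apex dependency from requirements.txt; standardize model training scripts and add log archiving
* Remove apex dependency from requirements.txt; standardize model training scripts and add log archiving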
|
2024-03-19 10:55:11 +00:00 |
|
fengliangjun
|
0f8a1851fe
|
!1062 Fix bug where the distributed optimizer became unusable after the moe code merge
* fix bug for alibi and distributed opti
|
2024-03-19 02:32:12 +00:00 |
|
fengliangjun
|
8998057f4d
|
!1049 Adapt FA for baichuan2-13B
* add FA for baichuan2-13B
|
2024-03-18 08:06:32 +00:00 |
|
zhangbin
|
4b459852b9
|
!1053 Revise intern_7B README
Merge pull request !1053 from zhangbin/master
|
2024-03-18 06:59:10 +00:00 |
|
fengliangjun
|
c4714245ed
|
Revert "add fa for baichuan2"
This reverts commit 9ff89b8765.
|
2024-03-17 08:08:41 +00:00 |
|
fengliangjun
|
9ff89b8765
|
add fa for baichuan2
|
2024-03-17 16:07:06 +08:00 |
|
liuyanghan
|
560554a5e0
|
!1046 Fix worker nodes failing to generate data during multi-node training
* Fix worker nodes failing to generate data during multi-node training
|
2024-03-16 09:47:38 +00:00 |
|
wwzhuo
|
8215e0e689
|
!1036 Correct file name capitalization in the llama/llama2 README
Merge pull request !1036 from wwzhuo/master
|
2024-03-15 08:42:22 +00:00 |
|
liuyanghan
|
12bce62426
|
!1029 Fix megatron-to-huggingface conversion bug
Merge pull request !1029 from liuyanghan/master
|
2024-03-15 08:29:17 +00:00 |
|
yuhui
|
72821e5e90
|
!1025 Update qwen model README
* Update bloom parameters
* update qwen readme
|
2024-03-15 06:14:54 +00:00 |
|
yaojia2021
|
c566ce4fa9
|
!1016 modify Aquila7B README for checkpoint saving and loading
Merge pull request !1016 from yaojia2021/master
|
2024-03-15 01:33:37 +00:00 |
|
fengliangjun
|
9b4c33f7e7
|
!1010 Rework the repository to use patch form
Merge pull request !1010 from fengliangjun/master
|
2024-03-14 03:12:09 +00:00 |
|
xiongliangcheng
|
156aacc346
|
!968 Fix broken addresses or paths in the README
Merge pull request !968 from xiongliangcheng/master
|
2024-03-12 06:13:55 +00:00 |
|
liuyanghan
|
466b22fd0c
|
!954 Strengthen README security notes && replace broken README links
Merge pull request !954 from liuyanghan/master
|
2024-03-11 06:26:38 +00:00 |
|
guoyiwei111
|
cfea0402d5
|
!924 Add per-model README for Megatron-to-HF conversion
Merge pull request !924 from guoyiwei111/gyw_modellink
|
2024-03-08 09:07:51 +00:00 |
|