!1342 Remove overlap-param-gather from scripts; remove invalid URL links and invalid markdown

Merge pull request !1342 from DONGHAORAN/master
DONGHAORAN 2024-06-12 12:42:28 +00:00 committed by i-robot
parent d5a1d0dd13
commit 27ad511d3c
9 changed files with 1 addition and 112 deletions

View File

@@ -776,7 +776,6 @@ ModelLink supports tensor parallelism, pipeline parallelism, sequence parallelism, recomputation, dis
| Recomputation | --recompute-granularity |
| Distributed Optimizer | --use-distributed-optimizer |
| overlap DDP allreduce | --overlap-grad-reduce |
| overlap DDP allgather | --overlap-param-gather |
| Flash attention | --use-flash-attn |
| Fused rmsnorm | --use-fused-rmsnorm |
| Fused swiglu | --use-fused-swiglu |
@@ -799,7 +798,6 @@ torchrun $DISTRIBUTED_ARGS pretrain_gpt.py \
--use-fused-rmsnorm \
--use-fused-swiglu \
--overlap-grad-reduce \
--overlap-param-gather \
--use-fused-rotary-pos-emb \
--use-mc2 \
--sliding-window 4096 \

View File

@@ -773,7 +773,6 @@ ModelLink supports various acceleration algorithms such as tensor parallelism, p
| Recomputation | --recompute-granularity |
| Distributed Optimizer | --use-distributed-optimizer |
| overlap DDP allreduce | --overlap-grad-reduce |
| overlap DDP allgather | --overlap-param-gather |
| Flash attention | --use-flash-attn |
| Fused rmsnorm | --use-fused-rmsnorm |
| Fused swiglu | --use-fused-swiglu |
@@ -797,7 +796,6 @@ torchrun $DISTRIBUTED_ARGS pretrain_gpt.py \
--use-fused-rmsnorm \
--use-fused-swiglu \
--overlap-grad-reduce \
--overlap-param-gather \
--use-fused-rotary-pos-emb \
--use-mc2 \
--sliding-window 4096 \
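
The table and launch snippet above map each acceleration feature to its command-line switch; this commit drops the --overlap-param-gather entry and flag from both. As a rough illustration of what "overlap DDP allreduce" (--overlap-grad-reduce) means, here is a minimal PyTorch sketch. It is not ModelLink's or Megatron-LM's actual implementation (which buckets gradients and works together with the distributed optimizer); it assumes PyTorch >= 2.1 for register_post_accumulate_grad_hook, an already initialized process group, and one backward pass per optimizer step.

```python
import torch.distributed as dist


def attach_overlap_grad_allreduce(model, group=None):
    """Start an async all-reduce for each gradient as soon as it is ready,
    so communication overlaps with the rest of the backward pass."""
    pending = []

    def hook(param):
        # Fires right after param.grad has been accumulated for this backward.
        work = dist.all_reduce(param.grad, op=dist.ReduceOp.SUM,
                               group=group, async_op=True)
        pending.append((param, work))

    for p in model.parameters():
        if p.requires_grad:
            p.register_post_accumulate_grad_hook(hook)

    def finish():
        # Call before optimizer.step(): wait for the outstanding reduces and
        # turn the summed gradients into an average across ranks.
        world_size = dist.get_world_size(group)
        for param, work in pending:
            work.wait()
            param.grad.div_(world_size)
        pending.clear()

    return finish
```

A training loop would call finish = attach_overlap_grad_allreduce(model) once at startup, then loss.backward(); finish(); optimizer.step() each iteration. The removed --overlap-param-gather flag applies the same overlapping idea to the all-gather of updated parameters after a distributed-optimizer step.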

View File

@@ -69,7 +69,6 @@ GPT_ARGS="
--use-fused-swiglu \
--use-fused-rotary-pos-emb \
--overlap-grad-reduce \
--overlap-param-gather \
--bf16
"

View File

@@ -78,7 +78,6 @@ GPT_ARGS="
--no-load-rng \
--use-distributed-optimizer \
--overlap-grad-reduce \
--overlap-param-gather \
--load ${CKPT_LOAD_DIR} \
--save ${CKPT_SAVE_DIR} \
--bf16

View File

@@ -99,11 +99,6 @@ def pad_batch(batch, args):
def top_k_logits(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
"""
This function has been mostly taken from huggingface conversational ai code at
https://medium.com/huggingface/how-to-build-a-state-of-the-art-conversational-ai-with-transfer
-learning-2d818ac26313
"""
if top_k > 0:
# Remove all tokens with a probability less than the
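
The hunk above only removes the broken Medium URL from the top_k_logits docstring; the function itself implements standard top-k / top-p (nucleus) filtering of logits. For context, here is a minimal, self-contained sketch of that filtering technique in PyTorch, an illustration of the general algorithm rather than a verbatim copy of the file being edited (it assumes logits shaped [..., vocab_size]):

```python
import torch


def top_k_top_p_filter(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
    """Mask out logits outside the top-k set and/or outside the top-p nucleus."""
    logits = logits.clone()

    if top_k > 0:
        # Remove all tokens with a logit below the k-th largest logit.
        kth_value = torch.topk(logits, top_k)[0][..., -1, None]
        logits[logits < kth_value] = filter_value

    if top_p > 0.0:
        # Keep the smallest set of tokens whose cumulative probability
        # exceeds top_p (nucleus sampling).
        sorted_logits, sorted_indices = torch.sort(logits, descending=True)
        cumulative_probs = torch.cumsum(torch.softmax(sorted_logits, dim=-1), dim=-1)
        sorted_remove = cumulative_probs > top_p
        # Shift right so the first token crossing the threshold is kept.
        sorted_remove[..., 1:] = sorted_remove[..., :-1].clone()
        sorted_remove[..., 0] = False
        remove = sorted_remove.scatter(-1, sorted_indices, sorted_remove)
        logits[remove] = filter_value

    return logits
```

Typical use: filtered = top_k_top_p_filter(logits, top_k=50, top_p=0.9), then sample from torch.softmax(filtered, dim=-1).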

View File

@@ -1,94 +0,0 @@
| Type | Open-source code URL | File name | Public IP / public URL / domain / email address | Purpose |
|--------|----------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-----------|
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/text_generation/beam_utils.py | megatron/text_generation/beam_utils.py:9 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/text_generation/sampling.py | megatron/text_generation/sampling.py:5 | https://github.com/ari-holtzman/degen/blob/master/gen.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/text_generation/sampling.py | megatron/text_generation/sampling.py:6 | https://huggingface.co/transformers/_modules/transformers/generation_logits_process.html | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/text_generation/sampling.py | megatron/text_generation/sampling.py:33 | https://github.com/ari-holtzman/degen/blob/master/gen.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/theoretical_memory_usage.py | megatron/theoretical_memory_usage.py:73 | https://arxiv.org/pdf/2205.05198.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/fused_kernels/compat.h | megatron/fused_kernels/compat.h:4 | https://github.com/NVIDIA/apex | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/static/index.html | megatron/static/index.html:86 | https://cdnjs.cloudflare.com/ajax/libs/jquery/3.5.1/jquery.min.js | Front-end library URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/datasets/blended_megatron_dataset_config.py | megatron/core/datasets/blended_megatron_dataset_config.py:66 | https://docs.python.org/3/library/dataclasses.html#post-init-processing | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/models/retro/encoder_attention.py | megatron/core/models/retro/encoder_attention.py:22 | https://arxiv.org/abs/2112.04426 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/models/retro/decoder_attention.py | megatron/core/models/retro/decoder_attention.py:27 | https://arxiv.org/abs/2112.04426 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/models/common/embeddings/rotary_pos_embedding.py | megatron/core/models/common/embeddings/rotary_pos_embedding.py:147 | https://kexue.fm/archives/8265 | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/pipeline_parallel/schedules.py | megatron/core/pipeline_parallel/schedules.py:493 | https://arxiv.org/pdf/2205.05198.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/transformer/transformer_config.py | megatron/core/transformer/transformer_config.py:87 | https://arxiv.org/abs/2205.05198 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/transformer/transformer_config.py | megatron/core/transformer/transformer_config.py:107 | https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/api/common.html | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/transformer/mlp.py | megatron/core/transformer/mlp.py:49 | https://arxiv.org/pdf/2002.05202.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/transformer/transformer_config.py | megatron/core/transformer/transformer_config.py:196 | https://docs.python.org/3/library/dataclasses.html#post-init-processing | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/transformer/dot_product_attention.py | megatron/core/transformer/dot_product_attention.py:23 | https://arxiv.org/abs/2205.05198 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/pipeline_parallel/schedules.py | megatron/core/pipeline_parallel/schedules.py:1122 | https://arxiv.org/pdf/2205.05198.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/model_parallel_config.py | megatron/core/model_parallel_config.py:26 | https://arxiv.org/pdf/2104.04473.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/model_parallel_config.py | megatron/core/model_parallel_config.py:31 | https://arxiv.org/abs/2205.05198 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/random.py | megatron/core/tensor_parallel/random.py:4 | https://github.com/pytorch/pytorch | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/layers.py | megatron/core/tensor_parallel/layers.py:4 | https://github.com/pytorch/pytorch | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/model_parallel_config.py | megatron/core/model_parallel_config.py:200 | https://docs.python.org/3/library/dataclasses.html#post-init-processing | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/cross_entropy.py | megatron/core/tensor_parallel/cross_entropy.py:80 | https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/common/losses/smoothed_cross_entropy.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/layers.py | megatron/core/tensor_parallel/layers.py:370 | https://github.com/pytorch/pytorch/blob/c47cf9bc7f9e02f649ab4ed53fe4d35732c92ab6/torch/_refs/__init__.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/package_info.py | megatron/core/package_info.py:17 | nemo-toolkit@nvidia.com | Source author email address |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/package_info.py | megatron/core/package_info.py:19 | https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/ | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/package_info.py | megatron/core/package_info.py:21 | https://github.com/NVIDIA/Megatron-LM/megatron/core | Source repository URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/package_info.py | megatron/core/package_info.py:22 | https://github.com/NVIDIA/Megatron-LM/releases | Source repository URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/bert_tokenization.py | megatron/tokenizer/bert_tokenization.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/gpt2_tokenization.py | megatron/tokenizer/gpt2_tokenization.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/gpt2_tokenization.py | megatron/tokenizer/gpt2_tokenization.py:41 | https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-vocab.json | Pretrained file URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/gpt2_tokenization.py | megatron/tokenizer/gpt2_tokenization.py:44 | https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-merges.txt | Pretrained file URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/utils.py | megatron/utils.py:69 | https://github.com/NVIDIA/apex | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/bert_tokenization.py | megatron/tokenizer/bert_tokenization.py:299 | https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block) | Details URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/tokenizer.py | megatron/tokenizer/tokenizer.py:389 | https://github.com/NVIDIA/NeMo/blob/c8fa217e811d60d11d014827c7f3845ff6c99ae7/nemo/collections/common/tokenizers/sentencepiece_tokenizer.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/tokenizer/tokenizer.py | megatron/tokenizer/tokenizer.py:415 | https://github.com/NVIDIA/NeMo/blob/c8fa217e811d60d11d014827c7f3845ff6c99ae7/nemo/collections/common/tokenizers/sentencepiece_tokenizer.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/fused_layer_norm.py | megatron/model/fused_layer_norm.py:4 | https://github.com/NVIDIA/apex | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/fused_layer_norm.py | megatron/model/fused_layer_norm.py:83 | https://github.com/NVIDIA/apex | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/esvit_swin_backbone.py | megatron/model/vision/esvit_swin_backbone.py:6 | chunyl@microsoft.com | Source author email address |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/esvit_swin_backbone.py | megatron/model/vision/esvit_swin_backbone.py:509 | https://arxiv.org/pdf/2103.14030 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/dino.py | megatron/model/vision/dino.py:6 | https://github.com/facebookresearch/dino/blob/main/main_dino.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/knn_monitor.py | megatron/model/vision/knn_monitor.py:101 | https://arxiv.org/abs/1805.01978 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/knn_monitor.py | megatron/model/vision/knn_monitor.py:102 | http://github.com/zhirongw/lemniscate.pytorch | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/knn_monitor.py | megatron/model/vision/knn_monitor.py:103 | https://github.com/leftthomas/SimCLR | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/vision/swin_backbone.py | megatron/model/vision/swin_backbone.py:474 | https://arxiv.org/pdf/2103.14030 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/transformer.py | megatron/model/transformer.py:96 | https://arxiv.org/pdf/2002.05202.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/model/language_model.py | megatron/model/language_model.py:377 | https://github.com/kingoflolz/mesh-transformer-jax/ | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/optimizer_param_scheduler.py | megatron/optimizer_param_scheduler.py:81 | https://openreview.net/pdf?id=BJYwwY9ll | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/autoaugment.py | megatron/data/autoaugment.py:29 | https://github.com/DeepVoltaire/AutoAugment | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/autoaugment.py | megatron/data/autoaugment.py:36 | https://arxiv.org/abs/1805.09501 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/dataset_utils.py | megatron/data/dataset_utils.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/dataset_utils.py | megatron/data/dataset_utils.py:18 | https://github.com/google-research/albert/blob/master/create_pretraining_data.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/arguments.py | megatron/arguments.py:889 | https://arxiv.org/abs/2205.14135 | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/dataset_utils.py | megatron/data/dataset_utils.py:265 | https://arxiv.org/pdf/1907.10529.pdf | Reference paper URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/image_folder.py | megatron/data/image_folder.py:32 | https://github.com/pytorch/vision/blob/main/torchvision/datasets/folder.py | Reference code URL |
| Open-source code import | https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/data/image_folder.py | megatron/data/image_folder.py:238 | https://github.com/python-pillow/Pillow/issues/835 | Details URL |
| Open-source code import | Not applicable | tests/ut/module/test_fold_schedules.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tests/ut/module/test_auto_recomputing.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tests/ut/module/test_triangle_attn.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tests/st/test_bloom/run_bloom_ptd.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tests/st/test_bloom/run_llama_ptd.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tests/st/test_bloom/run_gpt_ptd.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | setup.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | setup.py:85 | https://packaging.python.org/en/latest/single_source_version.html | Details URL |
| Open-source code import | Not applicable | tools/retro/utils.py:6 | https://github.com/NVIDIA/Megatron-LM/blob/main/tools/retro/utils.py | Source code URL |
| Open-source code import | Not applicable | tools/preprocess_data.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tools/checkpoint/saver_megatron.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tools/checkpoint/util.py:8 | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tools/checkpoint/loader_llama2_hf.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/text_generation/beam_utils.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/text_generation/utils.py | https://medium.com/huggingface/how-to-build-a-state-of-the-art-conversational-ai-with-transfer | Source code URL |
| Open-source code import | Not applicable | modellink/__init__.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/adapter_lora/__init__.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/tokenizer/__init__.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/tokenizer/tokenizer.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/utils.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/model/module.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/error_utils.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/data/prompter.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | modellink/data/data_handler.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_api/chat.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_api/dataset_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/evaluation_llama.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/agi_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/bbh_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/boolq_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/ceval_exam.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/gsm8k_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/human_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/mmlu_eval.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |
| Open-source code import | Not applicable | tasks/evaluation/eval_impl/template.py | http://www.apache.org/licenses/LICENSE-2.0 | License URL |

View File

@@ -80,9 +80,7 @@ except Exception:
setuptools.setup(
package_data={'modellink':['modellink/data/Makefile']},
name=__package_name__,
# Versions should comply with PEP440. For a discussion on single-sourcing
# the version across setup.py and the project code, see
# https://packaging.python.org/en/latest/single_source_version.html
# Versions should comply with PEP440.
version=__version__,
description=__description__,
long_description=long_description,
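
The comment trimmed above referred to "single-sourcing" the version: defining the version string in exactly one module and reading it from setup.py so it never has to be duplicated. A minimal sketch of that pattern follows; the file name modellink/package_info.py and the dunder variable names are illustrative assumptions, not necessarily this repository's exact layout.

```python
# setup.py (sketch): load packaging metadata from a single module so the
# PEP 440 version string lives in exactly one place.
import setuptools

info = {}
with open('modellink/package_info.py', encoding='utf-8') as f:
    exec(f.read(), info)  # expects __package_name__, __version__, __description__

setuptools.setup(
    name=info['__package_name__'],
    version=info['__version__'],        # must comply with PEP 440
    description=info['__description__'],
)
```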

View File

@@ -74,7 +74,6 @@ GPT_ARGS="
--use-fused-rotary-pos-emb \
--use-distributed-optimizer \
--overlap-grad-reduce \
--overlap-param-gather \
--bf16
"

View File

@@ -3,9 +3,6 @@
# This source code is licensed under the Apache license found in the
# LICENSE file in the root directory of this source tree.
# copied from https://github.com/NVIDIA/Megatron-LM/blob/main/tools/retro/utils.py
# reworked/refactored some parts to make it run.
import os
import types
import torch