knox / Colossal-AI-readme-10.bibtex
@inproceedings{10.1145/3605573.3605613,
  author = {Li, Shenggui and Liu, Hongxin and Bian, Zhengda and Fang, Jiarui and Huang, Haichen and Liu, Yuliang and Wang, Boxiang and You, Yang},
  title = {Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training},
  year = {2023},
  isbn = {9798400708435},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3605573.3605613},
  doi = {10.1145/3605573.3605613},
  abstract = {The success of Transformer models has pushed the deep learning model scale to billions of parameters, but the memory limitation of a single GPU has led to an urgent need for training on multi-GPU clusters. However, the best practice for choosing the optimal parallel strategy is still lacking, as it requires domain expertise in both deep learning and parallel computing. The Colossal-AI system addressed the above challenge by introducing a unified interface to scale your sequential code of model training to distributed environments. It supports parallel training methods such as data, pipeline, tensor, and sequence parallelism and is integrated with heterogeneous training and zero redundancy optimizer. Compared to the baseline system, Colossal-AI can achieve up to 2.76 times training speedup on large-scale models.},
  booktitle = {Proceedings of the 52nd International Conference on Parallel Processing},
}
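The unified interface described in the abstract is easiest to see in code. The sketch below shows the general shape of wrapping ordinary PyTorch training code with Colossal-AI's `Booster` API; the model, data, and plugin choice are placeholders, and exact signatures vary across Colossal-AI versions, so treat it as an illustration rather than the paper's reference implementation.

```python
# Sketch only: model/data are stand-ins and signatures may differ by version.
import torch
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import TorchDDPPlugin  # or a Gemini/hybrid-parallel plugin

colossalai.launch_from_torch()        # initialize the distributed environment

model = nn.Linear(1024, 1024).cuda()  # stand-in for a real model
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.MSELoss()

# The plugin selects the parallel strategy; the training loop below is the
# same sequential code you would write for a single GPU.
booster = Booster(plugin=TorchDDPPlugin())
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion=criterion)

inputs = torch.randn(8, 1024, device="cuda")
targets = torch.zeros(8, 1024, device="cuda")
loss = criterion(model(inputs), targets)
booster.backward(loss, optimizer)     # replaces loss.backward()
optimizer.step()
```

A script like this would typically be launched across GPUs with `colossalai run` or `torchrun`; swapping the plugin changes the parallel strategy without rewriting the loop.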
knox / Colossal-AI-readme-9.sh
# start a container with all GPUs visible; --ipc=host shares host shared
# memory, which PyTorch's multi-process data loading relies on
docker run -ti --gpus all --rm --ipc=host colossalai bash
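Once inside the container, a quick check (not from the README) confirms that the GPUs passed through by `--gpus all` are visible to PyTorch:

```python
# Illustrative check inside the container: list the CUDA devices PyTorch sees.
import torch

print(torch.cuda.is_available())   # expect True with --gpus all
print(torch.cuda.device_count())   # number of visible GPUs
```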
knox / Colossal-AI-readme-8.sh
# build a Docker image tagged "colossalai" from the repository's Dockerfile
cd ColossalAI
docker build -t colossalai ./docker
knox / Colossal-AI-readme-7.sh
# clone the repository
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# download the cub library
wget https://github.com/NVIDIA/cub/archive/refs/tags/1.8.0.zip
unzip 1.8.0.zip
cp -r cub-1.8.0/cub/ colossalai/kernel/cuda_native/csrc/kernels/include/

# install
BUILD_EXT=1 pip install .
knox / Colossal-AI-readme-6.sh
# install from source; BUILD_EXT=1 builds the C++/CUDA kernel extensions at install time
BUILD_EXT=1 pip install .
knox / Colossal-AI-readme-5.sh
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# install colossalai
pip install .
knox / Colossal-AI-readme-4.sh
# install the nightly build from PyPI
pip install colossalai-nightly
knox / Colossal-AI-readme-3.sh
# install from PyPI; BUILD_EXT=1 builds the C++/CUDA kernel extensions at install time
BUILD_EXT=1 pip install colossalai
knox / Colossal-AI-readme-2.sh
# install the latest release from PyPI
pip install colossalai
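Whichever install path is used, a minimal import check (illustrative, not part of the README) verifies the package:

```python
# Illustrative post-install check: import the package and print its version.
import colossalai

print(colossalai.__version__)
```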
knox / Colossal-AI-readme-1.md
| Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baichuan-7B | - | 1.2T | 42.32 (42.30) | 44.53 (44.02) | 38.72 | 36.74 | 42.80 |
| Baichuan-13B-Base | - | 1.4T | 50.51 (51.60) | 55.73 (55.30) | 47.20 | 51.41 | 53.60 |
| Baichuan2-7B-Base | - | 2.6T | 46.97 (54.16) | 57.67 (57.07) | 45.76 | 52.60 | 54.00 |
| Baichuan2-13B-Base | - | 2.6T | 54.84 (59.17) | 62.62 (61.97) | 52.08 | 58.25 | 58.10 |
| ChatGLM-6B | - | 1.0T | 39.67 (40.63) | 41.17 (-) | 40.10 | 36.53 | 38.90 |
| ChatGLM2-6B | - | 1.4T | 44.74 (45.46) | 49.40 (-) | 46.36 | 45.49 | 51.70 |
| InternLM-7B | - | 1.6T | 46.70 (51.00) | 52.00 (-) | 44.77 | 61.64 | 52.80 |
| Qwen-7B | - | 2.2T | 54.29 (56.70) | 56.03 (58.80) | 52.47 | 56.42 | 59.60 |