knox / Colossal-AI-readme-10.bibtex
@inproceedings{10.1145/3605573.3605613,
  author = {Li, Shenggui and Liu, Hongxin and Bian, Zhengda and Fang, Jiarui and Huang, Haichen and Liu, Yuliang and Wang, Boxiang and You, Yang},
  title = {Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training},
  year = {2023},
  isbn = {9798400708435},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3605573.3605613},
  doi = {10.1145/3605573.3605613},
  abstract = {The success of Transformer models has pushed the deep learning model scale to billions of parameters, but the memory limitation of a single GPU has led to an urgent need for training on multi-GPU clusters. However, the best practice for choosing the optimal parallel strategy is still lacking, as it requires domain expertise in both deep learning and parallel computing. The Colossal-AI system addressed the above challenge by introducing a unified interface to scale your sequential code of model training to distributed environments. It supports parallel training methods such as data, pipeline, tensor, and sequence parallelism and is integrated with heterogeneous training and zero redundancy optimizer. Compared to the baseline system, Colossal-AI can achieve up to 2.76 times training speedup on large-scale models.},
  booktitle = {Proceedings of the 52nd International Conference on Parallel Processing},
}
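The unified interface described in the abstract is easiest to see in code. The sketch below shows the general shape of wrapping ordinary PyTorch training code with Colossal-AI's `Booster` API; the model, data, and plugin choice are placeholders, and exact signatures vary across Colossal-AI versions, so treat it as an illustration rather than the paper's reference implementation.

```python
# Sketch only: model/data are stand-ins and signatures may differ by version.
import torch
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import TorchDDPPlugin  # or a Gemini/hybrid-parallel plugin

colossalai.launch_from_torch()        # initialize the distributed environment

model = nn.Linear(1024, 1024).cuda()  # stand-in for a real model
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.MSELoss()

# The plugin selects the parallel strategy; the training loop below is the
# same sequential code you would write for a single GPU.
booster = Booster(plugin=TorchDDPPlugin())
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion=criterion)

inputs = torch.randn(8, 1024, device="cuda")
targets = torch.zeros(8, 1024, device="cuda")
loss = criterion(model(inputs), targets)
booster.backward(loss, optimizer)     # replaces loss.backward()
optimizer.step()
```

A script like this would typically be launched across GPUs with `colossalai run` or `torchrun`; swapping the plugin changes the parallel strategy without rewriting the loop.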
knox / Colossal-AI-readme-9.sh
# start a container with all GPUs visible; --ipc=host shares host shared
# memory, which PyTorch's multi-process data loading relies on
docker run -ti --gpus all --rm --ipc=host colossalai bash
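Once inside the container, a quick check (not from the README) confirms that the GPUs passed through by `--gpus all` are visible to PyTorch:

```python
# Illustrative check inside the container: list the CUDA devices PyTorch sees.
import torch

print(torch.cuda.is_available())   # expect True with --gpus all
print(torch.cuda.device_count())   # number of visible GPUs
```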
knox / Colossal-AI-readme-8.sh
# build a Docker image tagged "colossalai" from the repository's Dockerfile
cd ColossalAI
docker build -t colossalai ./docker
knox / Colossal-AI-readme-7.sh
# clone the repository
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# download the cub library
wget https://github.com/NVIDIA/cub/archive/refs/tags/1.8.0.zip
unzip 1.8.0.zip
cp -r cub-1.8.0/cub/ colossalai/kernel/cuda_native/csrc/kernels/include/

# install
BUILD_EXT=1 pip install .
knox / Colossal-AI-readme-6.sh
# install from source; BUILD_EXT=1 builds the C++/CUDA kernel extensions at install time
BUILD_EXT=1 pip install .
knox / Colossal-AI-readme-5.sh
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

# install colossalai
pip install .
knox / Colossal-AI-readme-4.sh
# install the nightly build from PyPI
pip install colossalai-nightly
knox / Colossal-AI-readme-3.sh
# install from PyPI; BUILD_EXT=1 builds the C++/CUDA kernel extensions at install time
BUILD_EXT=1 pip install colossalai
knox / Colossal-AI-readme-2.sh
# install the latest release from PyPI
pip install colossalai
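Whichever install path is used, a minimal import check (illustrative, not part of the README) verifies the package:

```python
# Illustrative post-install check: import the package and print its version.
import colossalai

print(colossalai.__version__)
```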
knox / Colossal-AI-readme-1.md
| Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baichuan-7B | - | 1.2T | 42.32 (42.30) | 44.53 (44.02) | 38.72 | 36.74 | 42.80 |
| Baichuan-13B-Base | - | 1.4T | 50.51 (51.60) | 55.73 (55.30) | 47.20 | 51.41 | 53.60 |
| Baichuan2-7B-Base | - | 2.6T | 46.97 (54.16) | 57.67 (57.07) | 45.76 | 52.60 | 54.00 |
| Baichuan2-13B-Base | - | 2.6T | 54.84 (59.17) | 62.62 (61.97) | 52.08 | 58.25 | 58.10 |
| ChatGLM-6B | - | 1.0T | 39.67 (40.63) | 41.17 (-) | 40.10 | 36.53 | 38.90 |
| ChatGLM2-6B | - | 1.4T | 44.74 (45.46) | 49.40 (-) | 46.36 | 45.49 | 51.70 |
| InternLM-7B | - | 1.6T | 46.70 (51.00) | 52.00 (-) | 44.77 | 61.64 | 52.80 |
| Qwen-7B | - | 2.2T | 54.29 (56.70) | 56.03 (58.80) | 52.47 | 56.42 | 59.60 |