Dernière activité 1728785872

LLaMA-Factory-1.md Brut
Model Model size Template
Baichuan 2 7B/13B baichuan2
BLOOM/BLOOMZ 560M/1.1B/1.7B/3B/7.1B/176B -
ChatGLM3 6B chatglm3
Command R 35B/104B cohere
DeepSeek (Code/MoE) 7B/16B/67B/236B deepseek
Falcon 7B/11B/40B/180B falcon
Gemma/Gemma 2/CodeGemma 2B/7B/9B/27B gemma
GLM-4 9B glm4
InternLM2/InternLM2.5 7B/20B intern2
Llama 7B/13B/33B/65B -
Llama 2 7B/13B/70B llama2
Llama 3-3.2 1B/3B/8B/70B llama3
LLaVA-1.5 7B/13B llava
LLaVA-NeXT 7B/8B/13B/34B/72B/110B llava_next
LLaVA-NeXT-Video 7B/34B llava_next_video
MiniCPM 1B/2B/4B cpm/cpm3
Mistral/Mixtral 7B/8x7B/8x22B mistral
OLMo 1B/7B -
PaliGemma 3B paligemma
Phi-1.5/Phi-2 1.3B/2.7B -
Phi-3 4B/7B/14B phi
Qwen (1-2.5) (Code/Math/MoE) 0.5B/1.5B/3B/7B/14B/32B/72B/110B qwen
Qwen2-VL 2B/7B/72B qwen2_vl
StarCoder 2 3B/7B/15B -
XVERSE 7B/13B/65B xverse
Yi/Yi-1.5 (Code) 1.5B/6B/9B/34B yi
Yi-VL 6B/34B yi_vl
Yuan 2 2B/51B/102B yuan