Tongyi Qianwen Qwen2.5-Coder: the full series is here! Powerful, diverse, practical!

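1. Introduction

The Qwen team has open-sourced the "powerful", "diverse", and "practical" Qwen2.5-Coder series, and remains committed to advancing Open Code LLMs.

● Powerful: Qwen2.5-Coder-32B-Instruct is the current SOTA open-source code model, with coding ability that matches GPT-4o. Alongside strong and comprehensive coding ability, it also has good general and mathematical skills.

● Diverse: on top of the two previously open-sourced sizes, 1.5B and 7B, this release adds four more: 0.5B, 3B, 14B, and 32B. Qwen2.5-Coder now covers six mainstream model sizes to meet the needs of different developers.

● Practical: Qwen2.5-Coder's practicality is explored in two scenarios, code assistants and Artifacts, with examples that show its potential in real-world applications.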

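2. Powerful: coding ability reaches SOTA among open-source models

● Code generation: Qwen2.5-Coder-32B-Instruct, the flagship model of this release, achieves the best performance among open-source models on several popular code generation benchmarks (EvalPlus, LiveCodeBench, BigCodeBench), and is competitive with GPT-4o.

● Code repair: code repair is an important programming skill. Qwen2.5-Coder-32B-Instruct can help users fix errors in their code and make programming more efficient. Aider is a popular code repair benchmark; Qwen2.5-Coder-32B-Instruct scores 73.7 on it, on par with GPT-4o.

● Code reasoning: code reasoning refers to whether a model can learn how code executes and accurately predict a program's inputs and outputs. Qwen2.5-Coder-7B-Instruct, released last month, already performed well on code reasoning, and the 32B model in this release goes a step further.

● Multiple programming languages: an intelligent programming assistant should be familiar with all programming languages. Qwen2.5-Coder-32B-Instruct performs well across more than 40 programming languages and scores 65.9 on McEval, with impressive results in languages such as Haskell and Racket. This is thanks to the distinctive data cleaning and data mixing used during pre-training.

In addition, the multi-language code repair ability of Qwen2.5-Coder-32B-Instruct is a pleasant surprise. It helps users understand and modify code in the languages they already know, and greatly reduces the cost of learning unfamiliar ones. Like McEval, MdEval is a multi-language benchmark, but for code repair. Qwen2.5-Coder-32B-Instruct scores 75.2 on MdEval, ranking first among all open-source models.

● Human preference alignment: to measure how well Qwen2.5-Coder-32B-Instruct aligns with human preferences, the team built Code Arena, an internally annotated code preference benchmark (similar to Arena Hard). GPT-4o serves as the judge model, and the metric is "A vs. B win": the percentage of test-set instances on which model A's score exceeds model B's. The results in the figure below show the advantage of Qwen2.5-Coder-32B-Instruct in preference alignment.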
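The "A vs. B win" metric described above can be sketched in a few lines. This is only an illustration: the score lists are made-up numbers, the function name is hypothetical, and counting ties as non-wins is an assumption based on the wording "exceeds", not a detail confirmed by the Code Arena setup.

```python
from typing import Sequence


def win_rate(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Percentage of test instances where model A's score is strictly higher than model B's."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same instances")
    if not scores_a:
        raise ValueError("at least one test instance is required")
    # A tie is not counted as a win for A (assumption).
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    return 100.0 * wins / len(scores_a)


# Hypothetical judge scores for four test instances (made-up numbers).
model_a = [8.0, 6.5, 9.0, 7.0]
model_b = [7.0, 7.5, 9.0, 5.0]
print(win_rate(model_a, model_b))  # 50.0
```

Here model A wins on the first and last instances only, so its win rate against model B is 50%.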

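3. Diverse: a rich range of model sizes

This Qwen2.5-Coder release covers a rich range of model sizes, six in total: 0.5B, 1.5B, 3B, 7B, 14B, and 32B. This meets developers' needs under different resource budgets and also gives the research community a good testbed. The table below gives detailed model information:

We have always believed in the Scaling Law philosophy. We evaluated Qwen2.5-Coder models of different sizes on all datasets to verify that scaling is effective for Code LLMs.

For every size, both a Base and an Instruct model are open-sourced. The Instruct model is an officially aligned model that can be used for chat directly, and the Base model is a foundation for developers to fine-tune their own models.

Performance of the Base models at different sizes:

Performance of the Instruct models at different sizes:

To make the comparison more intuitive, the charts compare Qwen2.5-Coder at different sizes with other open-source models on core datasets.

● For Base models, MBPP-3shot is the evaluation metric. Extensive experiments show that MBPP-3shot is better suited to evaluating base models and correlates well with their real-world performance.

● For Instruct models, the evaluation uses LiveCodeBench problems from the most recent 4 months (2024.07 – 2024.11). These newly published problems could not have leaked into the training set, so they reflect the model's OOD ability.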
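Restricting an evaluation to recently published problems amounts to a date filter. The sketch below is a minimal illustration under stated assumptions: the problem records, the field names id and released, and the exact day boundaries of the window are all hypothetical and do not reflect LiveCodeBench's actual data format.

```python
from datetime import date


def in_window(released: date, start: date, end: date) -> bool:
    """True if a problem's release date falls inside the inclusive window."""
    return start <= released <= end


# Hypothetical problem records; real benchmark metadata will look different.
problems = [
    {"id": "p1", "released": date(2024, 5, 20)},
    {"id": "p2", "released": date(2024, 8, 3)},
    {"id": "p3", "released": date(2024, 10, 15)},
]

# Assumed boundaries for the 2024.07 – 2024.11 window.
window_start, window_end = date(2024, 7, 1), date(2024, 11, 30)

recent = [p["id"] for p in problems if in_window(p["released"], window_start, window_end)]
print(recent)  # ['p2', 'p3']
```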

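As expected, model size and model performance are positively correlated, and Qwen2.5-Coder achieves SOTA performance at every size. This encourages us to keep exploring larger Coder models.

Model license

The 0.5B, 1.5B, 7B, 14B, and 32B Qwen2.5-Coder models are released under the Apache 2.0 license; the 3B model uses a "Research Only" license.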
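When scripting over the whole family, the size and license information above can be kept in one small lookup. This is a sketch only: the helper names are hypothetical, and the repository naming pattern is extrapolated from the Instruct model IDs that appear elsewhere in this article, so it should be checked against the model collection page before use.

```python
SIZES = ["0.5B", "1.5B", "3B", "7B", "14B", "32B"]
RESEARCH_ONLY_SIZES = {"3B"}  # per the license note above


def license_for(size: str) -> str:
    """Return the license name stated in this article for a given model size."""
    if size not in SIZES:
        raise ValueError(f"unknown Qwen2.5-Coder size: {size}")
    return "Research Only" if size in RESEARCH_ONLY_SIZES else "Apache 2.0"


def instruct_repo_id(size: str) -> str:
    """Assumed naming pattern, extrapolated from IDs such as Qwen/Qwen2.5-Coder-7B-Instruct."""
    if size not in SIZES:
        raise ValueError(f"unknown Qwen2.5-Coder size: {size}")
    return f"Qwen/Qwen2.5-Coder-{size}-Instruct"


for size in SIZES:
    print(instruct_repo_id(size), "->", license_for(size))
```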

3. Model links and demos

Qwen2.5-Coder model links:

https://modelscope.cn/collections/Qwen25-Coder-9d375446e8f5814a

Model collection demo link:

https://modelscope.cn/studios/Qwen/Qwen2.5-Coder-demo

Mini-program demo:

Artifacts demo link:

https://modelscope.cn/studios/Qwen/Qwen2.5-Coder-Artifacts

4. Model inference

transformers: run the quantized Qwen2.5-Coder-32B-Instruct model on a single GPU.

from modelscope import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4"

# Load the GPTQ Int4 quantized model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "write a quick sort algorithm."
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
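The slicing step near the end of the script is easy to misread, so here it is in isolation. The generate call returns the prompt tokens followed by the new tokens, and the list comprehension cuts off the prompt part for each item in the batch. The token IDs below are made up and plain Python lists stand in for tensors; this is a sketch of the idea, not of the library internals.

```python
def strip_prompt(input_ids, output_ids):
    """Keep only the tokens that were generated after the prompt."""
    return output_ids[len(input_ids):]


# Made-up token IDs: each output starts with a copy of its prompt.
prompt_batch = [[101, 7, 8], [101, 9]]
generated_batch = [[101, 7, 8, 42, 43], [101, 9, 55]]

new_tokens = [
    strip_prompt(input_ids, output_ids)
    for input_ids, output_ids in zip(prompt_batch, generated_batch)
]
print(new_tokens)  # [[42, 43], [55]]
```

Without this step, the decoded response would repeat the system and user messages before the model's answer.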

GPU memory usage:

Ollama: run a ModelScope Qwen2.5-Coder GGUF model with Ollama in a single command

# Set up and start the Ollama service
ollama serve
# ollama run works with any GGUF model on ModelScope
ollama run modelscope.cn/Qwen/Qwen2.5-32B-Instruct-GGUF

In an environment with Ollama installed (version >= 0.3.12 is recommended), the command above is all it takes to run the model locally.
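Once the model is running, it can also be called from a script through Ollama's local REST API. The sketch below uses only the standard library and assumes the default address http://localhost:11434 and that the model name from the command above is accepted unchanged by the API; the helper names are hypothetical. The actual request is left commented out because it needs a running Ollama service.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address (assumed unchanged)
MODEL = "modelscope.cn/Qwen/Qwen2.5-32B-Instruct-GGUF"  # same name as in the ollama run command


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (requires a running Ollama service)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


# Example call, to be run only when the Ollama service is up:
# print(generate("write a quick sort algorithm."))
```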

vLLM: accelerated inference

pip install vllm -U
export VLLM_USE_MODELSCOPE=True
vllm serve Qwen/Qwen2.5-Coder-7B-Instruct

# Inference code:
from openai import OpenAI

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
# The model name must match the model passed to `vllm serve`.
completion = client.completions.create(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    prompt="San Francisco is a",
)
print("Completion result:", completion)
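Because the served model is an Instruct model, the chat endpoint of the same OpenAI-compatible server is often a better fit than raw text completion. The sketch below assumes the vLLM server from the commands above is listening on localhost:8000 and that the openai package is installed; the function names are hypothetical, and the system prompt is reused from the transformers example.

```python
DEFAULT_SYSTEM_PROMPT = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def build_messages(user_prompt: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> list:
    """Build an OpenAI-style message list with a system and a user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def chat(user_prompt: str,
         model: str = "Qwen/Qwen2.5-Coder-7B-Instruct",
         base_url: str = "http://localhost:8000/v1") -> str:
    """Send one chat request to the local vLLM server and return the reply text."""
    # Imported here so that build_messages can be used without the openai package.
    from openai import OpenAI

    client = OpenAI(api_key="EMPTY", base_url=base_url)
    completion = client.chat.completions.create(
        model=model,  # must match the model passed to vllm serve
        messages=build_messages(user_prompt),
        max_tokens=512,
    )
    return completion.choices[0].message.content


# Example call, to be run only when the vLLM server is up:
# print(chat("write a quick sort algorithm."))
```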

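5. Model fine-tuning

This section shows how to use ms-swift to run self-cognition fine-tuning on qwen2.5-coder, and how to run inference with the fine-tuned model.

swift open-source repository:

https://github.com/modelscope/ms-swift

Self-cognition dataset link:

https://modelscope.cn/datasets/swift/self-cognition

To fine-tune on other datasets, just change the --dataset argument.

A custom dataset can be passed as a local path, or as a dataset_id from modelscope or huggingface.

See the documentation: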

https://swift.readthedocs.io/zh-cn/latest/Instruction/%E8%87%AA%E5%AE%9A%E4%B9%89%E4%B8%8E%E6%8B%93%E5%B1%95.html#id3
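A local custom dataset is typically a JSONL file with one training sample per line. The sketch below writes such a file using a messages-style record. That record layout is an assumption, as are the helper names and the file name my_dataset.jsonl; the documentation linked above is the authority on which schemas your ms-swift version accepts.

```python
import json
import tempfile
from pathlib import Path


def to_record(question: str, answer: str) -> dict:
    """One single-turn sample in a messages-style layout (assumed schema)."""
    return {
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }


def write_jsonl(records: list, path: Path) -> Path:
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


# Made-up samples, written to a temporary folder for demonstration.
records = [
    to_record("Who are you?", "I am Xiao Huang, an assistant built by ModelScope."),
    to_record("Reverse a list in Python.", "Use reversed(items) or items[::-1]."),
]
with tempfile.TemporaryDirectory() as tmp:
    path = write_jsonl(records, Path(tmp) / "my_dataset.jsonl")
    line_count = len(path.read_text(encoding="utf-8").splitlines())
print(line_count)  # 2
```

The resulting file path would then be the value passed to --dataset.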

Before you start fine-tuning, make sure your environment is set up correctly:

# Install ms-swift
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e '.[llm]'

# Fine-tuning script:
# Experimental environment: A10, 3090, V100, ...
# 15GB GPU memory
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type qwen2_5-coder-3b-instruct \
    --model_id_or_path qwen/Qwen2.5-Coder-3B-Instruct \
    --dataset swift/self-cognition#500 \
              AI-ModelScope/Magpie-Qwen2-Pro-200K-Chinese#500 \
              AI-ModelScope/Magpie-Qwen2-Pro-200K-English#500 \
    --logging_steps 5 \
    --max_length 4096 \
    --learning_rate 1e-4 \
    --output_dir output \
    --lora_target_modules ALL \
    --model_name 小黃 'Xiao Huang' \
    --model_author 魔搭 ModelScope \
    --system 'You are a helpful assistant.'

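GPU memory usage during fine-tuning:

The inference script for the fine-tuned model is shown below. Change ckpt_dir to the last checkpoint folder produced by training.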

# Experimental environment: A10, 3090, V100, ...
# Direct inference
CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/qwen2_5-coder-3b-instruct/vx-xxx/checkpoint-xxx

# Use vLLM to accelerate inference
CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/qwen2_5-coder-3b-instruct/vx-xxx/checkpoint-xxx \
    --infer_backend vllm --max_model_len 8192 --merge_lora true
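Finding the last checkpoint folder by hand is error-prone, because a plain alphabetical sort puts checkpoint-90 after checkpoint-100. The helper below is a sketch that picks the folder with the highest step number. It assumes the folders are named checkpoint-<step>, as the checkpoint-xxx placeholder above suggests; the function name is hypothetical and the demonstration uses empty temporary folders rather than a real training run.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def last_checkpoint(run_dir) -> Optional[Path]:
    """Return the checkpoint-<step> subfolder with the highest step, or None if there is none."""
    best, best_step = None, -1
    for child in Path(run_dir).iterdir():
        match = re.fullmatch(r"checkpoint-(\d+)", child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best, best_step = child, int(match.group(1))
    return best


# Demonstration with empty folders standing in for real checkpoints.
with tempfile.TemporaryDirectory() as tmp:
    for step in (9, 50, 100):
        (Path(tmp) / f"checkpoint-{step}").mkdir()
    latest_name = last_checkpoint(tmp).name
print(latest_name)  # checkpoint-100
```

The returned path is what would be passed to --ckpt_dir.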

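Inference results:

6. Model applications: Cursor, Artifacts, and Interpreter

A practical Coder has always been our vision. To that end, this release explores how Qwen2.5-Coder performs in practice in code assistant, Artifacts, and interpreter scenarios.

Qwen2.5-Coder meets Cursor: an all-purpose code assistant

Intelligent code assistants are already widely used, but most of them currently rely on closed-source models. We hope Qwen2.5-Coder gives developers a friendly and powerful alternative.

Configure the OpenAI-compatible API (URL and API Key) for Qwen2.5-Coder-32B-Instruct

Then try out Qwen2.5-Coder's powerful generation, editing, and completion abilities! (Command+K)

Qwen2.5-Coder meets Artifacts: programming by prompt is no longer a dream

Artifacts are one of the important applications of code generation, helping users create works that lend themselves to visualization. Clone the ModelScope Studio repository to set up Artifacts locally.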

git clone https://www.modelscope.cn/studios/Qwen/Qwen2.5-Coder-Artifacts.git
cd Qwen2.5-Coder-Artifacts
pip install -r requirements.txt
pip install gradio
python app.py

Example video:

Qwen2.5-Coder meets Interpreter: AI operates the computer

Install the environment on macOS:

pip install open-interpreter

Enter the Python environment:

from interpreter import interpreter

interpreter.llm.api_base = "YOUR_BASE_URL"
interpreter.llm.api_key = "YOUR_API_KEY"
interpreter.llm.model = "openai/Qwen-Coder-32B-Instruct"
interpreter.chat("Can you set my system to light mode?")

Example video: