
微立頂科技

  Qwen's Latest Open-Source Full-Pipeline Models: ASR + LLM + Knowledge Base + TTS

Published: 2026/3/30 8:54:59

I. Latest Qwen Model Lineup (Full Pipeline, Fully Open-Source, Local GPU)

1. ASR Speech Recognition (latest: Qwen3-ASR)

  • Recommended: Qwen3-ASR-1.7B (high accuracy, multilingual, noise-robust)
  • Alternative: Qwen3-ASR-0.6B (lightweight, low latency, high concurrency)
  • Download: modelscope://qwen/Qwen3-ASR-1.7B

2. LLM Dialogue Core (latest: Qwen3.5 series)

  • Server recommendation: Qwen3.5-9B-Chat (strong performance, VRAM-friendly)
  • Lightweight option: Qwen3.5-4B-Chat (4B, suitable for 8 GB VRAM)
  • Download: modelscope://qwen/Qwen3.5-9B-Chat

3. Embedding Knowledge Base (latest: Qwen-Embedding-3)

  • Recommended: Qwen-Embedding-3-Large (newest, highest vector quality)
  • Download: modelscope://qwen/Qwen-Embedding-3-Large

4. TTS Speech Synthesis (latest: Qwen3-TTS)

  • Recommended: Qwen3-TTS-1.5B (high fidelity, streaming, voice cloning supported)
  • Download: modelscope://qwen/Qwen3-TTS-1.5B
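As a rough rule of thumb (my own illustrative estimate, not from the model cards): fp16 weights take about 2 bytes per parameter, so the 9B chat model needs roughly 18 GB of VRAM for the weights alone, and the 4B variant roughly 8 GB, which is why fitting it on an 8 GB card realistically requires quantization (e.g. via the bitsandbytes dependency installed below).

```python
def fp16_weight_gb(n_params: float) -> float:
    """VRAM for raw fp16 weights only (2 bytes per parameter), in GB.
    Activations and the KV cache add more on top of this."""
    return n_params * 2 / 1e9

for name, params in [("Qwen3.5-9B-Chat", 9e9), ("Qwen3.5-4B-Chat", 4e9)]:
    print(f"{name}: ~{fp16_weight_gb(params):.0f} GB for weights alone")
```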

II. Server Environment (GPU, One-Command Setup)


# Create the environment
conda create -n qwen3.5 python=3.10
conda activate qwen3.5

# PyTorch for CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Dependencies (recent versions)
pip install transformers==4.43.0 modelscope accelerate bitsandbytes \
soundfile librosa uvicorn fastapi pyaudio

# faiss-gpu is generally more reliable from conda than from pip
conda install -c pytorch faiss-gpu
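Before downloading tens of gigabytes of weights, it can save time to confirm the key packages actually resolve. A stdlib-only check (the list mirrors the install commands above; note that the faiss-gpu package installs its module simply as faiss):

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be resolved in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["torch", "transformers", "modelscope", "accelerate",
            "faiss", "soundfile", "librosa", "uvicorn", "fastapi"]
print("missing:", missing_packages(required))
```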


III. One-Step Model Download (Local GPU Server)

from modelscope import snapshot_download

# ASR
snapshot_download("qwen/Qwen3-ASR-1.7B", local_dir="./models/qwen3_asr")

# LLM
snapshot_download("qwen/Qwen3.5-9B-Chat", local_dir="./models/qwen3.5_9b_chat")

# Embedding
snapshot_download("qwen/Qwen-Embedding-3-Large", local_dir="./models/qwen_emb3")

# TTS
snapshot_download("qwen/Qwen3-TTS-1.5B", local_dir="./models/qwen3_tts")
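After the downloads finish, a quick sanity check that all four model directories exist and roughly how much disk each occupies (paths match the local_dir values above):

```python
from pathlib import Path

def dir_size_gb(path):
    """Total size of files under path in GB, or None if the directory is absent."""
    p = Path(path)
    if not p.is_dir():
        return None
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file()) / 1e9

for d in ["./models/qwen3_asr", "./models/qwen3.5_9b_chat",
          "./models/qwen_emb3", "./models/qwen3_tts"]:
    size = dir_size_gb(d)
    print(d, "-> missing" if size is None else f"-> {size:.1f} GB")
```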

IV. Full-Pipeline Code (Qwen3.5 + Qwen3 Speech, Local GPU)

import torch
import faiss
import numpy as np
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModel
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Path configuration
ASR_PATH = "./models/qwen3_asr"
LLM_PATH = "./models/qwen3.5_9b_chat"
EMB_PATH = "./models/qwen_emb3"
TTS_PATH = "./models/qwen3_tts"
DEVICE = "cuda"

# ---------------------- 1. ASR ----------------------
def qwen3_asr(audio_file):
    print("ASR: transcribing with Qwen3-ASR-1.7B ...")
    # Note: building the pipeline once at module load would avoid per-call overhead
    asr_pipe = pipeline(Tasks.automatic_speech_recognition, model=ASR_PATH, device=DEVICE)
    text = asr_pipe(audio_file)["text"]
    print(f"ASR result: {text}")
    return text

# ---------------------- 2. Knowledge base retrieval ----------------------
emb_tokenizer = AutoTokenizer.from_pretrained(EMB_PATH)
emb_model = AutoModel.from_pretrained(EMB_PATH).to(DEVICE).eval()
index = faiss.read_index("knowledge.faiss")

def get_embedding(text):
    inputs = emb_tokenizer(text, return_tensors="pt", truncation=True).to(DEVICE)
    with torch.no_grad():
        emb = emb_model(**inputs).last_hidden_state[:, 0, :]
    return emb.cpu().numpy()

def search_knowledge(query):
    print("Knowledge base: retrieving with Qwen-Embedding-3 ...")
    emb = get_embedding(query)
    faiss.normalize_L2(emb)
    _, idx = index.search(emb, 3)
    with open("knowledge_chunks.txt", encoding="utf-8") as f:
        chunks = f.readlines()
    context = "\n".join([chunks[i] for i in idx[0] if i < len(chunks)])
    return context

# ---------------------- 3. LLM ----------------------
llm_tokenizer = AutoTokenizer.from_pretrained(LLM_PATH)
llm_model = AutoModelForCausalLM.from_pretrained(
    LLM_PATH,
    device_map="auto",
    trust_remote_code=True
).eval()

def qwen3_5_llm(query, context):
    print("LLM: generating with Qwen3.5-9B ...")
    prompt = f"Reference material: {context}\nUser question: {query}\nPlease answer directly:"
    messages = [{"role": "user", "content": prompt}]
    inputs = llm_tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(DEVICE)

    with torch.no_grad():
        outputs = llm_model.generate(
            inputs,
            max_new_tokens=1024,
            temperature=0.6,
            top_p=0.7,
            do_sample=True
        )
    answer = llm_tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)
    print(f"LLM answer: {answer}")
    return answer

# ---------------------- 4. TTS ----------------------
def qwen3_tts(text, out="result.wav"):
    print("TTS: synthesizing with Qwen3-TTS-1.5B ...")
    tts_pipe = pipeline(Tasks.text_to_speech, model=TTS_PATH, device=DEVICE)
    wav = tts_pipe(text)["output_wav"]
    import soundfile as sf
    sf.write(out, wav, 16000)
    print(f"TTS done: {out}")
    return out

# ---------------------- Main pipeline ----------------------
def run_digital_human(audio_file):
    user_text = qwen3_asr(audio_file)
    context = search_knowledge(user_text)
    answer = qwen3_5_llm(user_text, context)
    voice = qwen3_tts(answer)
    return answer, voice

if __name__ == "__main__":
    run_digital_human("input.wav")
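The retrieval step above leans on FAISS, but the math it performs is just cosine-similarity ranking: IndexFlatIP over L2-normalized vectors returns the highest inner products, which equal cosine scores. A dependency-free sketch with made-up toy 2-D vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=3):
    """Indices of the k chunks most similar to the query --
    the same ranking index.search produces on normalized vectors."""
    order = sorted(range(len(chunk_vecs)),
                   key=lambda i: cosine(query_vec, chunk_vecs[i]),
                   reverse=True)
    return order[:k]

vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(top_k([1.0, 0.1], vecs, k=2))  # → [0, 2]
```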

V. Building the Knowledge Base (Qwen-Embedding-3)

from modelscope import AutoTokenizer, AutoModel
import faiss
import torch

model = AutoModel.from_pretrained("./models/qwen_emb3").cuda().eval()
tokenizer = AutoTokenizer.from_pretrained("./models/qwen_emb3")

def encode(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to("cuda")
    with torch.no_grad():
        emb = model(**inputs).last_hidden_state[:,0,:]
    return emb.cpu().numpy()

# Read the knowledge base (one chunk per line)
with open("knowledge.txt", encoding="utf-8") as f:
    chunks = [line.strip() for line in f if line.strip()]

# Build the vector index (inner product on L2-normalized vectors = cosine similarity)
embeddings = encode(chunks)
index = faiss.IndexFlatIP(embeddings.shape[1])
faiss.normalize_L2(embeddings)
index.add(embeddings)
faiss.write_index(index, "knowledge.faiss")

# Save the text chunks
with open("knowledge_chunks.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(chunks))

print("Knowledge base build complete")
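The build script assumes knowledge.txt already has one chunk per line. If the source material is long-form text instead, a fixed-size splitter with overlap keeps sentences that straddle a boundary intact in at least one chunk. The function and parameter values here are my own sketch, not part of the article's pipeline:

```python
def split_chunks(text, size=200, overlap=50):
    """Split text into fixed-size character chunks; consecutive chunks
    share `overlap` characters so boundary sentences survive in one piece."""
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
        if start + size >= len(text):
            break
    return chunks

print(split_chunks("abcdefghij", size=4, overlap=2))  # → ['abcd', 'cdef', 'efgh', 'ghij']
```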

VI. Serving with FastAPI (for the Digital Human to Call)

from fastapi import FastAPI, UploadFile
# run_digital_human is the function defined in section IV (same module, or import it)
app = FastAPI()

@app.post("/digital_human")
async def run(file: UploadFile):
    with open("temp.wav", "wb") as f:
        f.write(await file.read())
    ans, voice = run_digital_human("temp.wav")
    return {"answer": ans, "audio": voice}


Launch:

uvicorn main:app --host 0.0.0.0 --port 10888
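Once the server is up, the endpoint can be exercised with a multipart upload (assuming the port above and a local input.wav):

```shell
# POST a WAV file to the running service; the JSON response carries
# the text answer and the path of the synthesized audio
curl -X POST "http://127.0.0.1:10888/digital_human" \
  -F "file=@input.wav;type=audio/wav"
```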


