Real-time voice-interactive digital human: using Ollama's local Qwen3:4B as the LLM
Published: 2025/5/7 7:40:59
Reference code for a real-time voice-interactive digital human that uses Ollama's local Qwen3:4B model as the LLM, including how to set a persona (system) prompt.
import time
import os
import requests
import json
from basereal import BaseReal
from logger import logger
def llm_response(message, nerfreal: BaseReal):
    start = time.perf_counter()
    # Build the full prompt: a system-style instruction (the persona)
    # followed by the user message, in a plain completion format.
    full_prompt = f"""你是一个乐于助人的助手。请用中文回答用户的问题。
用户: {message}
助手:"""
    # Prepare the request data for the Ollama API
    request_data = {
        "model": "qwen3:4b",  # must match the model name pulled into Ollama
        "prompt": full_prompt,
        "stream": True,
        "options": {
            "temperature": 0.7,
            "top_p": 0.9
        }
    }
    end = time.perf_counter()
    logger.info(f"llm Time init: {end-start}s")

    # Stream the response from the local Ollama API
    response = requests.post(
        "http://localhost:11434/api/generate",
        json=request_data,
        stream=True
    )

    result = ""
    first = True
    for line in response.iter_lines():
        if line:
            # Decode the line and parse the JSON chunk
            decoded_line = line.decode("utf-8")
            try:
                chunk = json.loads(decoded_line)
                if "response" in chunk:
                    msg = chunk["response"]
                    if first:
                        end = time.perf_counter()
                        logger.info(f"llm Time to first chunk: {end-start}s")
                        first = False
                    # Flush complete clauses to the avatar's TTS queue as
                    # soon as a sentence-ending punctuation mark arrives.
                    lastpos = 0
                    for i, char in enumerate(msg):
                        if char in ",.!;:，。！？：；":
                            result = result + msg[lastpos:i+1]
                            lastpos = i + 1
                            if len(result) > 10:
                                logger.info(result)
                                nerfreal.put_msg_txt(result)
                                result = ""
                    result = result + msg[lastpos:]
            except json.JSONDecodeError:
                logger.error(f"Failed to parse JSON: {decoded_line}")
    end = time.perf_counter()
    logger.info(f"llm Time to last chunk: {end-start}s")
    if result:  # send any remaining text
        nerfreal.put_msg_txt(result)
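The punctuation-flush loop inside `llm_response` can be factored into a standalone helper, which makes the segmentation behavior easy to test in isolation. This is a sketch that mirrors the inline logic above (the function name `flush_segments` and the `min_len` parameter are illustrative, not part of the original code):

```python
# Sketch: streaming text is split at sentence punctuation so each complete
# clause can be pushed to TTS as soon as it arrives.
PUNCT = ",.!;:，。！？：；"  # ASCII plus full-width Chinese punctuation

def flush_segments(buffer: str, chunk: str, min_len: int = 10):
    """Append `chunk` to `buffer`; emit a clause each time punctuation is
    seen and the accumulated text exceeds `min_len` characters.
    Returns (segments_to_speak, remaining_buffer)."""
    segments = []
    lastpos = 0
    for i, ch in enumerate(chunk):
        if ch in PUNCT:
            buffer += chunk[lastpos:i + 1]
            lastpos = i + 1
            if len(buffer) > min_len:
                segments.append(buffer)
                buffer = ""
    buffer += chunk[lastpos:]  # keep any trailing partial clause
    return segments, buffer
```

Keeping short clauses buffered (the `min_len` check) avoids sending the TTS engine fragments that are too short to synthesize naturally.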
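On setting the persona prompt: the code above inlines the persona into the completion string, but Ollama also exposes a `/api/chat` endpoint that takes an explicit `system` message, which keeps the persona separate from user turns. A minimal sketch of building such a request (the persona text and the helper name `build_chat_request` are illustrative assumptions, not from the original article):

```python
# Sketch, assuming Ollama's /api/chat endpoint with a "system" role message.
def build_chat_request(persona: str, user_message: str) -> dict:
    """Build the JSON body for POST http://localhost:11434/api/chat."""
    return {
        "model": "qwen3:4b",  # assumed to match the locally pulled model
        "messages": [
            {"role": "system", "content": persona},   # the persona prompt
            {"role": "user", "content": user_message},
        ],
        "stream": True,
        "options": {"temperature": 0.7, "top_p": 0.9},
    }

request_data = build_chat_request(
    "你是一个乐于助人的助手。请用中文回答用户的问题。",
    "你好，请介绍一下你自己。",
)
```

With `/api/chat`, each streamed JSON line carries the text under `chunk["message"]["content"]` rather than `chunk["response"]`, so the streaming loop would need that one change.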