Speech Segmentation in Python: Endpoint Detection

Published: 2022/10/23 18:40:16

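1. Framing the speech signal

A speech signal is a time series that is random over the long term and stationary over the short term. Long-term randomness means the signal evolves over time as a random process. Short-term stationarity means its characteristics stay essentially unchanged over a short interval: the muscles used in speaking have inertia, so they cannot move from one state to another instantaneously. Speech is usually relatively stationary over 10-30 ms, so the first step in almost any speech processing pipeline is to split the signal into frames, with a frame length typically between 10 and 30 ms.

Framing is usually done with a sliding window, which can be a rectangular window, a Hamming window, and so on. The window length determines how much of the original signal each frame contains. When the hop (the distance the window slides at each step) equals the window length, the frames do not overlap; when the hop is smaller than the window length, adjacent frames overlap. This post uses a rectangular window for framing: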

Rectangular window:

$$h(n) = \begin{cases} 1, & 0 \le n \le N - 1 \\ 0, & \text{otherwise} \end{cases}$$
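To make the framing step concrete, here is a minimal sketch of sliding a rectangular window over a signal. The helper name `frame_signal`, the toy signal, and the 8 kHz sampling rate are assumptions chosen for illustration; they are not taken from the original post.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice x into frames; a rectangular window keeps the samples unweighted."""
    x = np.asarray(x, dtype=float)
    n_frames = (len(x) - frame_len) // hop + 1
    # Row i covers samples hop*i ... hop*i + frame_len - 1
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]

# Assumed sampling rate: a 30 ms frame with a 10 ms hop
sample_rate = 8000
frame_len = sample_rate * 30 // 1000   # samples per frame
hop = sample_rate * 10 // 1000         # samples per hop

# Toy signal: hop < frame_len, so adjacent frames overlap
toy = np.arange(10)
frames = frame_signal(toy, frame_len=4, hop=2)
print(frames)
```

With a hop of 2 and a frame length of 4, each frame shares its last two samples with the start of the next one, which is the overlapping case described above.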

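2. Endpoint detection methods

Endpoint detection means locating where speech starts and where it ends. Speech and non-speech differ in their short-time characteristics, and that difference can be used to find the start and end points effectively. This post describes a method that combines short-time energy with the short-time zero-crossing rate.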

2.1 Short-time energy

The short-time average energy of frame n is defined as:

$$E_n = \sum_{m = n - N + 1}^{n} \left[ x(m)\, w(n - m) \right]^2$$

Frames that contain speech have a higher short-time average energy than frames that do not.
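As a rough sketch of this definition with a rectangular window (w = 1 inside the frame), the energy of each frame is the sum of its squared samples. The function name `short_time_energy` and the toy frame values are made up for illustration. Note that the implementation later in this post sums absolute amplitudes instead of squares, which is a related but different measure.

```python
import numpy as np

def short_time_energy(frames):
    """Sum of squared samples per frame (rectangular window, w = 1)."""
    frames = np.asarray(frames, dtype=float)
    return np.sum(frames ** 2, axis=1)

# Toy frames: one near-silent, one loud (values invented for illustration)
frames = np.array([[0.01, -0.02, 0.01, 0.00],
                   [0.50, -0.60, 0.70, -0.40]])
energy = short_time_energy(frames)
print(energy)
```

The loud frame yields a much larger value than the quiet one, which is the property the detector relies on.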

2.2 Short-time zero-crossing rate

A zero crossing occurs when the signal passes through zero, that is, when two adjacent samples have opposite signs. The zero-crossing count is the number of such sign changes.

The short-time average zero-crossing rate of frame n is:

$$Z_n = \sum_{m = n - N + 1}^{n} \left| \operatorname{sgn}\left[ x(m) \right] - \operatorname{sgn}\left[ x(m - 1) \right] \right| w(n - m)$$

where

$$w(n) = \begin{cases} 1/(2N), & 0 \le n \le N - 1 \\ 0, & \text{otherwise} \end{cases}$$

3. Python implementation

import wave
import numpy as np
import matplotlib.pyplot as plt

def read(data_path):
    '''Read a speech signal from a WAV file (assumes 16-bit PCM samples).
    '''
    f = wave.open(data_path, 'rb')
    params = f.getparams()
    # channels, sample width in bytes, sampling rate, number of frames
    nchannels, sampwidth, framerate, nframes = params[:4]
    str_data = f.readframes(nframes)  # raw audio as bytes
    f.close()
    # np.fromstring is deprecated for binary data; np.frombuffer reads the bytes as 16-bit integers
    wavedata = np.frombuffer(str_data, dtype=np.short).astype(float)
    wavedata = wavedata / np.max(np.abs(wavedata))  # normalize amplitude to [-1, 1]
    return wavedata, nframes, framerate

def plot(data, time):
    plt.plot(time, data)
    plt.grid(True)
    plt.show()

def enframe(data, win, inc):
    '''Split a speech signal into frames.
    input:  data (1-D array): speech signal
            win (int or 1-D array): window length, or a window whose length is used
            inc (int): hop, the distance the window moves at each step
    output: f (2-D array): one frame per row, shape (nf, wlen)
    '''
    data = np.asarray(data)
    nx = len(data)  # length of the speech signal
    wlen = win if np.isscalar(win) else len(win)
    nf = (nx - wlen) // inc + 1  # number of window positions
    # Row i holds samples inc*i ... inc*i + wlen - 1
    idx = inc * np.arange(nf)[:, None] + np.arange(wlen)[None, :]
    return data[idx].astype(float)

def point_check(wavedata, win, inc):
    '''Detect the endpoints of speech in a signal.
    input:  wavedata (1-D array): raw speech signal
            win (int): window length in samples
            inc (int): hop in samples
    output: StartPoint (int): start endpoint, as a frame index
            EndPoint (int): end endpoint, as a frame index
            FrameTemp1 (2-D array): the framed signal
    '''
    # 1. Short-time zero-crossing count
    FrameTemp1 = enframe(wavedata[0:-1], win, inc)
    FrameTemp2 = enframe(wavedata[1:], win, inc)
    signs = (FrameTemp1 * FrameTemp2) < 0  # adjacent samples with opposite signs cross zero
    diffs = (np.abs(FrameTemp1 - FrameTemp2) - 0.01) > 0  # ignore jumps of 0.01 or less
    zcr = list((signs & diffs).sum(axis=1))

    # 2. Short-time energy (computed here as the sum of absolute amplitudes per frame)
    amp = list((abs(enframe(wavedata, win, inc))).sum(axis=1))

    # Set thresholds
    ZcrLow = max([round(np.mean(zcr) * 0.1), 3])  # low zero-crossing threshold
    ZcrHigh = max([round(max(zcr) * 0.1), 5])  # high zero-crossing threshold
    AmpLow = min([min(amp) * 10, np.mean(amp) * 0.2, max(amp) * 0.1])  # low energy threshold
    AmpHigh = max([min(amp) * 10, np.mean(amp) * 0.2, max(amp) * 0.1])  # high energy threshold

    # Endpoint detection
    MaxSilence = 8  # longest allowed gap inside speech, in frames
    MinAudio = 16  # shortest allowed speech segment, in frames
    Status = 0  # 0: silence, 1: transition, 2: speech, 3: finished
    HoldTime = 0  # speech duration so far
    SilenceTime = 0  # length of the current gap
    print('Starting endpoint detection')
    StartPoint = 0
    for n in range(len(zcr)):
        if Status == 0 or Status == 1:
            if amp[n] > AmpHigh or zcr[n] > ZcrHigh:
                StartPoint = n - HoldTime
                Status = 2
                HoldTime = HoldTime + 1
                SilenceTime = 0
            elif amp[n] > AmpLow or zcr[n] > ZcrLow:
                Status = 1
                HoldTime = HoldTime + 1
            else:
                Status = 0
                HoldTime = 0
        elif Status == 2:
            if amp[n] > AmpLow or zcr[n] > ZcrLow:
                HoldTime = HoldTime + 1
            else:
                SilenceTime = SilenceTime + 1
                if SilenceTime < MaxSilence:
                    HoldTime = HoldTime + 1
                elif (HoldTime - SilenceTime) < MinAudio:
                    # Too short to be speech: treat it as noise and start over
                    Status = 0
                    HoldTime = 0
                    SilenceTime = 0
                else:
                    Status = 3
        elif Status == 3:
            break
        if Status == 3:
            break
    HoldTime = HoldTime - SilenceTime
    EndPoint = StartPoint + HoldTime
    return StartPoint, EndPoint, FrameTemp1

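if __name__ == '__main__':
    data_path = 'audio_data.wav'
    win = 240  # window length in samples
    inc = 80  # hop in samples
    wavedata, nframes, framerate = read(data_path)
    time_list = np.array(range(0, nframes)) * (1.0 / framerate)
    plot(wavedata, time_list)
    StartPoint, EndPoint, FrameTemp = point_check(wavedata, win, inc)
    # check_signal is not defined anywhere in this post, so the call is left commented out
    # checkdata, Framecheck = check_signal(StartPoint, EndPoint, FrameTemp, win, inc)
    print(StartPoint, EndPoint)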
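
The endpoints computed by point_check are frame indices rather than sample positions. As a sketch that is not part of the original post, they could be mapped back to sample indices and seconds as shown below. The helper `frames_to_samples` is hypothetical, and it treats `end_frame` as the last frame that contains speech, which is an assumption about how EndPoint should be read.

```python
def frames_to_samples(start_frame, end_frame, win, inc, framerate):
    """Map frame indices to a sample range and to times in seconds.

    Frame i covers samples inc*i ... inc*i + win - 1, so the range ends
    at the last sample of end_frame (returned as an exclusive index).
    """
    start_sample = start_frame * inc
    end_sample = end_frame * inc + win
    return start_sample, end_sample, start_sample / framerate, end_sample / framerate

# Example values only: frames 10 to 50 at an assumed 8 kHz sampling rate
s, e, t0, t1 = frames_to_samples(10, 50, win=240, inc=80, framerate=8000)
print(s, e, t0, t1)
```

The resulting sample range can be used to slice the normalized waveform, for example `wavedata[s:e]`, to keep only the detected speech.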
————————————————
Copyright notice: This is an original article by CSDN blogger "weixin_39710106", licensed under CC 4.0 BY-SA. Reposts must include a link to the original source and this notice.
Original link: https://blog.csdn.net/weixin_39710106/article/details/111444972

